
Using Cooperative Learning To Teach Chemistry: A Meta-analytic Review

Abdi-Rizak M. Warfa*

Department of Natural Sciences, Metropolitan State University, St. Paul, Minnesota 55106, United States

ABSTRACT: A meta-analysis of recent quantitative studies that examine the effects of cooperative learning (CL) on achievement outcomes in chemistry is presented. Findings from 25 chemical education studies involving 3985 participants (Ntreatment = 1845; Ncontrol = 2140) and published since 2001 show a positive association between chemistry achievement and CL use (mean effect size = 0.68, Z = 5.04, p < 0.0001). Practically, an effect size of 0.68 suggests that the median student in a CL group would perform at the 75th percentile when compared to a student in a traditional group performing at the 50th percentile. However, heterogeneity analysis indicated significant variability among the estimated effect sizes (Q = 248.51, df = 24, p < 0.0001), indicating that other factors moderate the effects of CL. Analysis of class size, geographical location, and grade level as possible moderators suggested that geographical location accounts for 27% of the estimated 40% total heterogeneity (τ2 = 0.40, se = 0.13) and had a significant influence (p = 0.003) on effect size estimates. For example, when only US-based studies are considered, the effect size decreases to 0.38, still an improvement of about 15 percentile points but nevertheless much smaller than the omnibus effect size. The findings of these analyses as well as the study's implications are discussed.

KEYWORDS: Chemical Education Research, Collaborative/Cooperative Learning, General Public, First-Year Undergraduate/General, Second-Year Undergraduate

FEATURE: Chemical Education Research

Since the publication of Craig Bowen's1 meta-analysis on the effects of cooperative learning (CL) on chemistry achievement and attitudes more than 15 years ago, there has been widespread adoption of various CL strategies by chemistry faculty and teachers.2−8 Consider, for instance, the data shown in Figure 1, which displays the number of publications citing cooperative learning in this Journal since 1985 in five-year increments. Over the three decades shown in Figure 1, the number of studies invoking CL increased substantially, from 25 in the first 5 years to approximately 200 in the last 5 years. This increase is driven, in part, by decades of pedagogical research demonstrating that socially mediated forms of learning result in better outcomes in student achievement, attitudes, and persistence in chemistry.9

Figure 1. Publication trends in studies citing cooperative learning in the Journal of Chemical Education since 1985. The results were obtained using the keyword "cooperative learning" (blue) or "cooperative learning and chemistry" (red) (retrieved in June 2015).

Given the wide use of CL strategies across the board, several researchers have reviewed its global effect across all educational settings10−12 or, more specifically, in undergraduate science, technology, engineering, and mathematics (STEM) courses.13 Within the chemical sciences, only the Bowen article1 meta-analyzed empirical studies specific to high school and college chemistry. However, the Bowen article predates the period in which CL use in chemistry increased. Thus, considering the noted increase in CL use by chemistry faculty and teachers, a discipline-based re-examination of the recent literature has the potential to establish a more accurate estimate of the effect magnitude and of what empirical studies demonstrate about CL use in chemistry.

This paper therefore provides an up-to-date synthesis of recent studies that reported CL use in chemistry. Such a synthesis is warranted for several reasons. First, combining research studies in a meta-analytic review provides a transparent, objective, and replicable summary of research findings that could reveal information to help drive curricular and instructional changes in chemistry education; such information might not be apparent in the individual primary studies. Second, the integration of disparate research findings in a review article provides a larger picture of how interactive and socially mediated learning pedagogies may assist students in improving chemistry achievement. Third, while the Bowen article reviewed quantitative studies on cooperative learning in chemistry, the present work analyzes more articles, overcomes certain methodological shortcomings of the Bowen article (e.g., the lack of aggregation of dependent samples from the same study and less complete database searches), and provides a more in-depth analysis and delineation of the results based on moderator effects (e.g., class size and geographical location). The addition of moderator effects goes beyond analyzing the effects of CL on student achievement alone and looks at what other factors may influence CL use. The analysis, then, focused on two questions: (i) What are the effects of CL on student achievement in chemistry in comparison to traditional instruction? (ii) How do the effects of CL vary when used at different educational levels (high school versus college), in classrooms of different sizes (small vs medium vs large), and in different geographical locations?


DEFINING META-ANALYSIS AND COOPERATIVE LEARNING

Before examining the literature on the effectiveness of CL on chemistry achievement, it is important to establish common definitions of both CL and meta-analysis. At its core, cooperative learning entails, as conceptualized by Johnson et al.,14 structured small-group activities with five essential components: positive interdependence, face-to-face promotive interactions, individual accountability, interpersonal and small-group skills, and group processing. These systematic structural properties distinguish CL from other forms of active learning pedagogy. Consistent with this conceptualization of CL, the working definition used in this study requires that students in the analyzed studies worked in small groups on structured learning tasks that adhere to the essential components described above, regardless of the specific CL method used.14 For readers not familiar with CL and its varied methods, further details of its core elements and their underpinning theoretical frameworks can be found elsewhere.15,16

The mechanism by which we synthesized findings from studies reporting the effects of CL on chemistry achievement is meta-analysis.17 Broadly speaking, meta-analysis is a systematic review of existing literature through statistical procedures that aggregate and contrast findings from several related studies. As is done in this study, it can be used to aggregate estimates of the relationship between variables, expressing the relevant outcomes on a common scale. For example, in this study, the standardized mean difference between treatment and control groups is expressed in terms of "effect size." This common scale makes possible the quantitative comparison of related studies using the information provided in each study.1,17 Further details of how we computed this common "effect size" metric are provided in the following sections.



METHODS

The focus of this study was to review quantitative studies in chemistry education that contrast CL use with traditional instruction. A systematic review of this literature involved two major steps before data analysis ensued: (i) establishing inclusion criteria and (ii) selecting studies. This process unfolded as follows.

Inclusion Criteria

Following the recommendations of Lipsey and Wilson,17 the following criteria were established as the basis for including studies in the meta-analysis:
(i) Study participants are identified as high school or college students.
(ii) Cooperative learning was used in a naturalistic setting (a regularly scheduled chemistry class, lab, or recitation session); this limits inclusion to studies conducted in face-to-face settings.
(iii) The study used an experimental or quasi-experimental research design to contrast a cooperative learning treatment with a control group.
(iv) The reported outcome is a measure of chemistry achievement.
(v) The study provides sufficient statistical information to enable data analysis.
(vi) The study was conducted between 2001 and 2015 and reported in English.


Study Selection



To select studies for analysis, four structured steps were followed based on the Meta-Analysis Reporting Standards (MARS) guidelines:18 study identification, screening, eligibility, and inclusion. The identification step focused initially on articles published in practitioner journals. As noted by Towns and Kraft,9 chemistry education research is predominantly published in a small set of chemistry and science education journals, including the Journal of Chemical Education, Chemistry Education Research and Practice, Journal of Research in Science Teaching, Science Education, International Journal of Science Education, Journal of College Science Teaching, and the Journal of Science Education and Technology. Each of these journals was searched using a combination of the keywords "cooperative learning" and "chemistry." To augment the searches in practitioner journals, the search engines ERIC, Google, and Google Scholar, as well as the abstract services Dissertation Abstracts International and ProQuest, were searched using the same keyword terms. Although the search terms returned a large number of hits in each search (for example, over 800 hits in JCE in the selected date range), systematic screening revealed that most articles did not satisfy criterion iii (the study must contrast a CL treatment with a control group). Modifying the search terms to include "control group" or "treatment group" narrowed the results to a smaller number of relevant articles (e.g., 170 hits in the case of JCE). A random sample of the articles that did not include the search terms "control" or "treatment" group was analyzed to validate the assumption that such articles did not meet criterion iii. Each of the selected articles was then examined for eligibility and inclusion.



This exhaustive search identified a total of 25 articles for analysis. While this represents 10 more articles than reported in the Bowen article, it is still a small sample. Two reasons account for the small sample size. First, only a few quantitative studies meet the stringent criteria required for meta-analytic reviews; for example, few studies use appropriate control and treatment samples, rendering comparison with other studies unattainable. Second, the narrower focus on cooperative learning as the unit of analysis limits the number of studies that could be included. Consequently, a smaller number of studies was analyzed in this research.

Table 1. Equations for Cohen's d and Hedges' g for Computing Effect Sizes

Cohen's d:
    d = (X̄1 − X̄2) / SDpooled
where X̄1 and X̄2 are the sample mean scores of the treatment (CL) and control (traditional instruction) groups and SDpooled is the pooled standard deviation of both group means.

Hedges' g:
    g = J × d
where d is Cohen's d, J = 1 − 3/(4 × df − 1), and df = degrees of freedom (group size minus one).
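For illustration only, the Table 1 relationships can be written as a short R helper. This is a minimal sketch with made-up group summaries (not values from any analyzed study); the degrees of freedom are taken as n1 + n2 − 2, the conventional reading of the Hedges correction.

```r
# Sketch of the Table 1 equations; the input numbers below are hypothetical.
hedges_g <- function(m1, sd1, n1, m2, sd2, n2) {
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  d  <- (m1 - m2) / sd_pooled    # Cohen's d
  df <- n1 + n2 - 2              # degrees of freedom (assumed reading of Table 1)
  J  <- 1 - 3 / (4 * df - 1)     # small-sample correction factor
  J * d                          # Hedges' g
}
hedges_g(m1 = 78, sd1 = 12, n1 = 30, m2 = 70, sd2 = 14, n2 = 32)  # ~0.60
```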

Extracting Effect Sizes

An effect size reflects the magnitude of the difference between two variables and therefore is a useful standard metric for making quantitative comparisons.1 Computing effect-size values requires mean scores, standard deviations, and sample sizes or, alternatively, other statistical information such as t-test and F-test results.1 This information was therefore extracted from each of the analyzed studies. Per criterion v of the inclusion guidelines, articles that did not provide sufficient data were excluded from the analysis at the study selection step.
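Where a primary study reported only a t- or F-test for two independent groups, the standardized mean difference can be recovered with standard conversions of the kind described by Lipsey and Wilson;17 the sketch below uses hypothetical values.

```r
# Recovering Cohen's d from reported test statistics (two independent groups).
d_from_t <- function(tval, n1, n2) tval * sqrt(1 / n1 + 1 / n2)
d_from_F <- function(Fval, n1, n2) sqrt(Fval) * sqrt(1 / n1 + 1 / n2)  # one-df (two-group) F only;
                                                                       # the sign must be set from the group means
d_from_t(tval = 2.45, n1 = 40, n2 = 38)  # ~0.55
d_from_F(Fval = 6.0,  n1 = 40, n2 = 38)  # ~0.55
```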


Meta-analysis Statistical Tests


As described elsewhere, meta-analysis relies on statistical methods to combine and draw conclusions from the results of independent studies that test the same hypothesis. The first step in this endeavor requires converting the different outcomes reported in the primary studies to a common metric. As described in the Data Analysis section, this is achieved by computing "effect sizes" across the analyzed studies. This common metric makes it possible to compare and contrast the 25 studies included in this review. It is also important to make sure that the effects are independent of one another. As described in the accompanying Supporting Information, various statistical procedures can be used to resolve statistical dependencies, such as aggregating multiple outcomes from the same study into a single effect size (see the Data Analysis section for further description). Lastly, the variability of the effect sizes needs to be investigated with a heterogeneity analysis. The heterogeneity test examines the inconsistency of effect sizes across the analyzed studies and might shed light on what other sources account for the observed variability in the computed effect sizes. Thus, the three main statistical procedures presented in this paper are (i) methods for computing effect sizes, (ii) methods for resolving statistical dependencies, and (iii) heterogeneity analysis. A fourth set of statistical analyses provides diagnostics for examining the threat of publication bias to the research findings. Further descriptions of these tests are provided in the Data Analysis section and the accompanying Supporting Information.



DATA ANALYSIS

Computing Effect Sizes (g) across Studies

To integrate findings across the reviewed studies, Hedges' weighted standardized mean difference (Hedges' g) on achievement outcomes served as the metric of choice for expressing effect sizes.19 Hedges' g weights the treatment and control group standard deviations by their sample sizes (n) and removes the small positive sample bias that has been shown to affect calculations of Cohen's d, the metric most commonly used to express standardized mean differences.19,20 Table 1 shows the equations needed to compute Cohen's d and Hedges' g and the relationship between the two. Because there are likely true unmeasured differences in methods, numbers of participants, and sample characteristics across the analyzed studies, we used a random-effects (RE)21 model to calculate the mean effect size. This is justified because different studies involve different numbers of participants under different settings, which complicates simple averaging of the effect sizes. All calculations were done using Microsoft Excel and the R software for statistical computing.22
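The paper reports that the calculations were carried out in Microsoft Excel and R but does not name specific R packages; the sketch below therefore assumes the metafor package and hypothetical study summaries to illustrate the escalc/rma random-effects step.

```r
# Minimal random-effects pooling sketch (assumes metafor; data are hypothetical).
library(metafor)

dat <- data.frame(
  study = c("Study 1", "Study 2", "Study 3"),
  m1i = c(75, 68, 81), sd1i = c(10, 12, 9),  n1i = c(45, 60, 30),   # CL (treatment) groups
  m2i = c(70, 66, 72), sd2i = c(11, 13, 10), n2i = c(50, 55, 28)    # traditional (control) groups
)

dat <- escalc(measure = "SMD", m1i = m1i, sd1i = sd1i, n1i = n1i,
              m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat)  # yi = Hedges' g, vi = its variance
res <- rma(yi, vi, data = dat, method = "REML")               # random-effects model
summary(res)   # pooled g, SE, confidence interval, tau^2, and the Q test of heterogeneity
forest(res)    # forest plot of the kind shown in Figure 2
```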

Resolving Statistical Dependencies

When a study reported multiple outcomes for the same participants (e.g., Test 1, Test 2, etc.), an aggregate effect size was calculated as described in the Supporting Information (see S1, Materials and Methods). Briefly, a separate effect size was computed for each within-study outcome, and these were subsequently aggregated into a single summary g-value if the outcomes were determined to be equivalent.23,24 Determination of statistical dependency hinged on whether different subsamples were used or multiple outcomes were reported for the same participants.17 For example, Chase et al.25 reported achievement outcomes for a general college chemistry population (n = 241) and an organic chemistry sample (n = 182) in their study. Because these subsample populations were regarded as statistically independent, their computed g-values were retained as separate and contribute independently to the mean g-value. Table S1 (Supporting Information) shows the 67 effect sizes extracted from the analyzed studies25−49 and the 25 resultant effect sizes after resolving statistical dependencies. This procedure avoided potentially inflating the summary effect size while also not combining statistically independent samples.
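For a concrete picture of the aggregation step, one common approach (consistent with the composite-effect formulas in the sources cited above, refs 23 and 24) combines correlated outcomes using an assumed within-study correlation r between them; r is rarely reported, so the value below is an assumption, and the g and v inputs are hypothetical.

```r
# Combine m dependent outcomes from the same participants into one effect size.
aggregate_dependent <- function(g, v, r = 0.5) {
  m <- length(g)
  cov_sum <- 0
  for (j in seq_len(m)) for (k in seq_len(m)) {
    if (j != k) cov_sum <- cov_sum + r * sqrt(v[j] * v[k])   # covariance terms between outcomes
  }
  list(g = mean(g), v = (sum(v) + cov_sum) / m^2)            # composite mean and its variance
}
aggregate_dependent(g = c(0.55, 0.70), v = c(0.04, 0.05), r = 0.5)  # e.g., Test 1 and Test 2
```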

Heterogeneity Analysis

To detect variability in the effect size estimates, we used Q-statistical analysis.50 Specifically, two forms of the Q-statistic, QE and QM, are reported. QE measures the amount of heterogeneity in a set of effect sizes. A statistically significant QE suggests significant variation due to potential moderators (e.g., class size) that can be further examined. A nonsignificant QE means the null hypothesis of homogeneity cannot be rejected,50 which makes further grouping of the data unnecessary since that outcome would suggest all effect sizes are estimating the same population mean. QM, on the other hand, describes the amount of heterogeneity that can be explained by moderators. If QM is significant, the result would suggest the analyzed moderator has a significant effect on the measured outcome. If there are no moderators, QM = 0 and QE simply becomes Q. Importantly, when the number of analyzed studies is small (k < 10), the Q-statistic is not considered sensitive enough to detect heterogeneity, and therefore its interpretation warrants caution under those circumstances.
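As a sketch of how these quantities arise from a set of effect sizes and variances, the block below computes Q, τ2 (DerSimonian−Laird estimator; the paper does not state which estimator was used), and I2 for hypothetical inputs.

```r
# Heterogeneity statistics for k effect sizes g with sampling variances v.
heterogeneity <- function(g, v) {
  w    <- 1 / v                          # fixed-effect (inverse-variance) weights
  gbar <- sum(w * g) / sum(w)            # weighted mean effect size
  Q    <- sum(w * (g - gbar)^2)          # Cochran's Q
  df   <- length(g) - 1
  C    <- sum(w) - sum(w^2) / sum(w)
  tau2 <- max(0, (Q - df) / C)           # DerSimonian-Laird between-study variance
  I2   <- max(0, (Q - df) / Q) * 100     # % of variability beyond sampling error
  c(Q = Q, df = df, tau2 = tau2, I2 = I2)
}
heterogeneity(g = c(0.2, 0.9, 1.4, 0.4), v = c(0.05, 0.04, 0.06, 0.03))
```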



Table 2. Main Effects of CL on Chemistry Achievement by Class Type and Geographical Location^a

Parameter                  Q-Values (df)    Mean Effect Size, g    SE       95% CI Lower Limit    95% CI Upper Limit
Class^b
  All                      248.51^c (24)    0.68^c                 0.136     0.419                 0.951
  Small (≤50, k = 15)      109.06^c (14)    0.89^c                 0.185     0.524                 1.249
  Medium (51−100, k = 5)   18.25^d (4)      0.18                   0.167    −0.153                 0.505
  Large (>100, k = 5)      68.12 (4)        0.64^e                 0.284     0.084                 1.197
Geographic Location
  US (k = 14)              120.31^c (13)    0.38^d                 0.141     0.099                 0.652
  Non-US (k = 11)          69.24^c (8)      1.10^c                 0.205     0.703                 1.505
  Middle East (k = 9)      24.859^d (8)     1.35^c                 0.159     1.040                 1.662

^a The trim-and-fill method52 was used to analyze these data. ^b k = number of studies. ^c p < 0.001. ^d p < 0.01. ^e p < 0.05.
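The confidence limits in Table 2 follow from g ± z0.975 × SE; with the rounded values for the "All" row this lands near, though not exactly at, the tabulated 0.419−0.951, consistent with the table having been computed from unrounded estimates (an inference, not a statement from the paper).

```r
g <- 0.68; se <- 0.136
g + c(-1, 1) * qnorm(0.975) * se  # ~[0.41, 0.95]; Table 2 reports 0.419-0.951 (unrounded inputs)
g / se                            # ~5.0, consistent with the Z = 5.04 reported in the abstract
```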

Figure 2. Forest plot displaying weighted effects and heterogeneity estimates of the meta-analyzed studies. In this plot, each effect size (square points) along with its confidence interval is displayed visually in the center for all the studies. The size of the square points reflects the precision of the effect size estimates (larger studies have larger points). The diamond at the bottom represents the combined summary effect. aTwo different g-values were obtained in this study as its subsample populations were determined to be statistically independent.

Publication Bias

One fundamental threat to meta-analytic reviews arises from publication bias: the idea that studies with negative or nonsignificant effects are unlikely to be published51 and therefore bias the meta-analytic findings. Diagnostic methods for detecting publication bias, albeit circumstantially, include analysis of funnel plot symmetry (a funnel plot is a simple scatter plot of observed outcomes vs the standard errors of the individual studies) and calculation of the fail-safe N, the number of missing studies that would have to be published to alter the summary g-value.51 These diagnostics do not assign causal mechanisms to suspected biases; they provide only information about relationships between a study's outcome and, for example, its standard error. One way to control for publication bias, as was done in this study, is to use Duval and Tweedie's nonparametric trim-and-fill method when computing the effect size.52
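A hedged sketch of these diagnostics, again assuming the metafor package (the paper does not name its R packages) and illustrative effect sizes rather than the actual 24-study data set:

```r
library(metafor)
dat <- data.frame(yi = c(0.2, 0.5, 0.9, 1.3, 0.4, 0.7),        # hypothetical Hedges' g values
                  vi = c(0.05, 0.04, 0.06, 0.08, 0.03, 0.05))  # and their sampling variances
res <- rma(yi, vi, data = dat, method = "REML")

funnel(res)     # funnel plot of observed outcomes vs standard errors (cf. Figure 3, left)
regtest(res)    # Egger-type regression test for funnel plot asymmetry
ranktest(res)   # rank correlation (Kendall's tau) test for asymmetry
trimfill(res)   # Duval and Tweedie trim-and-fill adjusted estimate
fsn(yi, vi, data = dat, type = "Rosenberg")            # fail-safe N, Rosenberg approach
fsn(yi, vi, data = dat, type = "Orwin", target = 0.1)  # Orwin approach; target = assumed trivial effect
```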



RESULTS

Main Effects of CL on Chemistry Achievement

Table 2 summarizes the main effects of CL on student achievement in chemistry as reported in the literature since 2001. The reviewed research involved a total of 3985 high school and college student participants. Of the 25 studies initially included in the analysis after resolving statistical dependencies (see Table S1), one study45 had a g-value that was markedly larger than the rest of the data (g+ ≈ 3) and was subsequently excluded from further analysis as an outlier on the basis of sensitivity analysis.53 Its removal, while slightly lowering the summary effect size, did not appreciably affect the computed g-value or its direction. Thus, all the results reported hereafter are based on the 24 studies depicted in Figure 2, which shows their weighted effect sizes and heterogeneity estimates. As can be seen in Table 2, the mean weighted effect size (g) on achievement in the analyzed studies was positive and significant (g+ = 0.68, SE = 0.136, p < 0.001). Practically, this suggests that the median student in a CL group would perform 25 percentile points higher than the median student in a traditional group performing at the 50th percentile. Therefore, on average, students in the CL condition significantly outperformed their counterparts in traditional classes. However, on the basis of the degree of variability in the data (Q-statistic), there is significant heterogeneity in the findings that warrants further examination.
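The percentile interpretation follows directly from the standard normal cumulative distribution function:

```r
pnorm(0.68)  # ~0.75: the median CL student falls at about the 75th percentile of the control distribution
pnorm(0.38)  # ~0.65: the roughly 15-percentile-point gain for the US-only estimate (Table 2)
```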


Figure 3. Funnel plot (left) indicating some asymmetry and quantile−quantile (Q−Q) normal plot (right) of sample quantiles (y-axis) versus theoretical quantiles (x-axis). In panel A (left), the sides of the nonshaded triangle are determined by 95% confidence intervals centered about the average Hedges' g-value. In panel B (right), the data points (closed circles) represent the analyzed studies, which, if the data followed a normal distribution, would lie on a straight line with an intercept equal to the mean Hedges' g-value and a slope equal to the standard deviation. The dashed lines represent a pseudoconfidence envelope based on quantiles of a simulated set of pseudoresiduals. The curvature of the points in the plot suggests a non-normal distribution of the analyzed data.

Analyzing Variances in g-Values

As noted above, although student performance was higher under CL conditions, heterogeneity analysis based on the Q-statistic indicated variability among the observed effect sizes greater than what sampling error or chance would suggest (Q = 248.51, df = 24, p < 0.0001, Table 2). That is, while the mean g-values and confidence intervals across all studies show benefits of CL over traditional instruction, there are other modifying factors that alter its effect. Kyndt et al.10 previously examined four variables as possible moderators of CL effects: (i) use of different CL approaches (e.g., jigsaw, STAD, constructive controversy, etc.); (ii) age; (iii) grade level; and (iv) culture. The authors did not find evidence of variations in effect sizes due to the method of CL used but found age, grade level, and culture (Western vs non-Western) to be significant moderators of CL effects. Building on these findings,10 we examined the moderating effects of grade level, geographic location, and class size on chemistry achievement, as described below.

Moderator Effects

For purposes of geographical location, 14 studies were coded as US-based, while 11 were considered non-US-based (Table 2). Nine (k = 9) of the non-US-based studies came from the Middle East, while the remaining two were from Africa and Europe, respectively. Our analysis suggests that geographical location significantly influences the average effectiveness of CL use (QM = 8.97, df = 1, p = 0.003) and is significantly associated with variations in effect sizes (QE = 225.94, df = 23, p = 0.00). The mean effect size in non-US-based studies was almost three times that of US-based studies (g of 1.10 vs 0.38, Table 2). The higher mean effect size in non-US-based studies is mainly accounted for by studies out of Turkey (k = 8), which tended to report higher effect sizes than any other location (g+ of 1.35 vs g of 0.11−0.38 for all other locations). Taken together, the evidence suggests that the effect of CL on chemistry achievement varies by geographical location.

Comparison of effect sizes by grade level indicated that studies involving college students had a lower effect size (g+ = 0.49, k = 16) than those involving high school students (g+ = 1.03, k = 9). However, these data are most likely skewed, as six of the nine studies reporting high school data were Turkish studies, which, as noted above, tended to report higher effect sizes across the board. More importantly, the overall analysis of grade level as a moderator suggests that the use of CL with high school or college students does not actually influence its average effectiveness (QM = 3.90, df = 1, p > 0.05). However, it is likely that other moderators affect its use, as the test for heterogeneity for grade level was significant (QE = 238.98, df = 23, p < 0.001).

Similar to grade level, meta-analysis of the data by class size suggests that it does not actually influence the average effectiveness of CL (QM = 4.60, df = 2, p = 0.10). However, there was significant heterogeneity in the effect size estimates (QE = 195.43, df = 22, p = 0.00). Treatment classes were considered small if there were 50 students or fewer, medium if the number was between 51 and 100, and large if there were more than 100 student enrollees. Although there were higher effect sizes in small classes than in medium or large classes (g+ = 0.89, SE = 0.19, p < 0.001), these data are most likely skewed since most of the analyzed research was conducted in small class settings (k = 15). Only five studies each met the medium and large class descriptions (Table 2). Thus, the data involving class size should be interpreted with caution and considered only preliminary: the test of homogeneity (Q-statistic) is not sensitive enough to detect heterogeneity when the number of studies is less than ten.52 It is worth noting, however, that all effect sizes are positive and favor treatment over control under all class types.
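A sketch of the moderator (mixed-effects) analysis that produces QM and QE, again assuming metafor; the location coding below is illustrative, not the coded data set in Table S1.

```r
library(metafor)
dat <- data.frame(
  yi = c(0.2, 0.5, 0.9, 1.3, 0.4, 1.1),                       # hypothetical Hedges' g values
  vi = c(0.05, 0.04, 0.06, 0.08, 0.03, 0.05),                 # their sampling variances
  location = c("US", "US", "non-US", "non-US", "US", "non-US")
)
res_mod <- rma(yi, vi, mods = ~ location, data = dat, method = "REML")
res_mod  # output reports QM (omnibus test of the moderator) and QE (residual heterogeneity)
```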

Publication Bias

Visual inspection of the study's funnel plot (Figure 3, left) indicates some asymmetry in which a good number of smaller studies cluster to the right of the mean. Egger's regression test21 seems to bear this out by returning a significant p-value (z = 2.169, p = 0.030). The rank correlation test for funnel plot asymmetry21 yielded a nonsignificant p-value (Kendall's tau = 0.273, p > 0.05), although this could be due to the low power of this test21 and therefore does not contradict the Egger test. Taken together, the evidence suggests that smaller studies were more likely to report a higher effectiveness of CL than were the larger studies. The noted higher effect of smaller studies may be indicative of publication bias or of "small-study" bias that reflects sources of heterogeneity in the analyzed data. As can be seen in the quantile−quantile normal plot in Figure 3 (right), there is a non-normal distribution of effect estimates that could be the cause of the observed asymmetry in the funnel plot. Moreover, trim-and-fill analysis52 for adjusting publication bias indicates that the observed asymmetry has no effect on the estimates of the computed mean effect sizes: the random-effects estimate of the standardized mean difference under the trim-and-fill method remained at 0.68 (95% CI, 0.34−0.83). This finding is strengthened by fail-safe N calculations using the Rosenberg approach,21 which suggest that 856 studies with an effect size of 0 would have to be added to the analysis for the summary effect to become trivial. Similarly, the more conservative Orwin21 fail-safe N approach suggested that N = 23 studies would have to be published for the observed effect to become nonsignificant. Taken together, this body of evidence suggests that whatever publication bias may exist is not likely to alter the overall conclusion drawn from these studies and that the addition of more studies covering the same sampling domain would be unlikely to change the results.







CONCLUSION

Science is a cumulative enterprise in which findings from previous studies must be examined in light of new findings. One mechanism by which findings from the chemical education literature can be grounded is to carry out a systematic review of the literature that reports certain benefits of one pedagogical method over others. In the present study, 25 chemistry education studies that met stringent criteria for inclusion were meta-analyzed to examine the hypothesis that "cooperative learning improves student performance in chemistry classes". A previous meta-analysis in this literature1 predates the period in which chemistry faculty and teachers the world over embraced the use of cooperative learning as part of their instructional toolkit. This report found that CL increased student achievement outcomes by 0.68 standard deviations, suggesting a student in a CL group would perform 25 percentile points better than a student in a traditional group performing at the 50th percentile. This has direct consequences for course GPA and performance on standardized exams.13 In their meta-analytic review of the effects of active learning in STEM courses, Freeman et al.13 noted that an effect size of 0.47 results in an increase of about 6% in average student examination scores, roughly translating into 0.3 points in an average final grade on a 0−100 point grading scale. Our analysis concurs with this earlier finding, indicating that an average effect size of 0.68 would improve student performance by about 8.7%. A change of this magnitude in a final course grade would make the difference between a grade of C− and C, usually the cutoff point for passing chemistry courses under a letter-based grading system.

Findings in this meta-analysis indicate that geographical location had a significant association with effect size variations. Studies out of Turkey reported a much higher mean effect size than US-based studies. It is hard to speculate on the ways cultural influences in the Turkish studies ultimately affect CL effectiveness; it may be as simple as a greater readiness to use cooperative learning in Turkey because of how its educational reform efforts are promulgated, or that the traditional teaching methods against which CL is compared in Turkey are particularly hindering for student learning. Further studies will be needed to ascertain the causes of the higher gains in the Turkish studies relative to the others reported in this paper.

While the results reported in this paper indicate that smaller classes would benefit the most from incorporating CL environments, the effect size was significantly positive across all class types, small or large. Cooper54 has previously described the benefits and pitfalls of using cooperative learning in large-enrollment chemistry courses. The findings reported in this paper seem to empirically support the benefits of CL in large-enrollment chemistry courses: the extracted effect size among the five studies that reported using CL in large classes was 0.64. However, given the small number of studies involving large classes, further research is needed to ascertain this claim.

In summary, the meta-analytic results show that CL had a moderate and significant treatment effect on student achievement in chemistry. This effect was moderated by geographical location, although all analyzed moderators favored CL over traditional instruction. The summary effect of 0.68 is greater than the 0.37 effect size reported by Bowen.1 However, the positive association of CL use with chemistry achievement in both studies potentially indicates the robustness of the results over time.
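The 8.7% figure above appears to follow from scaling Freeman et al.'s ~6% gain per 0.47 SD linearly to 0.68 SD; the arithmetic below is a sketch of that reading, not a calculation reported in the paper.

```r
0.68 / 0.47 * 6  # ~8.7: expected percent improvement in average examination scores at g = 0.68
```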

IMPLICATIONS, FUTURE RESEARCH, AND STUDY LIMITATIONS

Implications


On the basis of the findings in this meta-analysis, the use of cooperative learning as an important pedagogical tool for teaching chemistry at all educational levels and settings is highly recommended. Broadly speaking, there was a positive association between CL use and chemistry achievement in the meta-analyzed studies. Given its overall effectiveness, what is needed is further research that delineates the specific aspects of CL that contribute to the observed outcomes and the variables that moderate its impact. Here, we examined the influences of geographical location, grade level, and class size on the average effectiveness of CL. However, beyond these specific variables, further research can uncover the specific aspects of CL activities that mediate their perceived impact. This can be achieved through qualitative analysis of students and teachers engaged in CL activities and rich descriptions of the nature of the discourse that students engage in during CL activities.55−57


Limitations


Meta-analytic reviews lend themselves to certain limitations. For example, heterogeneity of the included studies could be problematic: the analyzed studies often use different sample populations, treatment and control conditions, and so on. While this is an unavoidable outcome in synthesis research,21 certain guidelines were utilized in this study to safeguard against the heterogeneity threat. For one, all meta-analyzed studies needed to meet stringent preset criteria for admission. Also, moderators such as class size, geographical location, and educational level were used to analyze heterogeneity in the studies. While publication bias can be a serious threat to the overall finding, fail-safe N analysis and funnel plot diagnostics did not provide evidence that publication bias would alter the overall findings. However, these diagnostics are theoretical in nature and, as noted by Duval and Tweedie,52 might not match reality. One other limitation worth mentioning is the fact that this study singled out cooperative learning as the active learning pedagogy used in the analyzed articles. This sampling choice eliminated multiple other forms of active learning methods that can have similar effects on student learning in chemistry. Nevertheless, there exist multiple recent meta-analytic reviews that analyze the effects of active learning on student understanding in STEM courses and fill this void.13



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available on the ACS Publications website at DOI: 10.1021/acs.jchemed.5b00608. Further commentary on effect size calculations and publication bias including a table with effect sizes from all the analyzed studies (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We thank the authors who provided data from their papers upon request. The article was strengthened by the commentary and indispensable feedback of the anonymous reviewers, which was greatly appreciated.



REFERENCES

(1) Bowen, C. A quantitative literature review of cooperative learning effects on high school and college chemistry achievement. J. Chem. Educ. 2000, 77, 116−119. (2) Carpenter, S.; McMillan, T. Incorporation of a cooperative learning technique in organic chemistry. J. Chem. Educ. 2003, 80, 330−334. (3) Hinde, R. J.; Kovac, J. Student active learning methods in physical chemistry. J. Chem. Educ. 2001, 78, 93−97. (4) Mahalingam, M.; Schaefer, F.; Morlino, E. Promoting student learning through group problem solving in general chemistry recitations. J. Chem. Educ. 2008, 85, 1577−1581. (5) Cox, C. T., Jr. Incorporating more individual accountability in group activities in general chemistry. J. Coll. Sci. Teach. 2015, 44, 30−36. (6) Ingo, E.; Gabriele, L. A Jigsaw classroom illustrated by the teaching of atomic structure. Sci. Educ. Int. 2001, 12, 15−20.

(7) Balfakih, N. The effectiveness of Student-Team Achievement Division (STAD) for teaching high school chemistry in the United Arab Emirates. Int. J. Sci. Educ. 2003, 25, 605−624. (8) Pratt, S. Cooperative learning strategies. Sci. Teach. 2003, 70, 25. (9) Towns, M.; Kraft, A. Review and Synthesis of Research in Chemical Education from 2000−2010. Paper presented at the Second Committee Meeting on the Status, Contributions, and Future Directions of Discipline-Based Education Research, 2011, National Academies. http://sites.nationalacademies.org/cs/groups/dbassesite/documents/webpage/dbasse_072594.pdf (accessed November 2015). (10) Kyndt, E.; Raes, E.; Lismont, B.; Timmers, F.; Cascallar, R.; Dochy, F. A meta-analysis of the effects of face-to-face cooperative learning. Do recent studies falsify or verify earlier findings? Educ. Res. Rev. 2013, 10, 133−149. (11) Qin, Z.; Johnson, D. W.; Johnson, R. T. Cooperative versus competitive efforts and problem solving. Rev. Educ. Res. 1995, 65, 129−143. (12) Puzio, K.; Colby, G. T. Cooperative learning and literacy: A meta-analytic review. J. Res. Ed. Effect. 2013, 6, 339−360. (13) Freeman, S.; Eddy, S. L.; McDonough, M.; Smith, M. K.; Okoroafor, N.; Jordt, H.; Wenderoth, M. P. Active learning increases student performance in science, engineering, and mathematics. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 8410−8415. (14) Johnson, D. W.; Johnson, R. T.; Smith, K. Active Learning: Cooperation in the College Classroom; Interaction Book Company: Edina, MN, 1998. (15) Eilks, I.; Markic, S.; Baumer, M. Cooperative learning in higher level chemistry education. In Innovative Methods of Teaching and Learning Chemistry in Higher Education; Eilks, I., Bayers, B., Eds.; RSC Publishing: London, 2009; pp 103−122. (16) Millis, B. J. Cooperative Learning in Higher Education: Across the Disciplines, Across the Academy; eBook, Stylus: Sterling, VA, 2010. (17) Lipsey, M. W.; Wilson, D. B. Practical Meta-Analysis; Sage Publications: Thousand Oaks, CA, 2001. (18) Meta-Analysis Reporting Standards (MARS). APA Publications, 2008. http://www.apastyle.org/manual/related/JARS-MARS.pdf (accessed June 2015). (19) Hedges, L. V. An Unbiased Correction for Sampling Error in Validity Generalization Studies. J. Appl. Psych. 1989, 74, 469−477. (20) Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed.; Lawrence Erlbaum: Hillsdale, NJ, 1988. (21) Borenstein, M.; Hedges, L. V.; Higgins, J. P. T.; Rothstein, H. R. Introduction to Meta-Analysis; John Wiley: Chichester, UK, 2009; pp 61−85. (22) The R Project for Statistical Computing. http://www.R-project.org/ (accessed Nov 2015). (23) Borenstein, M. Effect sizes for continuous data. In The Handbook of Systematic Review and Meta-Analysis; Cooper, H., Hedges, L. V., Valentine, J. C., Eds.; Russell Sage Foundation: New York, 2009; pp 221−235. (24) Gleser, L. J.; Olkin, I. Stochastically dependent effect sizes. In The Handbook of Systematic Review and Meta-Analysis; Cooper, H., Hedges, L. V., Valentine, J. C., Eds.; Russell Sage Foundation: New York, 2009; pp 357−376. (25) Chase, A.; Pakhira, D.; Stains, M. Implementing Process-Oriented, Guided-Inquiry Learning for the First Time: Adaptations and Short-Term Impacts on Students' Attitude and Performance. J. Chem. Educ. 2013, 90, 405−416. (26) Adesoji, F. A.; Ibraheem, T. L. Effects of student teams-achievement division strategy and mathematics knowledge on learning outcomes in chemical kinetics. J. Int. Soc. Res. 2009, 2, 15−25. (27) Anderson, W. L.; Mitchell, S. M.; Osgood, M. P. Comparison of student performance in cooperative learning and traditional lecture-based biochemistry course. Biochem. Mol. Biol. Educ. 2005, 33, 387−393. (28) Barthlow, M. The Effectiveness of Process Oriented Guided Inquiry Learning To Reduce Alternate Conceptions in Secondary Chemistry. PhD Thesis, Liberty University, Lynchburg, VA, 2011.

(29) Bilgin, I.; Geban, O. The effect of cooperative learning approach based on conceptual change condition on students' understanding of chemical equilibrium concepts. J. Sci. Educ. Technol. 2006, 15, 31−46. (30) Cetin, P. S.; Kaya, E.; Geban, O. Facilitating conceptual change in gases concepts. J. Sci. Educ. Technol. 2009, 18, 130−137. (31) Demircioglu, G.; Ayas, A.; Demircioglu, H. Conceptual change achieved through a new teaching program on acids and bases. Chem. Educ. Res. Pract. 2005, 6, 36−51. (32) Doymus, K. Effects of a cooperative learning strategy on teaching and learning phases of matter and one-component phase diagrams. J. Chem. Educ. 2007, 84, 1857−1860. (33) Doymus, K. Teaching chemical bonding through jigsaw cooperative learning. Res. Sci. & Technol. Educ. 2008, 26, 47−57. (34) Goeden, T. J.; Kurtz, M. J.; Quitadamo, I. J.; Thomas, C. Community-based inquiry in allied health biochemistry promotes equity by improving critical thinking for women and showing promise for increasing content gains for ethnic minority students. J. Chem. Educ. 2015, 92, 788−796. (35) Hein, S. Positive impacts using POGIL in organic chemistry. J. Chem. Educ. 2012, 89, 860−864. (36) Hemraj-Benny, T.; Beckford, I. Cooperative and inquiry-based learning utilizing art-related topics: Teaching chemistry to community college nonscience majors. J. Chem. Educ. 2014, 91, 1618−1622. (37) Taştan Kırık, O. T.; Boz, Y. Cooperative learning instruction for conceptual change in the concepts of chemical kinetics. Chem. Educ. Res. Pract. 2012, 13, 221−236. (38) Lyon, D. C.; Lagowski, J. J. Effectiveness of facilitating small-group learning in large lecture classes: A General chemistry case study. J. Chem. Educ. 2008, 85, 1571−1576. (39) Mohamed, A. R. Effects of active learning variants on student performance and learning perception. IJ-SOTL 2008, 2 (2), 1−14. (40) O'Dwyer, A.; Childs, P. Organic chemistry in action! What is the reaction? J. Chem. Educ. 2015, 92, 1159. (41) Shatila, A. Assessing the Impact of Integrating POGIL in Elementary Organic Chemistry. PhD Thesis, University of Southern Mississippi, Hattiesburg, MS, 2007. (42) Tarhan, L.; Acar Sesen, B. Jigsaw cooperative learning: Acid−base theories. Chem. Educ. Res. Pract. 2012, 13, 307−313. (43) Williamson, V. M.; Rowe, M. W. Group problem-solving versus lecture in college-level quantitative analysis: The good, the bad, and the ugly. J. Chem. Educ. 2002, 79, 1131−1134. (44) Eaton, L. The Effect of Process-Oriented Guided-Inquiry Learning on Student Achievement in a One Semester General, Organic, and Biochemistry Course. Mathematical and Computing Sciences Masters Thesis, Paper 102, St. John Fisher College, Rochester, NY, 2006. (45) Acar, B.; Tarhan, L. Effects of cooperative learning on students' understanding of metallic bonding. Res. Sci. Educ. 2008, 38, 401−420. (46) Oliver-Hoyo, M. T.; Allen, D. D.; Hunt, W. F.; Hutson, J.; Pitts, A. Effects of an active learning environment: Teaching innovations at a research 1 institution. J. Chem. Educ. 2004, 81, 441−448. (47) Shachar, H.; Fischer, S. Cooperative learning and the achievement of motivation and perceptions of students in 11th grade chemistry classes. Learn. Instr. 2004, 14, 69−87. (48) Bradley, A. Z.; Ulrich, S. M.; Jones, M., Jr.; Jones, S. M. Teaching the sophomore organic course without a lecture. Are you crazy? J. Chem. Educ. 2002, 79, 514−519. (49) Bilgin, I. Promoting Pre-service Elementary Students' Understanding of Chemical Equilibrium through Discussions in Small Groups. Int. J. Sci. Math. Educ. 2006, 4, 467−484. (50) Huedo-Medina, T. B.; Sanchez-Meca, J.; Marin-Martinez, F.; Botella, J. Assessing heterogeneity in meta-analysis: Q statistic or I2 index? Psych. Methods 2006, 11, 193−206. (51) Publication Bias in Meta-Analysis: Prevention, Assessment, and Adjustments; Rothstein, H. R., Sutton, A. J., Borenstein, M., Eds.; John Wiley: Chichester, UK, 2005. (52) Duval, S.; Tweedie, R. A nonparametric "trim and fill" method of accounting for publication bias in meta-analysis. J. Am. Stat. Assoc. 2000, 95, 89−98.

(53) Viechtbauer, W.; Cheung, M. W. L. Outlier and influence diagnostics for meta-analysis. Res. Syn. Meth. 2010, 1, 112−125. (54) Cooper, M. Cooperative Learning: An Approach for Large Enrollment Courses. J. Chem. Educ. 1995, 72, 162−164. (55) Warfa, A. M.; Roehrig, G.; Schneider, J.; Nyachwaya, J. The Role of teacher-initiated discourses in students’ development of representational fluency in chemistry − A case study. J. Chem. Educ. 2014, 91, 784− 792. (56) Warfa, A. M.; Roehrig, G.; Schneider, J.; Nyachwaya, J. Collaborative discourses and the modeling of solution chemistry − impact and characterization. Chem. Educ. Res. Pract. 2014, 15, 835−848. (57) Becker, N.; Rasmussen, C.; Sweeney, G.; Wawro, M.; Towns, M.; Cole, R. Reasoning using particulate nature of matter: An example of a sociochemical norm in a university-level physical chemistry class. Chem. Educ. Res. Pract. 2013, 14, 81−94.
