Article Cite This: J. Chem. Educ. XXXX, XXX, XXX−XXX

pubs.acs.org/jchemeduc

What Prospective Chemistry Teachers Know about Chemistry: An Analysis of Praxis Chemistry Subject Assessment Category Performance

Lisa Shah,† Jeremy Schneider,† Rebekah Fallin,⊥ Kimberly Linenberger Cortes,§ Herman E. Ray,⊥,∥ and Gregory T. Rushton*,†,‡

†Department of Chemistry, ‡Institute for STEM Education, Stony Brook University, Stony Brook, New York 11794, United States
⊥Department of Statistics and Analytical Science, §Department of Chemistry and Biochemistry, ∥Analytics and Data Science Institute, Kennesaw State University, Kennesaw, Georgia 30144, United States




Supporting Information

ABSTRACT: Acquiring sufficient content knowledge to lead effectively in the classroom is one of the greatest challenges for beginning teachers. National and state agencies have made significant investments in content-specific induction supports, but these efforts have not been informed by empirical evidence regarding the topic-level content knowledge of novice teachers. Here we analyze category-level data from the Praxis Chemistry Subject Assessment from June 2006 to May 2016 to determine the areas of general strength and weakness among examinees and explore differences in categorical performance by test-taker demographics. Examinees have generally performed well in the area of “Atomic and Nuclear Structure” and appear to have struggled most in the area of “Solutions and Solubility; Acid−Base Chemistry”. Across categories, estimates of academic preparation (e.g., undergraduate GPA, undergraduate major, and graduate major) have explained a large proportion of variance in examinee performance, although demographic characteristics such as gender and race or ethnicity were more explanatory in certain categories, such as “Atomic and Nuclear Structure”. Chemistry majors were the top performers in almost all categories, and education majors underperformed, often at the level of non-STEM majors, across all topics. The findings from this work should inform both professional development efforts for beginning teachers and instructional reform at the undergraduate level.

KEYWORDS: Chemistry Education Research, High School/Introductory Chemistry, First-Year Undergraduate/General, Testing/Assessment, Professional Development

FEATURE: Chemical Education Research



INTRODUCTION

New teachers are a vulnerable but important population of educators.1−3 Those just entering the teaching profession are often charged with maintaining general classroom responsibilities while simultaneously reviewing or learning disciplinary content, a combination that can be overwhelming.4,5 Although the experience gained in early years can significantly improve practice in subsequent years, the learning curve for many novice teachers can be steep enough to adversely impact student achievement.6 To ensure that all students have access to highly effective STEM educators, national and state agencies have made substantial investments in STEM-teacher-preparation and -induction programs in the past two decades, which focus on providing aspiring and beginning teachers with the resources needed for successful integration into the profession. Efforts by initiatives such as the Robert Noyce Teacher Scholarship Program (Noyce), the UTeach Institute, and the Physics Teacher Education Coalition (PhysTEC) have included high-quality mentorship, online and in-person workshops, and individual coaching.7−9 At the core of these strategies is an emphasis on discipline-specific (e.g., chemistry) support, structuring professional development (PD) around curricular topics in each content area. These efforts, however, often adopt a one-size-fits-all approach to PD, such that teachers of a particular subject are offered the same workshops and seminars, irrespective of differences in disciplinary backgrounds or professional experience.



THEORETICAL FRAMEWORK

Content knowledge has long been recognized as the cornerstone of effective teaching. Shulman10 argued that robust content knowledge (CK) was a required foundation upon which strong pedagogical content knowledge (PCK) could then be developed, but that CK remained absent from the field of education research at the time of his work.11 This call for greater emphasis on teacher content knowledge seems to have been answered, as it has more recently become a focal point for educational researchers worldwide. Researchers have investigated the links between CK and teaching practice,12,13 CK and cognitive processes,14 and CK and student achievement.15 Results of these studies have indicated that teachers with stronger CK, often because they have pursued an undergraduate major or minor or possibly a graduate degree in their discipline, are more likely to be effective practitioners. Proposed explanations for these results have included greater knowledge of student misconceptions,15 more sophisticated lesson planning and structuring,16 and improved ability to verbalize and model concepts.17

However, studies examining the content backgrounds of our nation’s chemistry teachers have uncovered the prevalence of out-of-field teaching (i.e., the assignment of a teacher to a subject for which they do not have adequate content expertise) in the discipline. Our own work in this area has revealed that fewer than 35% of all chemistry teachers reported having undergraduate majors or minors in chemistry in 2007 and 2008.18 This number has more recently risen, although significant differences in this percentage between “high-need” (40% with majors or minors) and “non-high-need” (50% with majors or minors) schools are apparent.19 Given that the majority of K−12 chemistry educators are reportedly teaching the subject without having pursued a chemistry major or minor, it may be worthwhile for teacher preparation and retention initiatives to concentrate efforts where teachers are likely to need the greatest content-specific support. However, studies have not investigated the individual differences in knowledge of chemical concepts of prospective chemistry teachers to purposefully guide such efforts.

Furthermore, research has documented apparent differences in the CK of prospective educators with varying demographic characteristics. Analysis of the content knowledge of aspiring teachers offers a snapshot of what subject-level expertise novice educators bring with them on day 1 in the classroom and can provide the data needed to guide early interventions. Our previous study of the Praxis Chemistry Subject Assessment, an exam designed to measure the incoming content knowledge of aspiring teachers, revealed substantial differences in overall exam performance among candidates across demographic characteristics in every year between 2006 and 2016.20 Of note, male examinees outperformed female examinees in each year studied, even after controlling for both GPA and undergraduate major. Additionally, White and Asian test-takers outperformed Black and Hispanic examinees by as much as 25 scaled points during this 10-year period. Chemistry majors, encouragingly, outperformed all other majors on this exam. However, the performance of nonmajors cannot be ignored, given that, for example, biology and education majors comprise approximately 35% and 15% of chemistry educators nationwide, respectively.18 Differences in CK as a function of these demographic traits likely require that those involved in teacher preparation make intentional strides toward offering high-quality, inclusive chemistry education for all prospective chemistry teachers and provide early PD opportunities for novice educators. However, research has not yet documented how relevant faculty, administrators, and/or PD facilitators might tailor their efforts for specific subpopulations of the chemistry teacher workforce. To know, for example, the specific content areas in which prospective teachers of various demographic backgrounds demonstrate strength or weakness would allow reform efforts (especially those targeting at-risk populations) to adopt a more individualized approach to improving the CK of both pre- and in-service educators.

Received: May 16, 2018
Revised: August 30, 2018
DOI: 10.1021/acs.jchemed.8b00365



RATIONALE AND RESEARCH QUESTIONS

The Praxis Chemistry Subject Assessment is the most nationally representative evaluation of the subject-matter knowledge of prospective chemistry teachers in the United States.21 In the past decade, the exam has played a major role in the chemistry-teacher-certification process in 38 U.S. states and Washington, DC, assessing prospective teachers’ knowledge in seven different categories:

• Basic Principles of Matter and Energy; Thermodynamics
• Atomic and Nuclear Structure
• Nomenclature; Chemical Composition; Bonding and Structure
• Chemical Reactions; Periodicity
• Solutions and Solubility; Acid−Base Chemistry
• Scientific Inquiry and Social Perspectives of Science
• Scientific Procedures and Techniques

The purpose of this study was to empirically report the categorical content knowledge of aspiring chemistry teachers and compare topic-specific performances of examinees across demographic characteristics. Using the entire population of Praxis Chemistry Subject Assessment examinees from June 2006 to May 2016, the following research questions were investigated:

(1) How have examinees performed as a whole in each category on the Praxis Chemistry Subject Assessment between 2006 and 2016?
(2) Which demographic characteristics have been associated with examinee performance in each category between 2006 and 2016? What have been the relative category-level performances of examinees of varying characteristics?



ANALYTICAL METHODOLOGY

Praxis Chemistry Subject Assessment

The Praxis Subject Assessments20 are administered by the Educational Testing Service (ETS) in a number of subjects (e.g., chemistry, physics, and biology) and are intended to evaluate the subject-matter knowledge of prospective teaching candidates. As has been reported previously,21−23 questions are designed by experts in each field and subsequently reviewed by ETS before administration.

Study Population. The analyses presented in this study were performed on all Praxis Chemistry Subject Assessment examinees between June 2006 and May 2016, the large majority of whom likely represent prospective chemistry educators. Each test-taker is assigned a unique identification number, and all demographic information (e.g., gender, undergraduate major, race or ethnicity, and undergraduate GPA) is extracted from self-reported responses to a questionnaire following the assessment. A detailed list of extracted demographic information is provided in Table S1. For repeat examinees, only the highest score earned was included in the analytical population. Further, pairwise deletion was used to treat missing responses so as to make use of all available data. The final data set comprised a total of 17,527 individuals who had taken the exam in this time frame.24 As this data set encompasses the entire test-taking population for this particular exam, metrics resulting from statistical estimates (e.g., p-values) are not necessary.25 Therefore, any nominal differences between means within each category ought to be interpreted as meaningfully different.

Figure 1. Praxis Chemistry Subject Assessment category performance, derived from data provided by the Educational Testing Service. The pie chart depicts category weights (lighter colors, outer percentages) and average candidate performance (darker colors, inner percentages).

Estimation of Categorical Percent Correct. Although category-level data provided by ETS included the number of correctly answered items per category for each test-taker, the total number of items in each category (which can vary from one exam administration to the next) was not provided. To estimate this value and a corresponding percentage score for each category, the highest number of correct items reported for any individual on a particular test form (i.e., version of the exam) was imputed as the total number of questions. This procedure was repeated for each of the 12 test forms, and a weighted average of percent correct per test form (based on the number of examinees per test form) was used to generate estimated percent correct values per category.

Estimation of Scaled Points Lost per Category. In the time frame considered in this study, 12 different test forms were administered. The variation in the number and difficulty of exam items across forms precludes the use of raw numbers of correct items (or corresponding percentages) when comparing relative performance across years or between candidates. ETS adjusts for exam difficulty when reporting total performance by converting raw scores to scaled scores (ranging from 100 to 200);26 however, these protocols are not adopted at the category level. By plotting individuals’ scaled scores as a function of their percentage scores for each individual test form, we noted a strong linear correlation between the two (a hypothetical distribution is shown in Figure S1). Because scaled scores of 100 and 200 generally correspond to several of the lowest and highest percentage scores, respectively, these values were excluded from our analysis to improve the predictive capacity of our linear model for all other scores. If aggregate scaled scores could be modeled by eq 1:

$$SC_{ag} = m(RS_{ag}) + b \tag{1}$$

where $SC_{ag}$ is the aggregate scaled score, $RS_{ag}$ is the aggregate raw score, $m$ is the slope, and $b$ is the y-intercept obtained from the linear fit, then the raw category scores (a subset of the aggregate raw score) could replace the aggregate raw score as shown in eq 2:

$$SC_{ag} = m(RS_{C1}) + m(RS_{C2}) + \cdots + b \tag{2}$$

where $RS_{C1}$ and $RS_{C2}$ represent the individual raw scores for categories 1 and 2, respectively. Repeating this procedure for each of the 12 test forms yielded an average R2 value of 0.998142 (Table S2). However, the ambiguity in the partial contribution of the y-intercept term to each category makes it difficult to conclusively calculate absolute scaled scores for each category. Given that our interest was primarily in the relative performance of examinees across demographics, a more accurate estimation was achieved by calculating the scaled points lost per category (eq 3), which eliminated the need to consider the y-intercept value (as it cancels out in the subtraction):

$$SPL_{C1} = m(TQ_{C1}) - m(CQ_{C1}) \tag{3}$$

where $SPL_{C1}$ is the scaled points lost in category 1, $TQ_{C1}$ is the total number of questions in category 1, and $CQ_{C1}$ is the number of correctly answered questions in category 1. An important limitation of this approach should be noted: although performance between demographic groups can be compared within a single category (e.g., males vs females in “Atomic and Nuclear Structure”), performance cannot be compared between demographic groups across categories (e.g., males in “Atomic and Nuclear Structure” vs males in “Chemical Reactions; Periodicity”).

ANOVA. To identify which of the 16 demographic characteristics most impacted examinee performance at the category level (i.e., number of scaled points lost per category), analysis of variance (ANOVA) calculations were performed for each of the 7 categories. The Bonferroni correction was applied to alleviate concerns with Type I error,27 producing a corrected α for 112 ANOVA comparisons of 0.00045.25 The results of the ANOVA for each category are reported in Tables S3−S9. For each category, the variables explaining more than 2% of the variance in scaled points lost (i.e., partial η2 values >0.02) are graphically represented below with categories listed in order from largest total η2 to smallest, and individual variables are displayed in order from largest partial η2 to smallest.

FINDINGS

Table 1. Total and Partial η2 Results of One-Way ANOVA Analyses for Examinee Performance in Each Category of the Praxis Chemistry Subject Assessment^a

Demographic Variable                               Partial η2     F(Between, Within)

Chemical Reactions; Periodicity
  undergraduate major                              0.055575***    F(4, 13457) = 197.912
  graduate major                                   0.048184***    F(4, 10268) = 129.949
  education level                                  0.03628***     F(3, 16792) = 200.638
  undergraduate GPA                                0.028295***    F(2, 16773) = 244.206
  geographic area in which teacher plans to teach  0.020527***    F(3, 16690) = 88.012
  gender                                           0.020355***    F(1, 17454) = 362.667
  total η2                                         0.30878

Nomenclature; Chemical Composition; Bonding and Structure
  undergraduate major                              0.062278***    F(4, 13457) = 223.369
  graduate major                                   0.042332***    F(4, 10268) = 113.470
  education level                                  0.029074***    F(3, 16792) = 159.596
  undergraduate GPA                                0.02228***     F(2, 16773) = 191.113
  total η2                                         0.269169

Solutions and Solubility; Acid−Base Chemistry
  undergraduate major                              0.044394***    F(4, 13457) = 156.244
  graduate major                                   0.035808***    F(4, 10268) = 95.333
  education level                                  0.033623***    F(3, 16792) = 185.434
  gender                                           0.02267***     F(1, 17454) = 404.869
  undergraduate GPA                                0.021737***    F(2, 16773) = 186.350
  total η2                                         0.265909

Basic Principles of Matter and Energy; Thermodynamics
  undergraduate major                              0.023673***    F(4, 13457) = 110.103
  race or ethnicity                                0.026897***    F(3, 16670) = 172.935
  total η2                                         0.232583

Scientific Procedures and Techniques
  race or ethnicity                                0.036644***    F(3, 16670) = 211.363
  undergraduate major                              0.034883***    F(4, 13457) = 121.560
  graduate major                                   0.019783***    F(4, 10268) = 51.808
  total η2                                         0.197325

Scientific Inquiry, Processes, and Social Perspectives
  race or ethnicity                                0.040258***    F(3, 16670) = 233.028
  gender                                           0.026063***    F(1, 17454) = 467.083
  education level                                  0.02353***     F(3, 16792) = 128.427
  total η2                                         0.18615

Atomic and Nuclear Structure
  undergraduate major                              0.031699***    F(4, 13457) = 92.735
  race or ethnicity                                0.030183***    F(3, 16670) = 139.473
  graduate major                                   0.028766***    F(4, 10268) = 51.667
  education level                                  0.027436***    F(3, 16792) = 84.645
  undergraduate GPA                                0.026897***    F(2, 16773) = 234.381
  gender                                           0.023673***    F(1, 17454) = 185.991
  total η2                                         0.166465
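As a rough illustration of the quantities in Table 1, the sketch below recovers an η2 value from a reported F statistic and its degrees of freedom, using the standard identity η2 = F·df_between / (F·df_between + df_within), and computes a Bonferroni-corrected α. The function names are hypothetical, and the familywise α of 0.05 is an assumption (the text reports only the corrected value for 112 comparisons). The identity is exact for a one-way design, so partial η2 values taken from a model with several variables may differ slightly.

```python
def eta_squared_from_f(f_stat: float, df_between: int, df_within: int) -> float:
    """Effect size implied by a one-way ANOVA F statistic.

    Rewrites eta^2 = SS_between / (SS_between + SS_within) in terms of
    F = (SS_between / df_between) / (SS_within / df_within).
    """
    if f_stat < 0 or df_between <= 0 or df_within <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    explained = f_stat * df_between
    return explained / (explained + df_within)


def bonferroni_alpha(familywise_alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under the Bonferroni correction."""
    if n_comparisons <= 0:
        raise ValueError("n_comparisons must be positive")
    return familywise_alpha / n_comparisons


if __name__ == "__main__":
    # Gender in "Chemical Reactions; Periodicity": F(1, 17454) = 362.667 (Table 1).
    print(eta_squared_from_f(362.667, 1, 17454))
    # 16 variables x 7 categories = 112 comparisons; 0.05 is an assumed familywise alpha.
    print(bonferroni_alpha(0.05, 16 * 7))
```

For the gender row, the first call gives roughly 0.0204, in line with the tabulated partial η2 of 0.020355, and the second gives a threshold that rounds to the 0.00045 stated in the methodology.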

Overall Category Performance

The average percent composition of each of the 7 Praxis Chemistry Subject Assessment categories from 2006 to 2016 (lighter shading, outer percentages) is shown in Figure 1. Within each “slice”, the darker shading (represented by the inner percentages) indicates the average percentage of items correctly answered by examinees. During this time, categories have been similarly weighted (10−16%), although questions on “Chemical Reactions; Periodicity” have composed the largest percentage of items (23%). Examinees have generally demonstrated the highest performance in the category “Atomic and Nuclear Structure”, earning an estimated 75% correct in this topic. “Solutions and Solubility; Acid−Base Chemistry” appears to have been the most challenging category for examinees, with an estimated 55% correct during this time frame.

Individual Category Performance

To determine whether demographic characteristics that were academic (e.g., undergraduate major, GPA) or nonacademic (e.g., race or ethnicity, gender) in nature differentially explained the variance in scores at the category level, ANOVA was performed using examinee responses to the demographic questionnaire administered with the exam and their category-level performance (i.e., scaled points lost). The most impactful demographic characteristics within each category (those with partial η2 values >0.02, Table 1) and the relative performances of examinees of varying demographic backgrounds in these categories are reported for each category in Figures 2−8. Total η2 values for each category represent small effect sizes (i.e., 0.10−0.30), apart from that for “Chemical Reactions; Periodicity”, which falls in the lower bound of the medium range (Table 1). Partial η2 values for a number of key demographic variables shown in Table 1 (the complete listings are reported in Tables S3−S9) span the small-to-medium range, though the variable of undergraduate major in the category of “Nomenclature; Chemical Composition; Bonding and Structure” has a medium effect size. Differences in performance for demographic groups across categories did not vary appreciably from year to year (data not shown); thus, aggregate data (i.e., June 2006−May 2016) are reported in the analyses below.

In the area of “Chemical Reactions; Periodicity”, the 16 reported demographic variables collectively explained 31% of the total variance in category scores. Undergraduate major, graduate major, education level, undergraduate GPA, the geographic locale in which examinees expected to teach, and gender explained 6, 5, 4, 3, 2, and 2% of the total variance in scores in this area, respectively (Table 1).
Examinees with undergraduate majors in chemistry earned the highest scores, scoring an average of 2, 3, 3, and 4 scaled points more than “other” STEM, biology, education, and non-STEM majors, respectively (Figure 2). A similar trend in relative performance was observed for graduate majors, although examinees with graduate majors in education performed almost as well as those with “other” STEM graduate majors. Test-takers with bachelor’s degrees, those with master’s degrees, and those without bachelor’s degrees performed similarly, scoring approximately 2 scaled points lower than those with doctoral degrees. Examinees with higher GPAs performed better in this category than those with lower GPAs. With respect to the geographic area in which test-takers intended to teach (i.e., urban, rural, or suburban), those who did not plan to immediately teach performed roughly as well as those intending to teach in suburban districts, scoring approximately 1 and 2 scaled points higher than those intending to teach in urban or rural locales, respectively.

Figure 2. Praxis Chemistry Subject Assessment relative performances (i.e., average numbers of scaled points lost) in “Chemical Reactions; Periodicity” by demographic characteristic, derived from data provided by the Educational Testing Service.

Table 1, footnote a: Demographic variables with partial η2 values above 0.02 are reported in Table 1. For a full list of partial η2 values for all demographic variables, see Supporting Information Tables S3−S9. *, **, and *** correspond to corrected significance levels of 0.00052, 0.00010, and 0.000010, respectively.

The 16 reported demographic variables explained 27% of the variance in scores in the “Nomenclature; Chemical Composition; Bonding and Structure” category (Table 1). Among these, undergraduate major, graduate major, education level, and undergraduate GPA were the most predictive variables, explaining approximately 6, 4, 3, and 2% of this variance, respectively (Table 1). The relative performances of subgroups within these demographic characteristics are shown in Figure 3. Chemistry majors were the highest performers, scoring about 2−2.5 scaled points higher than “other” STEM, biology, and education majors. Non-STEM majors were the lowest performers, scoring roughly 3 points lower than chemistry majors. Relative performance between examinees with various graduate majors exhibited a similar pattern to that observed for undergraduate majors. Trends in average performance as a function of education level followed those reported for “Chemical Reactions; Periodicity”, with doctoral-degree holders scoring the highest. Lastly, examinees with higher GPAs outperformed those with lower GPAs.

Figure 3. Praxis Chemistry Subject Assessment relative performances (i.e., average numbers of scaled points lost) in “Nomenclature; Chemical Composition; Bonding and Structure” by demographic characteristic, derived from data provided by the Educational Testing Service.

As previously noted, “Solutions and Solubility; Acid−Base Chemistry” seemed to be the most challenging topic for examinees during this time frame. ANOVA revealed that 27% of the variance in performance was explained by the 16 demographic factors, with undergraduate major, graduate major, education level, gender, and undergraduate GPA explaining 4, 4, 3, 2, and 2% of the total variance, respectively (Table 1). As shown in Figure 4, scores for examinees with various undergraduate majors mirrored those for graduate majors, with chemistry and “other” STEM majors outperforming education, biology, and non-STEM majors by 1−2 scaled points. Additionally, doctoral-degree holders again outperformed examinees of all other levels of education. On average, males outperformed females by roughly 1 scaled point, and test-takers with higher undergraduate GPAs scored higher than those with lower GPAs.

Figure 4. Praxis Chemistry Subject Assessment relative performances (i.e., average numbers of scaled points lost) in “Solutions and Solubility; Acid−Base Chemistry” by demographic characteristic, derived from data provided by the Educational Testing Service.

In the area of “Basic Principles of Matter and Energy; Thermodynamics”, the 16 reported demographic variables collectively explained 23% of the total variance in category scores. Undergraduate major and race or ethnicity partially explained 3 and 2% of the variance, respectively. Test-takers with undergraduate majors in chemistry or “other” STEM disciplines (e.g., physics, mathematics, and engineering) outperformed education and biology majors (who performed at the levels of examinees with non-STEM majors) by roughly 2 scaled points (Figure 5). Additionally, Black examinees were outperformed, on average, by White and Hispanic test-takers, as well as those of “other” races or ethnicities, by approximately 2−3 scaled points in this category.

Figure 5. Praxis Chemistry Subject Assessment relative performances (i.e., average numbers of scaled points lost) in “Basic Principles of Matter and Energy; Thermodynamics” by demographic characteristic, derived from data provided by the Educational Testing Service.

In the “Scientific Procedures and Techniques” category, the 16 demographic variables together explained 20% of the overall variance in topic-specific scores (Table 1). Race or ethnicity explained the largest percentage of the variance (3.6%, Table 1), as Black test-takers scored 2.5 scaled points lower, on average, than White test-takers. Hispanic examinees and those of other races or ethnicities scored an average of 1 scaled point lower than White examinees (Figure 6). Examinees’ undergraduate and graduate majors explained 3 and 2% of the overall variance, respectively, in this category. Similar patterns of relative performance were observed for these demographic variables, with chemistry and “other” STEM majors performing roughly 1 scaled point higher than those with education, biology, and non-STEM majors.

Figure 6. Praxis Chemistry Subject Assessment relative performances (i.e., average numbers of scaled points lost) in “Scientific Procedures and Techniques” by demographic characteristic, derived from data provided by the Educational Testing Service.

The 16 reported demographic variables explained 19% of the total variance in category scores in the area of “Scientific Inquiry and Social Perspectives of Science”, with race or ethnicity, gender, and education level explaining 4, 3, and 2% of the total variance, respectively (Table 1). White test-takers scored approximately 1 scaled point higher than Hispanic examinees and those of other races or ethnicities, and roughly 2 scaled points higher than Black test-takers in this category (Figure 7). Additionally, male examinees outperformed females, on average, by 1 scaled point, and doctoral-degree holders again outperformed test-takers with other levels of education.

Figure 7. Praxis Chemistry Subject Assessment relative performances (i.e., average numbers of scaled points lost) in “Scientific Inquiry and Social Perspectives of Science” by demographic characteristic, derived from data provided by the Educational Testing Service.

In the category of “Atomic and Nuclear Structure”, 17% of the total variance in scores could be explained by the 16 demographic variables. Test-takers’ undergraduate GPAs, undergraduate majors, races or ethnicities, graduate majors, and education levels each explained 3% of the total variance in scores in this topic, whereas gender explained 2% (Table 1). Performance increased with increasing undergraduate GPA, and the relative performances of examinees with various undergraduate and graduate majors mirrored the findings noted in the category of “Basic Principles of Matter and Energy; Thermodynamics” (Figure 8). Black test-takers lost an average of 2.5 scaled points more than White examinees, and Hispanic test-takers as well as those of “other” races or ethnicities performed similarly to their White counterparts. Doctoral-degree holders scored 1−1.5 scaled points higher than examinees with other or no reported educational degrees, and these lower-scoring groups performed similarly to one another in this area, as seen in the previous categories. Lastly, males outperformed females, on average, in this category, scoring approximately 1.5 scaled points higher.
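The relative performances reported above are expressed in scaled points lost, as defined in eq 3 of the methodology, alongside the estimated percent correct. A minimal sketch of both calculations for a single test form follows. The helper names, the plain least-squares fit, and the toy data are illustrative assumptions rather than the authors' code. Following the text, the number of questions in a category is imputed as the highest number correct observed on the form, and scaled scores of 100 and 200 are excluded before fitting.

```python
from typing import Sequence


def fit_slope(raw: Sequence[float], scaled: Sequence[float]) -> float:
    """Least-squares slope m of scaled = m * raw + b for one test form.

    Scaled scores of 100 and 200 are dropped first, as in the text, because
    several raw scores collapse onto those floor and ceiling values.
    """
    pairs = [(r, s) for r, s in zip(raw, scaled) if 100 < s < 200]
    if len(pairs) < 2:
        raise ValueError("need at least two interior points to fit a line")
    n = len(pairs)
    mean_r = sum(r for r, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    sxx = sum((r - mean_r) ** 2 for r, _ in pairs)
    if sxx == 0:
        raise ValueError("raw scores must not all be identical")
    sxy = sum((r - mean_r) * (s - mean_s) for r, s in pairs)
    return sxy / sxx


def imputed_total(correct: Sequence[int]) -> int:
    """Impute the category's question count as the highest number correct."""
    total = max(correct)
    if total <= 0:
        raise ValueError("at least one examinee must have a positive score")
    return total


def scaled_points_lost(m: float, correct: Sequence[int]) -> list[float]:
    """Eq 3 for one category: SPL = m * TQ - m * CQ for each examinee."""
    total = imputed_total(correct)
    return [m * total - m * c for c in correct]


def percent_correct(correct: Sequence[int]) -> float:
    """Mean estimated percent correct in one category on one form."""
    total = imputed_total(correct)
    return 100.0 * sum(correct) / (len(correct) * total)


if __name__ == "__main__":
    # Toy data for one hypothetical form (not real Praxis data).
    raw_total = [40, 55, 70, 85, 100]
    scaled_total = [100, 130, 150, 170, 200]
    category_correct = [6, 9, 12, 14, 15]

    slope = fit_slope(raw_total, scaled_total)
    print(slope)
    print(scaled_points_lost(slope, category_correct))
    print(percent_correct(category_correct))
```

Because the intercept cancels in eq 3, only the slope is needed. Across forms, the text combines the per-form percentages with an average weighted by the number of examinees per form, which this single-form sketch does not attempt.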

DISCUSSION

Our topic-specific analysis of Praxis Chemistry Subject Assessment scores from 2006 to 2016 unearths several important findings about a large proportion of our nation’s prospective chemistry educators and the implications of such on instructional reform, chemistry-teacher professional-development (PD) efforts, and G

DOI: 10.1021/acs.jchemed.8b00365 J. Chem. Educ. XXXX, XXX, XXX−XXX

Journal of Chemical Education

Article

Figure 8. Praxis Chemistry Subject Assessment relative performances (i.e., average numbers of scaled points lost) in “Atomic and Nuclear Structure” by demographic characteristic, derived from data provided by the Educational Testing Service.

future research in this area. “Solutions and Solubility; Acid−Base Chemistry” has been the most challenging category for examinees (Figure 1), which is consistent with literature that cites the particular difficulty of these topics for chemistry students.28−34 As examinee performance is possibly an indication of what students retained from their university-level coursework (typically just a few semesters removed), these are areas in which instructional strategies that promote deep conceptual understanding may be especially beneficial. Higher-education faculty should also note that more than 80% of examinees report earning GPAs between 3.0 and 4.0 (Table S1), suggesting that even top students have struggled to learn or retain knowledge in this area. Our results indicate that these content areas may be a worthwhile place to begin when considering reform. Encouragingly, demographic characteristics indicative of academic achievement or preparedness (i.e., undergraduate major, graduate major, undergraduate GPA, and education level) explained the largest proportion of the variance in category scores in two of the seven categories (“Chemical Reactions; Periodicity” and “Nomenclature; Chemical Composition; Bonding and Structure”, Table 1). These results suggest that this assessment may appropriately discriminate between test-takers in this category on the basis of knowledge alone, possibly validating reform-based instructional approaches that have concentrated efforts in these areas. 
Of note, however, are the remaining five topics, in which nonacademic descriptors (e.g., race or ethnicity and gender) either explained more of the variance in category scores (i.e., had the largest partial η² values) than any academic predictor (“Scientific Inquiry and Social Perspectives of Science” and “Scientific Procedures and Techniques”) or explained a significant proportion of the variance (“Solutions and Solubility; Acid−Base Chemistry”, “Basics of Matter and Energy; Thermodynamics”, and “Atomic and Nuclear Structure”). Our findings suggest that undergraduate education in these particular areas may be an appropriate initial target for more diverse and inclusive instructional strategies.

It is noteworthy that chemistry majors are the highest performers on average and across topics. These findings further support policy decisions to ensure that teachers have adequate content expertise in their subject area. However, education majors continue to perform at the level of non-STEM majors in all of the discipline-specific content categories, which is consistent with findings from other reported STEM Praxis Subject Assessment analyses.35,36 More synergistic relationships between colleges of education and disciplinary departments may be key in ensuring that this population of majors, who are the most likely to pursue teaching careers, are at least as prepared as disciplinary majors in their content area. Efforts to house chemistry education majors in chemistry departments, for example, may ensure that these students are held to the same academic standards as disciplinary majors. Additionally, as a large proportion of chemistry educators are reported to have biology- and STEM-education backgrounds,18,19 content-specific PD in categories of relative weakness, particularly those of “Nomenclature; Chemical Composition; Bonding and Structure” and “Chemical Reactions; Periodicity”, might serve to better support these particular out-of-field teachers in improving their practice. A more individualized approach to this type of PD might be achieved at the level of individual examinees or schools, who ought to reflect on prospective teachers’ category-level performance to guide educators toward specific PD opportunities that focus on their own content areas of weakness. Additionally, the efforts of national- and state-level initiatives like Noyce, UTeach, and PhysTEC may be similarly influenced by these results in their design and implementation of content-specific PD.

Further, results from this study may guide future research investigations of racial and gender achievement gaps. For example, race or ethnicity and gender explained the largest proportion of variance in the “Scientific Inquiry and Social Perspectives of Science” category.
Instruction and curricula in this topic area may present particular challenges to at-risk populations, and it would be worthwhile to investigate whether culturally relevant and inclusive curricula might be beneficial in educating students about the nature of our discipline. Overall, the findings from this study should inform policy decisions aimed at improving the quality and diversity of the chemistry-teaching workforce.
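
The variance comparisons in this discussion rest on partial η², which for a given predictor is the ratio of its sum of squares to the sum of its own and the error sums of squares. As a minimal sketch of that calculation, assuming the sums of squares are read from an ANOVA table, the Python snippet below ranks predictors by partial η². The function names and the numerical sums of squares are hypothetical and chosen only for illustration; they are not values from this study.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    if ss_effect < 0 or ss_error < 0:
        raise ValueError("sums of squares must be non-negative")
    total = ss_effect + ss_error
    if total == 0:
        raise ValueError("sums of squares cannot both be zero")
    return ss_effect / total


def rank_predictors(ss_effects, ss_error):
    """Return (predictor, partial eta squared) pairs, largest effect first."""
    sizes = {
        name: partial_eta_squared(ss, ss_error)
        for name, ss in ss_effects.items()
    }
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sums of squares for a single category (illustration only).
example_ss = {
    "undergraduate major": 30.0,
    "gender": 10.0,
    "undergraduate GPA": 20.0,
}
ranking = rank_predictors(example_ss, ss_error=170.0)
for name, size in ranking:
    print(f"{name}: partial eta squared = {size:.3f}")
```

Because each predictor is compared only against the error term, the partial η² values of different predictors need not sum to 1, which is why the discussion compares them by rank within a category rather than as shares of a whole.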

LIMITATIONS

As effective teaching is impacted by several constructs not examined here (e.g., pedagogical content knowledge and administrative support), claims about which types of prospective teachers will be more effective than others cannot be made from our data. As previously noted, content knowledge is critical, but it is simply one factor in the complex process of development into an effective educator. Additionally, although it is reasonable to assume that the overwhelming majority of Praxis Chemistry Subject Assessment examinees are aspiring teachers, there is likely some fraction of test-takers who may have been motivated to take the exam for other reasons, which may influence our conclusions. Our estimation of scaled points lost per category is just that (i.e., an estimate), and the truncation of the highest and lowest scores from our linear-regression analysis in estimating these values may have impacted our results and interpretation. For example, those with scaled scores of 100 may have earned nearly perfect scores in some categories but much lower scores in other categories.

The lack of access to individual assessment items further limits the conclusions that can be drawn from this study. Although the constructs measured in Praxis Chemistry Subject Assessment items are consistent with the title of the encapsulating category, without access to the wording and content of these questions it is difficult to know to what extent each topic is covered and, therefore, to accurately assess what specific concepts cause examinees to struggle. This is particularly true for categories that measure multiple constructs (e.g., “Nomenclature; Chemical Composition; Bonding and Structure”). Analysis of examinee performance at the item level would be a worthwhile next step for those with access to the assessment questions. Additionally, because of differences in category weights and numbers of items, it is difficult to compare absolute scaled points lost across categories.
Lastly, several of the most populous U.S. states administer individual state assessments in place of the Praxis Chemistry Subject Assessment, which limits a more complete understanding of the categorical content knowledge of prospective chemistry teachers nationwide.
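
The estimate of scaled points lost described above can be illustrated with a short sketch. It assumes, as the text implies but does not spell out, that raw points were mapped to scaled points with a least-squares line fitted after dropping scores at the floor and ceiling of the scale. The 100 to 200 score range, the function names, and the data are assumptions made for illustration; this is not the authors' procedure or ETS data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("raw scores must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scaled_points_per_raw_point(raw, scaled, floor=100, ceiling=200):
    """Slope of scaled ~ raw, excluding scores at the assumed floor or ceiling."""
    kept = [(r, s) for r, s in zip(raw, scaled) if floor < s < ceiling]
    slope, _ = fit_line([r for r, _ in kept], [s for _, s in kept])
    return slope


def estimated_scaled_points_lost(raw_possible, raw_earned, slope):
    """Convert raw points missed in a category into estimated scaled points."""
    return (raw_possible - raw_earned) * slope


# Hypothetical raw and scaled scores; the scaled scores are clipped to 100-200.
raw_scores = list(range(0, 101, 10))
scaled_scores = [min(200, max(100, 80 + 1.5 * r)) for r in raw_scores]

slope = scaled_points_per_raw_point(raw_scores, scaled_scores)
lost = estimated_scaled_points_lost(raw_possible=20, raw_earned=14, slope=slope)
print(f"scaled points per raw point: {slope:.2f}")
print(f"estimated scaled points lost in the category: {lost:.2f}")
```

Scores pinned at the floor or ceiling would flatten the fitted line, which is one reason to exclude them; the cost, as noted above, is that category-level losses for examinees at those extremes are estimated less reliably.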



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available on the ACS Publications website at DOI: 10.1021/acs.jchemed.8b00365.

Full list of demographic characteristics of Praxis Chemistry Subject Assessment examinees from 2006−2016, r² values for raw-to-scaled score conversion, and detailed ANOVA results (PDF, DOCX)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Lisa Shah: 0000-0001-8700-0924
Gregory T. Rushton: 0000-0002-8687-132X

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

The authors would like to thank Jonathan Steinberg of the Educational Testing Service. This work was funded through a National Science Foundation Award to Kennesaw State University (DUE-1557285) and Stony Brook University (DUE-1557292). The opinions set forth in this publication are those of the authors and do not necessarily represent those of the NSF or ETS.

REFERENCES

(1) Ingersoll, R. M. Teacher Turnover and Teacher Shortages: An Organizational Analysis. Am. Educ. Res. J. 2001, 38 (3), 499−534.
(2) Harris, D. N.; Adams, S. J. Understanding the Level and Causes of Teacher Turnover: A Comparison with Other Professions. Econ. Educ. Rev. 2007, 26 (3), 325−337.
(3) Borman, G. D.; Dowling, N. M. Teacher Attrition and Retention: A Meta-Analytic and Narrative Review of the Research. Rev. Educ. Res. 2008, 78 (3), 367−409.
(4) Reynolds, A. What Is Competent Beginning Teaching? A Review of the Literature. Rev. Educ. Res. 1992, 62 (1), 1−35.
(5) Adams, P. E.; Krockover, G. H. Concerns and Perceptions of Beginning Secondary Science and Mathematics Teachers. Sci. Educ. 1997, 81 (1), 29−50.
(6) Harris, D. N.; Sass, T. R. Teacher Training, Teacher Quality and Student Achievement. J. Public Econ. 2011, 95 (7−8), 798−812.
(7) Robert Noyce Teacher Scholarship Program. http://nsfnoyce.org/ (accessed Aug 15, 2018).
(8) Marshall, J. Replicating the UTeach Model for 10 Years: Where We Are and What We Have Learned. Bull. Am. Phys. Soc. 2016, 61 (3).
(9) Wolter, M.; Grosnick, D.; Watson, J.; Ober, D.; Smith, W. PhysTEC - An Induction/Mentoring Model for Pre- and In-Service Physics and Science Teachers. Proceedings of the American Physical Society, Ohio Section Spring 2004, Athens, OH, April 16−17, 2004; Abstract B8.007.
(10) Shulman, L. Those Who Understand: Knowledge Growth in Teaching. Educ. Res. 1986, 15 (2), 4−14.
(11) Shulman, L. Knowledge and Teaching: Foundations of the New Reform. Harv. Educ. Rev. 1987, 57 (1), 1−23.
(12) Sanders, L. R.; Borko, H.; Lockard, J. D. Secondary Science Teachers’ Knowledge Base When Teaching Science Courses in and out of Their Area of Certification. J. Res. Sci. Teach. 1993, 30 (7), 723−736.
(13) Luft, J. A.; Roehrig, G. H. Inquiry Teaching in High School Chemistry Classrooms: The Role of Knowledge and Beliefs. J. Chem. Educ. 2004, 81, 1510−1516.
(14) de Jong, O.; Veal, W. R.; Van Driel, J. H. Exploring Chemistry Teachers’ Knowledge Base. In Chemical Education: Towards Research-based Practice; Gilbert, J. K., de Jong, O., Justi, R., Treagust, D. F., van Driel, J. H., Eds.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2002; pp 369−390.
(15) Sadler, P. M.; Sonnert, G.; Coyle, H. P.; Cook-Smith, N.; Miller, J. L. The Influence of Teachers’ Knowledge on Student Learning in Middle School Physical Science Classrooms. Am. Educ. Res. J. 2013, 50 (5), 1020−1049.
(16) Hashweh, M. Z. Effects of Subject-Matter Knowledge in the Teaching of Biology and Physics. Teaching and Teacher Education 1987, 3 (2), 109−120.
(17) Gess-Newsome, J.; Lederman, N. G. Preservice Biology Teachers’ Knowledge Structures as a Function of Professional Teacher Education: A Year-long Assessment. Sci. Educ. 1993, 77 (1), 25−45.
(18) Rushton, G. T.; Ray, H. E.; Criswell, B. A.; Polizzi, S. J.; Bearss, C. J.; Levelsmier, N.; Chhita, H.; Kirchhoff, M. Stemming the Diffusion of Responsibility: A Longitudinal Case Study of America’s Chemistry Teachers. Educ. Res. 2014, 43, 390−403.
(19) Rushton, G. T.; Dewar, A.; Ray, H. E.; Criswell, B. A.; Shah, L. Setting a Standard for Chemistry Education in the Next Generation. ACS Cent. Sci. 2016, 2, 825−833.
(20) ETS convened a National Advisory Committee and conducted a job survey, and revised specifications went into effect in 2014; however, the changes were small enough that the new version of the test was equatable to the old version, avoiding rescaling and allowing equated scaled scores from the old and new versions to be compared.

(21) Shah, L.; Hao, J.; Schneider, J.; Fallin, R.; Linenberger Cortes, K.; Ray, H. E.; Rushton, G. T. Repairing Leaks in the Chemistry Teacher Pipeline: A Longitudinal Analysis of Praxis Chemistry Subject Assessment Examinees and Scores. J. Chem. Educ. 2018, 95 (5), 700−708.
(22) Study Companion: Chemistry: Content Knowledge; Educational Testing Service: Princeton, NJ, 2017.
(23) Technical Manual for The Praxis Series and Related Assessments; Educational Testing Service: Princeton, NJ, 2015.
(24) In the time between our previous publication using this data set and this work, updated data sets were obtained such that the available population size increased from 15,564 to 17,527. Given the large size of the population, the relatively proportional increases for various responses to the demographic questionnaire (Table S1), and our own unpublished analyses, this increase does not appreciably change our conclusions in the previous work.
(25) Hair, J. F.; Black, W. C.; Babin, B. J.; Anderson, R. E.; Tatham, R. L. Multivariate Data Analysis; Prentice Hall: Upper Saddle River, NJ, 1998.
(26) Educational Testing Service. Praxis Subject Assessments Overview. https://www.ets.org/praxis/about/subject/ (accessed August 30, 2018).
(27) Vasey, M. W.; Thayer, J. F. The Continuing Problem of False Positives in Repeated Measures ANOVA in Psychophysiology: A Multivariate Solution. Psychophysiology 1987, 24 (4), 479−486.
(28) Pedrosa, M. A.; Dias, M. H. Chemistry textbook approaches to chemical equilibrium and student alternative conceptions. Chem. Educ. Res. Pract. 2000, 1 (2), 227−236.
(29) Raviolo, A. Assessing Students’ Conceptual Understanding of Solubility Equilibrium. J. Chem. Educ. 2001, 78 (5), 629−631.
(30) Demerouti, M.; Kousathana, M.; Tsaparlis, G. Acid-Base Equilibria, Part I. Upper Secondary Students’ Misconceptions and Difficulties. Chem. Educ. 2004, 9 (2), 122−131.
(31) Demerouti, M.; Kousathana, M.; Tsaparlis, G. Acid-Base Equilibria, Part II. Effect of Developmental Level and Disembedding Ability on Students’ Conceptual Understanding and Problem-Solving Ability. Chem. Educ. 2004, 9 (2), 132−137.
(32) Bhattacharyya, G. Practitioner Development in Organic Chemistry: How Graduate Students Conceptualize Organic Acids. Chem. Educ. Res. Pract. 2006, 7 (4), 240−247.
(33) Cooper, M. M.; Kouyoumdjian, H.; Underwood, S. M. Investigating Students’ Reasoning about Acid−Base Reactions. J. Chem. Educ. 2016, 93 (10), 1703−1712.
(34) Shah, L.; Rodriguez, C. A.; Bartoli, M.; Rushton, G. T. Analysing the impact of a discussion-oriented curriculum on first-year general chemistry students’ conceptions of relative acidity. Chem. Educ. Res. Pract. 2018, 19 (2), 543−557.
(35) Shah, L.; Hao, J.; Rodriguez, C. A.; Fallin, R.; Linenberger-Cortes, K.; Ray, H. E.; Rushton, G. T. Analysis of Praxis physics subject assessment examinees and performance: Who are our prospective physics teachers? Phys. Rev. Phys. Educ. Res. 2018, 14, 010126.
(36) Shah, L.; Butler Basner, E.; Tamayo, M.; Linenberger-Cortes, K.; Ray, H. E.; Rushton, G. T. An Analysis of Praxis Physics Subject Assessment Category Performance: What do our Beginning Physics Teachers Know about Physics? Submitted for publication.
