Refinement of a Chemistry Attitude Measure for College Students


Xiaoying Xu and Jennifer E. Lewis*
Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States

ABSTRACT: This work presents the evaluation and refinement of a chemistry attitude measure, Attitude toward the Subject of Chemistry Inventory (ASCI), for college students. The original 20-item and revised 8-item versions of ASCI (V1 and V2) were administered to different samples. The evaluation of ASCI had two main foci: reliability and validity. This study provides evidence for good reliability and validity of ASCI (V2) scores. Reliability was supported by satisfactory internal consistency and test–retest reliability. The two-factor ("intellectual accessibility" and "emotional satisfaction") correlated structure was supported by confirmatory factor analysis. The two subscales can serve as indicators of the cognition and affect components of attitude, and thus make a good connection with a theoretical framework from psychology. Regarding the attitude–achievement relationship, students' attitudes played a significant role in predicting final achievement in a general chemistry course even when initial ability scores were taken into account. Implications for science education are discussed.

KEYWORDS: First-Year Undergraduate/General, Graduate Education/Research, Chemical Education Research, Testing/Assessment

FEATURE: Chemical Education Research

Published: March 18, 2011

Assessment is an essential component of education. Results from assessments serve diverse functions for diagnosis, placement, prediction, and so forth. Course instructors rely on tests, for example, to obtain information on students' mastery of content knowledge and other contextual variables, such as problem solving and creativity. At another level, most colleges use SAT or ACT test scores for college admissions, and GRE test scores for graduate admissions. Test scores allow comparisons, although a long and vigorous debate has taken place among education stakeholders regarding how to interpret test scores and create education policy. The debate has developed into a national concern since the passage in 2001 of the No Child Left Behind Act (NCLB), which includes a strong focus on assessment. Critics focus on the limitations of tests, potential misinterpretations of test scores, and the unintended consequences of a testing program. It is often difficult for an educator to evaluate the arguments for and against a particular approach. While there are no easy answers, a better understanding of educational measurement theory can provide necessary nuance to discussions of education policy. The full assortment of measurement theories and practices is beyond the scope of this paper, but the basics are pertinent for all of us who teach chemistry and are aware of the need to improve science, technology, engineering, and mathematics (STEM) education outcomes at the national level. For the purpose of assessing both academic achievement and noncognitive variables such as attitude, the first and most important thing is to find "good" assessments. This study will present one way to proceed with this task.

WHY ATTITUDE?

The term "attitude" falls within the purview of "scientific literacy", which is a central goal of science education. Usually, scientific literacy focuses on the cognitive knowledge dimension, as highlighted by the proposition "the scientifically literate person accurately applies appropriate science concepts, principles, laws, and theories in interacting with his [sic] universe".1 However, many science educators emphasize that noncognitive factors such as values and attitudes are important components of science literacy. According to the American Association for the Advancement of Science,2 spelling out the "knowledge, skills, and attitudes all students should acquire as a consequence of their total school experience" is a requirement for a curriculum to be considered as promoting scientific literacy. Here AAAS places attitudes on an equal footing with knowledge and skills. Appropriately, many research studies have investigated students' attitudes toward learning science.3−7 The last thing educators want to see is students scoring high on standard tests, but thinking that science is depressing, boring, or otherwise unpleasant, and never again using their scientific knowledge after it is no longer compulsory to do so. High-quality science courses that promote both content knowledge and a positive attitude toward science are important for students to stay in advanced science programs and to pursue science-related careers.



Science educators have made great efforts to develop innovative programs, with emphasis on problem solving, inquiry-based learning, hands-on activities, real-world contexts, and computer-aided instruction to engage students in the excitement of doing science.8−16 The ideal is a curriculum that supports both gains in content knowledge and positive attitudes toward science. While most practitioners are comfortable creating their own measures of content knowledge, and feel they can recognize a "bad test" or a "bad question" if necessary, fewer are likely to have the necessary skills and comfort level to create and evaluate a measure of attitude. They may turn to existing instruments, and either use these directly or adapt them slightly to fit a particular course. Unfortunately, most existing studies of attitude fail to scrutinize the validity of the scores produced by the instruments they have chosen to use, which can raise questions about study results. Conflicting results, in particular, highlight the importance of using well-characterized instruments, so that the instrument itself can be ruled out as a major source of disagreement. For example, an examination of the literature on attitude and STEM achievement reveals conflicting results.17−22 While some claim a low correlation between attitude and achievement, others claim the two are strongly positively correlated. Is this disagreement an artifact of the way attitude was measured (differently in each case!), or a real difference stemming from the diverse contexts of the studies?

What if, as in the example above, no readily available high-quality instrument exists that has a long history of working well in the relevant context? With much science education research focused on the K-12 environment, it is often difficult to find high-quality instruments that are appropriate for use in the college science classroom. Creating a high-quality instrument is a research project in itself, and one that many faculty would not be interested in undertaking. However, it is possible to take an existing instrument, test it in the desired setting, and quickly optimize it for use. This research report seeks to offer an example of carefully modifying an existing attitude instrument to achieve better quality, while reviewing appropriate strategies for any instrument development process: reliability, validity, factor analysis, and the alignment of any subscales with a theoretical framework. These issues must be addressed in deciding whether an assessment can be considered "good".

Attitude Concept in Social Psychology

Attitude is one of the most important concepts in social psychology, dating back to the ancient philosophers.23 Most psychologists know well the historical definition of attitude:24

[Attitude is a] mental and neural state of readiness to respond, organized through experience, exerting a directive and/or dynamic influence upon the individual's response to all objects and situations with which it is related.

In other words, attitudes are considered as tendencies or predispositions to respond to certain stimuli, and the traditional tripartite model comprises three major types of responses: cognitive, affective, and behavioral.25 Given a particular object about which an attitude exists, "affect" pertains to how people feel about the object (both good and bad feelings), as expressed via physiological activity or overt communication. "Cognition" refers to how people think about the object, that is, knowledge and beliefs about properties of the object (including both favorable and unfavorable judgments). "Behavior" is overt actions with respect to the object as well as intentions to act (again, both positive and negative actions and intentions). The object of attitude can be something material, such as the game of baseball, or something abstract, such as romantic love.

The above definition of attitude is easy to understand and was designed for experimental verification. However, since the early days, many variants have been developed, and there is considerable debate even now about how to describe and measure attitude. While the early definition has its roots in behaviorism, a widely used contemporary definition of attitude is "a psychological tendency that is expressed by evaluating a particular entity with some degree of favor or disfavor".26 This more modern definition, deriving from cognitivism, emphasizes the evaluative nature of attitude, which is further conceptualized as:27

[Attitude comprises] all classes of evaluative responding, whether overt or covert, or cognitive, affective, or behavioral. Evaluation thus encompasses the evaluative aspects of beliefs and thoughts, feelings and emotions, and intentions and overt behavior.

This so-called neotripartite model provides a useful link with previous attitude research for further theory development. The three-component structure common to both definitions is supported by many studies.27−29 A flourishing literature on attitude and related concepts can be found. Because of its different theoretical sources, the attitude concept is rarely precisely defined and is operationalized inconsistently across studies. Even in the early days, researchers used attitude in a vague way that "involve[d] instincts, appetites, habits, emotions, sentiments, opinion, and wishes".30 One focus of study in the last century was how attitudes form and change, as elaborated in theories of cognitive dissonance, self-perception, value-expectancy, self-efficacy, dissonance-reduction, self-regulated learning, planned behavior, and so on.31−35 However, in many other cases, instruments used to collect data regarding attitudes are not based on a specific theory, and evidence of reliability and validity is also lacking, so it is hard to interpret the results of different studies and legitimize their findings.36 With such difficulties even within the field of psychology, it is important to clarify the theoretical basis for attitudinal measures we might choose to use within chemistry education research.

Attitude Concept in Science Education

One of our most important educational goals is to foster students' positive attitudes toward learning science, so they can be inspired to continue to develop scientific literacy throughout their lives.37 The tripartite structure (affective, behavioral, cognitive) of the attitude concept is helpful in considering the ways in which attitudes toward science have been defined. Gardner's definition, "the emotional reactions of students towards science... interest, satisfaction, and enjoyment",37 puts emphasis on the affective aspect. An alternative definition is more cognitive: "a learned, positive, or negative feeling about science that serves as a convenient summary of a wide variety of beliefs about science".38 Both of these definitions view attitude as a unitary concept and ignore its other aspects. We would argue that reuniting the affective and cognitive aspects of attitude within a two-component framework has advantages over a unitary structure in the area of science education. First, educators typically care about both cognitive and affective issues. Just as a person's attitude toward ice cream has a cognitive component (unhealthy, not a part of a balanced diet) and an affective component (yummy!), so students often say science is challenging (cognitive) yet interesting (affective). The affective and cognitive components of attitude remain conceptually distinct. It is helpful to know students' answers to both kinds of questions, rather than lumping them together to get a single attitude score or simply gathering information regarding one or the other.



Table 1. Summary of Administrations for Different Versions of ASCI(a)

ASCI Version | Number of Items | Number of Scales | Time of Administration within School Term | Population | Major Changes
V1(b) | 20 | 5 | Near the end of semester | General chemistry lab at UNH(b) | Original, or V1
V1 | 20 | 5 | 2/2–2/6, 4th week, Sp09 | General chemistry lab I, II at SE(c) | Replication of V1
V2 | 8 | 2 | 3/27, 11th week, Sp09 | General chemistry class at SE(c) | Short form, or V2

(a) Administrations are organized by time. (b) UNH: University of New Hampshire; see ref 42. (c) SE: a large southeastern U.S. public research university.

We would also argue that, when the science subject in general is the object of attitude, instruments that exclude the behavior component are best for many research purposes. Although attitude can be inferred from behavior, behavior is not at the same level of abstraction as cognition and emotion. In that sense, concrete items about behavior on an instrument can make it harder for respondents to focus on accurate reporting for more abstract items relating to emotions or beliefs. Also, because desirable behavior patterns can vary dramatically across instructional settings, it is hard to create behavioral items suitable for different situations in order to estimate attitudinal differences. Furthermore, when attitude is to be treated as an indicator of future behavior, as the Theory of Planned Behavior recommends, excluding behavior from the attitude study is meaningful.39 Therefore, we have chosen the two-component framework for our study. A review of existing instruments from this Journal40−49 failed to produce one that would both fit the two-component theoretical framework and meet modern psychometric standards.50,62 In general, the field is fertile ground for development work, particularly if existing instruments could be easily modified while the quality of validity and reliability evidence associated with them improves. The Attitude toward the Subject of Chemistry Inventory (ASCI)42 was chosen as a model instrument for this approach. A discussion of our review is presented in the online Supporting Information.

RESEARCH GOALS

In undertaking this work we identified these two specific goals:
1. To evaluate the quality of the original ASCI (V1) for our sample and to compare the factor structure of our data set with that in the literature.
2. To evaluate the quality of a revised ASCI (V2) for a new sample.

METHOD

The original ASCI (V1) instrument was administered to two samples of college chemistry students, and the revised ASCI (V2) instrument was administered to a sample of college chemistry students. The responses from all three administrations were evaluated in terms of psychometric properties and compared with literature results where appropriate. The evaluations had two main foci: reliability and validity. This method section is organized into three parts. The first is the summary of general research methods that were common to all administrations. The second is the scale reconstruction, that is, using results from ASCI (V1) to create meaningful subscales aligned with the general theory of the attitude concept. The third is the gathering of additional validity evidence for ASCI (V2) scores, including the discussion of a nomological network and of predictive validity.

General Research Methods

Instruments. For the first administration, ASCI (V1) was used. It has 20 pairs of polar adjectives grouped into 5 subscales to tap students' attitudes toward chemistry in general. For the second administration, the revised ASCI (V2) was used. ASCI (V2) (see the online Supporting Information) is a short form of V1. It has 8 adjective pairs organized in 2 subscales.

Participants. Information about the participants for the two administrations is summarized in Table 1. ASCI (V1) was given in all the sections of a general chemistry laboratory I and a general chemistry laboratory II course at a large southeastern public research university during the 4th week of the Spring 2009 semester. ASCI (V2) was given to all the general chemistry I discussion sections during the 11th week of the Spring 2009 semester (two days after the third term exam and after the course drop date) at the same institution.

Data Collection and Screening. For each administration, the instrument was given to intact classes as a paper-and-pencil test at the beginning of the class. In one lab section, 10 students were asked to take the survey twice, before and after the lab experiment, to calculate test–retest reliability. Students were verbally instructed to answer with their feelings toward chemistry in general, rather than toward a specific teacher or course. A 10-choice scantron form (with response options from A–J or from 0–9) was used to collect the data. Because the instrument uses a 7-point Likert scale, the intended responses on the scantron form should range from 1 to 7. Any response with multiple answers, missing data, a missing ID, or a response beyond the intended range was excluded from all the analyses below. The pattern of missing data was checked to determine whether missing data might bias the findings, but no evidence of potential bias was found.
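As a concrete illustration of this screening step (not the authors' actual procedure; the file name, the id column, and the item column names are hypothetical), one might write:

```python
# Sketch of the data-screening rules described above, using pandas.
import pandas as pd

item_cols = [f"item{i}" for i in range(1, 9)]            # the 8 ASCI (V2) items
raw = pd.read_csv("asci_v2_scantron.csv")                 # hypothetical file name

# Drop rows with a missing ID or any missing item response.
screened = raw.dropna(subset=["id"] + item_cols)

# Keep only responses within the intended 1-7 range (the scantron allows 0-9 / A-J).
in_range = screened[item_cols].apply(lambda col: col.between(1, 7)).all(axis=1)
screened = screened.loc[in_range].copy()

# Reverse-code the negatively stated items (8 - x maps 1 <-> 7);
# items 1, 4, 5, and 7 of V2 are reverse-scored, per Table 2, note b.
reverse_items = ["item1", "item4", "item5", "item7"]
screened[reverse_items] = 8 - screened[reverse_items]

print(f"{len(raw)} responses collected, {len(screened)} retained for analysis")
```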


Data Analysis Strategy

To accomplish the research goals, data from each administration were analyzed for reliability and validity evidence. Various statistics were employed, including descriptive statistics, reliability, correlations among different concepts, multiple regression models to predict chemistry achievement, and factor analysis (exploratory and confirmatory). All descriptive statistics were computed in SPSS 17.0 for each item score after the negatively stated items were recoded. Mean scores are reported in both scales for better understanding and comparison. Original scores range from 1 to 7, while percentage scores (used by Bauer) range from 0 to 100 (1 = 0%, 4 = 50%, and 7 = 100%). Examination of skewness and kurtosis (both less than 2 in absolute value) revealed good normality of the item scores. Internal consistencies were calculated with Cronbach's α for each subscale. Test–retest reliability was obtained for ASCI (V1). Factor scores were created by adding the scores of all the items associated with the factor. Differences in factor scores of student groups were quantified using Cohen's d effect-size guidelines (d > 0.2, small; d > 0.5, medium; d > 0.8, large).51 Cohen's d reveals how many standard deviation units apart the two groups are on the factor score.
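For readers who wish to compute the same scale-level statistics on their own data, a minimal Python sketch of Cronbach's alpha, the 1–7 to percentage conversion, and Cohen's d is given below (the SPSS procedures actually used are not shown; the example at the end assumes the hypothetical screened DataFrame from the previous sketch):

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def to_percent(mean_score: float) -> float:
    """Convert a 1-7 scale mean to Bauer's 0-100 scale (1 -> 0%, 4 -> 50%, 7 -> 100%)."""
    return (mean_score - 1) / 6 * 100

def cohens_d(group1: np.ndarray, group2: np.ndarray) -> float:
    """Cohen's d with a pooled standard deviation; d > 0.2 small, > 0.5 medium, > 0.8 large."""
    n1, n2 = len(group1), len(group2)
    pooled_sd = np.sqrt(((n1 - 1) * np.var(group1, ddof=1) +
                         (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2))
    return (np.mean(group1) - np.mean(group2)) / pooled_sd

# Example (hypothetical column names): the V2 "intellectual accessibility" items
# intellectual = screened[["item1", "item2", "item3", "item6"]]
# print(cronbach_alpha(intellectual), to_percent(intellectual.mean(axis=1).mean()))
```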



Factor correlation values were measured in two ways. In the traditional way, factor correlations are based on the factor score as the sum of all grouping items. Because of the existence of measurement error, these correlations can be underestimated. Factor correlations obtained via confirmatory factor analysis are corrected for measurement error. Both values are reported to support discussion of how well students discriminate the different scales.

Exploratory factor analysis (EFA) was performed in SPSS 17.0 to explore the internal structure of the ASCI (V1) data. As the EFA reported in the literature was obtained via principal components analysis with varimax rotation, we repeated this procedure for comparison. To decide the number of factors to extract, the eigenvalue-greater-than-one rule was accompanied by more comprehensive approaches, such as viewing the scree plot, parallel analysis, and considering interpretability.

Confirmatory factor analysis (CFA) to estimate how well the model fits the data was performed in Mplus 5.2. It was run on a first-order model (a four-factor solution for ASCI V1 and a two-factor solution for ASCI V2), where the latent factors were set to correlate with each other. Using the variance–covariance matrix of measured items as indicated by Cudeck,52 a maximum-likelihood method of estimation was employed. All items were set to load on their assumed factors only. The model was identified by fixing the loading of the first item on each factor at 1. In general, models based on a large number of scores are likely to have an inflated χ2 value for the model fit, so it is a good idea to examine two additional fit statistics: if the comparative fit index (CFI) is greater than 0.95 (some may use CFI greater than 0.90; see ref 53) and the standardized root-mean-square residual (SRMR) is less than 0.08, the model can be considered a good fit to the data.54 These criteria were used consistently for the examination of model fit. For CFA, it is important to falsify plausible rival hypotheses, because untested models may fit the same data even better than the proposed model. When the two-factor model fit was supported by CFA for the ASCI (V2) data, we further tested the plausible rival hypothesis of a one-factor model as the most parsimonious solution.
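Parallel analysis, one of the factor-retention checks listed above, compares observed eigenvalues against eigenvalues obtained from random data of the same dimensions; a minimal NumPy sketch of the general technique (not the SPSS routine used in this study) follows:

```python
# Horn's parallel analysis: retain components whose observed eigenvalues exceed the
# chosen percentile of eigenvalues from random data with the same n and k.
import numpy as np

def parallel_analysis(data: np.ndarray, n_iter: int = 1000,
                      percentile: float = 95.0, seed: int = 0):
    n, k = data.shape
    rng = np.random.default_rng(seed)
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]   # descending
    random_eigs = np.empty((n_iter, k))
    for i in range(n_iter):
        sim = rng.standard_normal((n, k))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    n_factors = int(np.sum(observed > threshold))
    return n_factors, observed, threshold

# n_factors, obs, thr = parallel_analysis(item_matrix)  # item_matrix: respondents x items
```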


Scale Reconstruction Based on ASCI (V1)

The online Supporting Information describes our stepwise scale reconstruction process in more detail. In general, after comparison of a local administration of ASCI (V1) with literature results and the evaluation of data quality, EFA was used to examine the internal structure of the data, and both CFA and theory guided the construction of revised subscales with conceptual meaning. Administering the revised instrument formed the final phase of the current work.

Additional Validity Evidence for ASCI (V2) Scores

Nomological Network among Attitude, Ability, and Chemistry Achievement. A nomological network pertains to finding empirical evidence of "lawful" sets of relationships between the measured construct and other theoretical concepts or observable properties in order to make a case for the existence of the construct.55 A robust nomological network grounds a measured construct in a web of proposed relationships, the strengths and directions of which, as determined empirically, match theoretical predictions. Accordingly, correlational and experimental analyses are the most-used techniques to support nomological network building. Correlational analysis among attitude, ability, and achievement was performed in SAS 9.13. Data were obtained from the same population as the ASCI (V2) data. ASCI (V2) was used to measure students' chemistry attitudes in terms of emotion and cognition. Student demographic information and test scores representing students' ability were collected from the registrar's office. The ability measure consisted of the quantitative portion of the SAT (SATM) or of the ACT (ACTM). At the end of the semester, students' scores on the First-Term General Chemistry Blended Examination from the Examinations Institute of the ACS Division of Chemical Education56 (40 items) were obtained. Because of confidentiality requirements for using the ACS Exam, no test item can be shown here. The distribution for each variable was examined. Scatterplots of each pair of variables were also examined. The relationships among variables were summarized using correlation coefficients. A consistent effect-size guideline (0.10 small, 0.30 medium, 0.50 large)51 was used to interpret the magnitude of the coefficients.
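One way to carry out this correlational analysis and apply the effect-size guideline is sketched below (pandas rather than the SAS program actually used; the DataFrame df and its column names are hypothetical):

```python
import pandas as pd

def classify_r(r: float) -> str:
    """Cohen's guideline for correlation magnitude: 0.10 small, 0.30 medium, 0.50 large."""
    r = abs(r)
    if r >= 0.50:
        return "large"
    if r >= 0.30:
        return "medium"
    if r >= 0.10:
        return "small"
    return "negligible"

# df: one row per student, with attitude, ability, and achievement scores (hypothetical names).
vars_of_interest = ["intellectual_accessibility", "emotional_satisfaction", "satm", "actm", "acs"]
corr = df[vars_of_interest].corr(method="pearson")          # pairwise Pearson correlations
labels = corr.apply(lambda col: col.map(classify_r))         # magnitude label for each pair
print(corr.round(2), labels, sep="\n\n")
```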

Predictive Validity. Multiple regression analysis was performed in SAS 9.13 to predict student achievement in general chemistry (ACS exam scores) from ASCI (V2) and SATM scores. Three different regression models were tested. Data were checked for possible violations of assumptions, including normality, linearity, and homoscedasticity, which may greatly affect the results when violated.57 Influential outliers were examined using studentized residuals and Cook's D. The prediction equation and semipartial correlations are reported for the best and most parsimonious model.
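A rough equivalent of the regression and influence diagnostics described above (statsmodels rather than SAS; df and the column names are again hypothetical) might look like this:

```python
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import OLSInfluence

predictors = ["emotional_satisfaction", "intellectual_accessibility", "satm"]
model_data = df[predictors + ["acs"]].dropna()               # listwise deletion for the model

X = sm.add_constant(model_data[predictors])
ols = sm.OLS(model_data["acs"], X).fit()

# Influence diagnostics: externally studentized residuals and Cook's distance.
influence = OLSInfluence(ols)
max_student_resid = abs(influence.resid_studentized_external).max()
max_cooks_d = influence.cooks_distance[0].max()              # cooks_distance returns (D, p-values)

print(ols.summary())
print(f"max |studentized residual| = {max_student_resid:.2f}, max Cook's D = {max_cooks_d:.3f}")
```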

RESULTS AND DISCUSSION

Quality Evaluation for ASCI (V1)

There were 405 complete responses for the ASCI (V1) administered in the general chemistry I lab and 509 responses for the general chemistry II lab. The online Supporting Information provides a detailed discussion of the quality evaluation of these scores. In general, the instrument performed very similarly in our setting as compared to the original setting. First, we examined the reliability of the subscales comprising more than one item: "interest and utility", "anxiety", "intellectual accessibility", and "emotional satisfaction". Cronbach's α was 0.82, 0.71, 0.79, and 0.74 for the general chemistry I laboratories for each subscale, respectively, and 0.85, 0.79, 0.82, and 0.78 for the general chemistry II laboratories; all values are above the satisfactory level (0.70) and comparable to the literature results. The test–retest (before and after the lab) reliability was above 0.9 and better than the literature report. From these analyses, our instrument data have very good reliability. Regarding factor analysis, because the original EFA did not cleanly resolve into factors, we were not surprised to see similar results with our own data. When we turned to CFA as a tool for determining whether the proposed factor structure was tenable, the fit statistics did not meet the accepted criteria. After this finding, we moved through a process of scale reconstruction and refinement (fully described in the online Supporting Information) in order to propose a new, shorter version of the ASCI instrument, V2. The short version retains eight items in two subscales, "intellectual accessibility" (items 1, 4, 5, and 10 from V1) and "emotional satisfaction" (items 7, 11, 14, and 17 from V1).




Quality Evaluation for ASCI (V2)

To accomplish the second research goal, ASCI (V2) was given at the beginning of all the discussion sections for general chemistry I at the southeastern public research university during the 11th week of the Spring 2009 semester.


Table 2. Descriptive Statistics for ASCI (V2) in Discussion Sections, with Lab I Results(a)

Item Number (V1) | Item Number (V2) | Items | Lab I (N = 405) Mean | Lab I SD | Discussion Section (N = 354) Mean | Discussion Section SD
1 | 1(b) | hard–easy | 2.60 | 1.27 | 2.81 | 1.28
4 | 2 | complicated–simple | 2.80 | 1.52 | 2.95 | 1.43
5 | 3 | confusing–clear | 3.22 | 1.50 | 3.36 | 1.40
14 | 4(b) | uncomfortable–comfortable | 3.76 | 1.39 | 3.64 | 1.36
7 | 5(b) | frustrating–satisfying | 3.26 | 1.71 | 3.24 | 1.70
10 | 6 | challenging–unchallenging | 2.31 | 1.35 | 2.50 | 1.50
11 | 7(b) | unpleasant–pleasant | 3.63 | 1.51 | 3.38 | 1.41
17 | 8 | chaotic–organized | 4.03 | 1.62 | 4.26 | 1.66

(a) Each score should range from 1 to 7; 4 is the midpoint. Higher scores mean students feel chemistry is intellectually accessible and emotionally satisfying. Item 8 has the highest mean, 4.26, and item 6 has the lowest mean, 2.50, which indicates students feel chemistry is organized, and is challenging. (b) Item score is reversed before averaging; items are shown reversed for ease of interpretation.

From 375 sets of data returned, 354 responses with no missing data were used for data analysis. Average scores for each item from those 354 respondents ranged from 2.50 to 4.26, with standard deviations from 1.28 to 1.70 (Table 2). No item was found to have skewness or kurtosis greater than 0.7, which suggests good normality of the scores. The mean scores are comparable to those for a similar sample of students who took ASCI (V1) in the general chemistry I laboratory (Table 2).

The internal consistency reliability was measured for each subscale. Cronbach's α is 0.82 and 0.79 for the "intellectual accessibility" and "emotional satisfaction" subscales, respectively. Both values are above the rule-of-thumb satisfactory level of 0.7, and comparable to ASCI (V1) and the literature results. Factor scores, reported here on the original 1–7 item metric, were 2.91 (31.8% if converted to a percentage) for intellectual accessibility and 3.63 (43.8%) for emotional satisfaction, with a correlation of 0.64 between the two factors. The factor scores and factor correlation for this administration of ASCI (V2) are very similar to those calculated from the ASCI (V1) data for a similar sample of students, which indicates these eight items function similarly even when the other 12 items are removed from the instrument.

CFA was performed to estimate goodness of fit for the two-factor model. Items 1, 2, 3, and 6 were set to load on the factor "intellectual accessibility" only; items 4, 5, 7, and 8 were set to load on the factor "emotional satisfaction" only; and the two factors were allowed to correlate. To investigate a more parsimonious model, the alternative one-factor solution was also sought. Figure 1 presents the standardized parameter estimates for both models. The estimated fit of the two-factor model is: χ2 (N = 354, df = 19, p = 0.00) = 77.0; CFI = 0.95; and SRMR = 0.042. The two-factor model fits the data reasonably well according to the accepted criteria. The estimated fit of the one-factor model is: χ2 (N = 354, df = 20, p = 0.00) = 156.1; CFI = 0.89; SRMR = 0.056. The fit statistics for the one-factor model are uniformly worse than those for the two-factor model, and its CFI does not meet the accepted criteria. Therefore, the two-factor model is more tenable. The two aspects of attitude measured by ASCI (V2) are expected to be related, though not redundant, which is supported by the two-factor CFA: the correlation between the "intellectual accessibility" and "emotional satisfaction" constructs is 0.80.
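To make the decision rule explicit, the short sketch below applies the fit criteria stated in the Method section to the two sets of fit statistics just reported; treating CFI = 0.95 as meeting the cutoff is our reading of the criterion, consistent with the conclusion drawn in the text:

```python
# Apply the CFI/SRMR criteria to the reported two-factor and one-factor solutions.
def acceptable_fit(cfi: float, srmr: float,
                   cfi_cutoff: float = 0.95, srmr_cutoff: float = 0.08) -> bool:
    return cfi >= cfi_cutoff and srmr <= srmr_cutoff

models = {
    "two-factor": {"chi2": 77.0, "df": 19, "cfi": 0.95, "srmr": 0.042},
    "one-factor": {"chi2": 156.1, "df": 20, "cfi": 0.89, "srmr": 0.056},
}

for name, fit in models.items():
    verdict = "acceptable" if acceptable_fit(fit["cfi"], fit["srmr"]) else "not acceptable"
    print(f"{name}: chi2({fit['df']}) = {fit['chi2']}, "
          f"CFI = {fit['cfi']}, SRMR = {fit['srmr']} -> {verdict}")
```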

Nomological Network among Attitude, Ability, and Chemistry Achievement

There were 456 students enrolled in the discussion sections of the general chemistry I course; 383 had SATM scores, 292 had ACTM scores, and 354 had ASCI (V2) scores. The entire population was included in the correlation analysis. The relationships among these variables were summarized with correlation coefficients, as presented in Table 3, which in essence allows examination of a small nomological network. All relationships were positive, as anticipated, and significant at the p < 0.05 level. The correlation between ACS exam score and each of the other four variables is classified as moderate. The attitude–achievement correlations (0.30 and 0.34) are consistent with extensive studies58−60 and not as strong as the ability–achievement correlations (0.45 and 0.46). College admission decisions in the United States often consider test scores such as SATM as an indication of students' ability, owing to the high associations of these scores with students' performance in college science courses. The SATM score has been used to predict first-year chemistry achievement.61 Because attitude toward chemistry and mathematical ability are conceptually distinct, it is reassuring that low correlations between constructs related to those different concepts are observed (0.19, 0.14, 0.13, and 0.15). Because two components of the same concept, attitude (i.e., cognition and emotion), should be related but not identical, the high correlation between these two variables from ASCI (V2) (coefficient: 0.64) is also reassuring. Finally, as expected, SATM and ACTM, both intended to measure mathematics ability, show the highest correlation of all (coefficient: 0.74). In sum, all five variables correlate with each other as expected, and thus provide evidence for a small yet reasonable nomological network.

Predictive Validity for Chemistry Achievement

Multiple regression analysis was performed to predict the students' achievement in general chemistry (as captured by ACS exam scores) from ASCI (V2) and SATM scores. We expected that ASCI (V2) scores should account for a different portion of the variance in chemistry achievement than SATM scores. Three predictors were entered into an initial regression model: emotional satisfaction (ASCI V2), intellectual accessibility (ASCI V2), and mathematics ability (SATM). The maximum values of the studentized residuals and Cook's D were 3.3 and 0.048, respectively. These small values led us to believe no cases were exerting undue influence on the regression analysis. An examination of a scatterplot of the residuals against the predicted values revealed no violations of the linearity or homoscedasticity assumptions, and the distribution of the residuals was found to be approximately normal (skewness = 0.098, kurtosis = 0.025).


Figure 1. Standardized parameter estimates for one-factor and two-factor models (N = 354). The large ovals designate the latent variables, the small circles report the residual variances, and the rectangles indicate the observed variables. Items were set to load on their assigned factors only. All factor loadings are significantly different from 0 (p < 0.05).

Table 3. Comparative Correlations among Attitude, Ability, and Achievement Data(a)

Correlation Values (Number of Students)
Attitude Variable | Emotional Satisfaction (354) | Ability(b): SATM (383) | Ability(b): ACTM (292) | Achievement(c): ACS (456)
Intellectual Accessibility (ASCI V2) | 0.64 (354) | 0.19 (297) | 0.14 (223) | 0.30 (354)
Emotional Satisfaction (ASCI V2) | — | 0.13 (297) | 0.15 (223) | 0.34 (354)
SATM | — | — | 0.74 (261) | 0.45 (383)
ACTM | — | — | — | 0.46 (292)

(a) Total N = 456. For all of these Pearson correlations, p < 0.05. (b) Ability measured on standardized tests (either SAT or ACT); these columns show the low correlation between attitude toward chemistry and mathematical ability. (c) This column contains the correlation coefficient values pertaining to ACS exam scores (achievement).

On the basis of our screening of the data, it appeared appropriate to proceed with the analysis of results. The obtained R2 value was 0.286, suggesting that about 29% of the variance in ACS Exam score was accounted for by the three predictors. From the squared semipartial correlations of the predictors, SATM uniquely accounted for 17.7% of the variability in ACS score, emotional satisfaction uniquely accounted for 2.4%, and intellectual accessibility uniquely accounted for 0.3%. The regression coefficient for intellectual accessibility was found not to be significantly different from zero.

Although intellectual accessibility and SATM both correlate moderately with ACS score, SATM partially overlaps with intellectual accessibility. This is a reasonable result, because the intellectual accessibility of chemistry as perceived by general chemistry students is expected to be highly influenced by mathematical ability. Therefore, a more parsimonious second model was run with only emotional satisfaction and SATM as predictors. For the two-predictor model, the obtained R2 value was 0.283, suggesting that about 28% of the variance in ACS Exam score was accounted for by the two predictors. The adjusted R2 was 0.278.
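The squared semipartial correlations reported here can be understood as the drop in R2 when a single predictor is removed from the model. A brief sketch of that computation, under the same hypothetical column names as the earlier regression sketch (statsmodels, not the SAS procedure used in the study):

```python
import statsmodels.api as sm

# model_data: DataFrame with columns "acs", "emotional_satisfaction", "satm"
# (hypothetical, as in the earlier regression sketch).
def r_squared(data, outcome, predictors):
    X = sm.add_constant(data[predictors])
    return sm.OLS(data[outcome], X).fit().rsquared

full = ["emotional_satisfaction", "satm"]
r2_full = r_squared(model_data, "acs", full)

for p in full:
    reduced = [q for q in full if q != p]
    sr2 = r2_full - r_squared(model_data, "acs", reduced)   # unique variance for predictor p
    print(f"{p}: squared semipartial correlation = {sr2:.3f}")
```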


It appears that the model accounted for an acceptable proportion of the variability in ACS Exam scores. Cohen's effect size, f2 = R2/(1 − R2), was computed to be 0.40, which can be interpreted as a large effect. The root-mean-square error was 4.95, which indicated that predictions of ACS Exam scores from this model tend to be off by about 5 points and, like all regression results, cannot be regarded as exact predictions. The obtained prediction equation was:

Predicted ACS Score = −2.29 + 1.24 × (Emotional Satisfaction) + 0.038 × (SATM)    (1)

The regression coefficients are statistically significant. As the scale for each variable is different, the regression coefficients should be standardized for better comparison of their contributions to the prediction of ACS Exam score. Values of 0.25 and 0.44 were obtained for emotional satisfaction and SATM, respectively. The intercept with mean-centered predictors is 23.5, so the predicted ACS score for a student with an average emotional satisfaction score and an average SATM score is 23.5. SATM scores uniquely account for 18.8% of the variability in ACS Exam scores, while emotional satisfaction uniquely accounts for 6.16%. A two-predictor model using intellectual accessibility and SATM was also examined. SATM uniquely accounted for 18.0% of the variability in ACS score, while intellectual accessibility uniquely accounted for 4.05%. The obtained R2 value was 0.262, suggesting that about 26% of the variance in ACS Exam score was accounted for by these two predictors, which is lower than for the two-predictor model with emotional satisfaction and SATM. Comparisons of the three regression models suggest that a two-predictor model based on emotional satisfaction and SATM represents the best model to predict ACS Exam score for these data. In terms of predictive validity, the emotional satisfaction score from ASCI (V2) can be used to improve predictions of chemistry achievement beyond those that would result from the use of a predictor of mathematical ability alone, which supports the idea that emotional satisfaction is a distinct measured construct.
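As an illustration of how eq 1 and the coefficient standardization described above would be applied, consider the sketch below (the emotional satisfaction input is assumed to be a factor score on the 1–7 item metric, which is consistent with the reported intercept, and the standard deviations needed for standardization are placeholders):

```python
def predict_acs(emotional_satisfaction: float, satm: float) -> float:
    """Eq 1: predicted ACS exam score from emotional satisfaction (1-7 scale) and SAT math."""
    return -2.29 + 1.24 * emotional_satisfaction + 0.038 * satm

def standardized_beta(b: float, sd_x: float, sd_y: float) -> float:
    """Standardize a raw coefficient: beta = b * SD(x) / SD(y)."""
    return b * sd_x / sd_y

# Hypothetical student: emotional satisfaction 4.0, SAT math 600 -> about 25.5 points.
print(predict_acs(emotional_satisfaction=4.0, satm=600))
# standardized_beta(1.24, sd_x=sd_emotional, sd_y=sd_acs) would give ~0.25 per the text.
```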

SUMMARY AND CONCLUDING DISCUSSION

Starting with the original ASCI (V1) developed by Bauer, we designed and implemented a process of scale development to refine the instrument for better construct validity. Better construct validity results in greater ease of interpretation, which we demonstrated via an examination of a nomological network and predictive validity for ASCI (V2). The research goals were successfully achieved, as evident in the preceding analysis. Research goal 1 was to evaluate data quality for ASCI (V1) for our sample and compare our data structure with the literature. The results showed a data pattern similar to the literature, and the construct validity of ASCI (V1) scores was not supported by confirmatory factor analysis. Research goal 2 was to evaluate data quality for ASCI (V2). ASCI (V2) was designed to measure two subscales of attitude: intellectual accessibility and emotional satisfaction. These two subscales are indicators of cognition and of affect, respectively, and thus make a good connection with the conceptual framework for attitude from psychology. Evidence for construct validity was obtained. The two-factor correlated structure was upheld by confirmatory factor analysis. Regarding correlations among attitude, achievement, and ability, attitude correlates with SATM, ACTM, and ACS scores as expected, which provides evidence for a nomological network. Finally, once mathematics ability is taken into account, a component of attitude measured by ASCI (V2) plays a significant, unique part in predicting final achievement in a general chemistry I course. Although we do not expect our small study to end the controversy regarding the connection between attitude and achievement, we look forward to others' use of ASCI (V2) for continued investigations of this open question.

To our knowledge, this work represents the first time in the area of chemical education that a benchmark for scale development was successfully implemented to reconstruct an existing instrument. Starting from the 20-item ASCI, we deleted a redundant scale, reconstructed meaningful subscales, and estimated the data fit to theoretical models. Several methodological issues were clarified here. For example, confirmatory factor analysis was used as the most rigorous test of whether the internal data structure matches the conceptual framework. Additionally, factor scores were suggested for use when there is more than one factor among the measured data set, with only those items loading on the same construct used to determine the factor score. In general, we have tried to showcase ways in which evidence for reliability and validity of instrument scores can be provided so readers can interpret and apply research findings with a certain degree of confidence.

This study contributes to chemical education research by providing validity evidence for a revised instrument, ASCI (V2), for college chemistry students. In addition to showcasing a method for examining validity, a second important result of this work is to recommend refinements that lead to greater ease of administration while improving validity: this eight-item instrument now takes very little time to administer. Because of its convenience and high psychometric quality, chemistry educators who are interested in determining students' attitudes toward chemistry can easily find time to administer ASCI (V2), even to a large class, and can expect the results to be interpretable.

We envision that a significant potential use of ASCI (V2) will be to identify the effects of curricular reforms, via experimental or quasi-experimental designs. Using ASCI (V2) as a pre- and posttest for treatment and comparison groups would allow the effect of course experience on the attitude change of each student to be tracked. In addition, this type of study could provide further evidence for validity that we were not able to provide here, by exploring attitude change directly with the same sample of students on the same measure, and by exploring whether the two factors relate to curricular intentions in an expected way. It is not the case that ASCI (V2) captures all potential aspects of attitude, nor that a single measure is desirable. Exploration of attitude theory and past research practice involves both a variety of quantitative assessments (the online Supporting Information presents only a small subset) and a variety of qualitative investigations. Understanding how ASCI (V2) fits into this complex set of possibilities can only be accomplished by further examining its relationships with other measures of attitude, achievement, and additional constructs such as learning strategies, motivation, and behavior.

Ultimately, as chemical educators are increasingly asked to conduct assessment within our classrooms, we have a great need for easy-to-use instruments that yield valid, reliable, and readily interpretable scores for constructs of interest. We hope that, in the classes of our collective community, students' attitudes toward chemistry are becoming more rather than less positive, and also that ASCI (V2) will help us all to investigate the truth of that proposition.


ASSOCIATED CONTENT

Supporting Information. Discussion of our review of instruments; details (descriptive statistics, exploratory factor analysis, correlation analysis) regarding our quality evaluation of the ASCI (V1) scores; a full description of our stepwise scale reconstruction process, including the final ASCI (V2) instrument; and the data screening process for the instruments included in the nomological network. This material is available via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected].

ACKNOWLEDGMENT

This material is based upon work supported by the National Science Foundation under Grant No. DUE-0817409. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

REFERENCES

(1) Rubba, P. A.; Anderson, H. O. Sci. Educ. 1978, 62, 449–458.
(2) Project 2061—Science for All Americans; American Association for the Advancement of Science: Washington, DC, 1989.
(3) Gouveia-Oliveira, A.; Rodrigues, T.; Demelo, F. G. Med. Educ. 1994, 28, 501–507.
(4) Hren, D.; Lukic, I. K.; Marusic, A.; Vodopivec, I.; Vujaklija, A.; Hrabak, M.; Marusic, M. Med. Educ. 2004, 38, 81–86.
(5) Parke, H. M.; Coble, C. R. J. Res. Sci. Teach. 1997, 34, 773–789.
(6) Dahle, L. O.; Forsberg, P.; Svanberg-Hard, H.; Wyon, Y.; Hammar, M. Med. Educ. 1997, 31, 416–424.
(7) Custers, E.; Ten Cate, O. T. J. Med. Educ. 2002, 36, 1142–1150.
(8) Anthony, S.; Mernitz, H.; Spencer, B.; Gutwill, J.; Kegley, S.; Molinaro, M. J. Chem. Educ. 1998, 75, 322–324.
(9) Ault, C. R. J. Res. Sci. Teach. 1998, 35, 189–212.
(10) Freedman, M. P. J. Res. Sci. Teach. 1997, 34, 343–357.
(11) Laws, P. W.; Rosborough, P. J.; Poodry, F. J. Am. J. Phys. 1999, 67, S32–S37.
(12) Ozmen, H. Comput. Educ. 2008, 51, 423–438.
(13) Paris, S. G.; Yambor, K. M.; Packard, B. W. L. Elem. Sch. J. 1998, 98, 267–288.
(14) Romance, N. R.; Vitale, M. R. J. Res. Sci. Teach. 1992, 29, 545–554.
(15) Shymansky, J. A.; Yore, L. D.; Anderson, J. O. J. Res. Sci. Teach. 2004, 41, 771–790.
(16) Steele, D. J.; Medder, J. D.; Turner, P. Med. Educ. 2000, 34, 23–29.
(17) Downey, D. B.; Ainsworth, J. W.; Qian, Z. C. Sociol. Educ. 2009, 82, 1–19.
(18) Mattern, N.; Schau, C. J. Res. Sci. Teach. 2002, 39, 324–340.
(19) Mickelson, R. A. Sociol. Educ. 1990, 63, 44–61.
(20) Turner, J. C.; Midgley, C.; Meyer, D. K.; Gheen, M.; Anderman, E. M.; Kang, Y.; Patrick, H. J. Educ. Psychol. 2002, 94, 88–106.
(21) Usak, M.; Erdogan, M.; Prokop, P.; Ozel, M. Biochem. Mol. Biol. Educ. 2009, 37, 123–130.
(22) Weinburgh, M. J. Res. Sci. Teach. 1995, 32, 387–398.
(23) Zanna, M. P.; Rempel, J. K. In The Social Psychology of Knowledge; Bar-Tal, D., Kruglanski, A. W., Eds.; Cambridge University Press: New York, 1988; pp 315–334.

(24) Allport, G. W. In Handbook of Social Psychology; Murchison, C. M., Ed.; Oxford University Press: London, 1935; pp 798–844.
(25) Rosenberg, M. J.; Hovland, C. I. In Attitude Organization and Change: An Analysis of Consistency among Attitude Components; Hovland, C. I., Rosenberg, M. J., Eds.; Yale University Press: New Haven, CT, 1960; p 3.
(26) Eagly, A. H.; Chaiken, S. The Psychology of Attitudes; Harcourt Brace Jovanovich: Fort Worth, TX, 1993.
(27) Eagly, A. H.; Chaiken, S. Soc. Cognit. 2007, 25, 582–602.
(28) Bagozzi, R. P.; Burnkrant, R. E. J. Pers. Soc. Psychol. 1979, 37, 913–929.
(29) Breckler, S. J. J. Pers. Soc. Psychol. 1984, 47, 1191–1205.
(30) Park, R. E.; Burgess, E. W. Introduction to the Science of Sociology; University of Chicago Press: Chicago, 1924.
(31) Bandura, A. Social Learning Theory; Prentice-Hall: Oxford, England, 1977.
(32) Bem, D. J.; McConnell, H. K. J. Pers. Soc. Psychol. 1970, 14, 23–31.
(33) Corno, L.; Mandinach, E. B. Educ. Psychol. 1983, 18, 88–108.
(34) Festinger, L. A Theory of Cognitive Dissonance; Stanford University Press: Stanford, CA, 1957.
(35) Fishbein, M.; Ajzen, I. Psychol. Rev. 1974, 81, 59–74.
(36) Doran, R.; Lawrenz, F.; Helgeson, S. In Handbook of Research on Science Teaching and Learning; Gabel, D., Ed.; Macmillan: New York, 1994; pp 388–442.
(37) Gardner, P. L. Stud. Sci. Educ. 1975, 2, 1–41.
(38) Koballa, T. R., Jr.; Crawley, F. E. Sch. Sci. Math. 1985, 85, 222–232.
(39) Ajzen, I. Organ. Behav. Hum. Decis. Process. 1991, 50, 179–211.
(40) Barbera, J.; Adams, W. K.; Wieman, C. E.; Perkins, K. K. J. Chem. Educ. 2008, 85, 1435–1439.
(41) Bauer, C. J. Chem. Educ. 2005, 82, 1864–1870.
(42) Bauer, C. J. Chem. Educ. 2008, 85, 1440–1445.
(43) Chatterjee, S.; Williamson, V. M.; McCann, K.; Peck, M. L. J. Chem. Educ. 2009, 86, 1427–1432.
(44) Lacosta-Gabari, I.; Fernandez-Manzanal, R.; Sanchez-Gonzalez, D. J. Chem. Educ. 2009, 86, 1099–1103.
(45) Leavers, D. R. J. Chem. Educ. 1975, 52, 804.
(46) Lewis, S. E.; Shaw, J. L.; Heitz, J. O.; Webster, G. H. J. Chem. Educ. 2009, 86, 744–749.
(47) Walczak, M. M.; Walczak, D. E. J. Chem. Educ. 2009, 86, 985–991.
(48) Grove, N.; Bretz, S. L. J. Chem. Educ. 2007, 84, 1524–1529.
(49) Cooper, M. M.; Sandi-Urena, S. J. Chem. Educ. 2009, 86, 240–245.
(50) Blalock, C. L.; Lichtenstein, M. J.; Owen, S.; Pruski, L.; Marshall, C.; Toepperwein, M. A. Int. J. Sci. Educ. 2008, 30, 961–977.
(51) Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed.; Erlbaum: Hillsdale, NJ, 1988.
(52) Cudeck, R. Psychol. Bull. 1989, 105, 317–327.
(53) Cheng, S. T.; Chan, A. C. M. Educ. Psychol. Meas. 2003, 63, 1060–1070.
(54) Hu, L.-T.; Bentler, P. M. Struct. Equation Model. 1999, 6, 1–55.
(55) Cronbach, L. J.; Meehl, P. E. Psychol. Bull. 1955, 52, 281–302.
(56) ACS Examinations Institute, ACS Division of Chemical Education. http://chemexams.chem.iastate.edu/ (accessed Feb 2011). (The test administered dated from 2005; at that time the ACS Examinations Institute was headquartered at the University of Wisconsin–Milwaukee.)
(57) Osborne, J. W.; Waters, E. Pract. Assess. Res. Eval. 2002, 8.
(58) Germann, P. J. J. Res. Sci. Teach. 1988, 25, 689–703.
(59) Reynolds, A. J.; Walberg, H. J. J. Educ. Psychol. 1992, 84, 371–382.
(60) Salta, K.; Tzougraki, C. Sci. Educ. 2004, 88, 535–547.
(61) Lewis, S. E.; Lewis, J. E. Chem. Educ. Res. Pract. 2007, 8, 32–51.
(62) Munby, H. An Investigation into the Measurement of Attitudes in Science Education; SMEAC Information Reference Center: Columbus, OH, 1983.

