In the Classroom

View from My Classroom

edited by David L. Byrum, Flowing Wells High School, Tucson, AZ 85716

Toward More Performance Evaluation in Chemistry

Sharon L. Rasp
Kirkwood High School, Kirkwood, MO 63122

“Pig!” he screamed through the car window as he honked the horn and swerved madly from side to side on the road. As she started into the upcoming curve, the astounded woman thought, “That man is calling me a pig, and he’s driving like one!” Rounding the curve, she ran right into a pig in the middle of the road.

I still remember this anecdote, told by a priest years ago. He used it as an analogy to illustrate that things are not always what they seem. He took the familiar (who has not been unjustly accused of being a road hog at least once?) and made it strange. My journey from the familiar and “normal” manner of giving tests to the strange way I do it now has been long and difficult; however, the rewards and discoveries along the path have made it worthwhile.

Testing: A Brief History and the Confusing Vocabulary

The traditional view of testing involves a paper-and-pencil instrument with mostly multiple-choice, fill-in-the-blank, and short-answer questions—and this is exactly what my old tests were like. Assessment in education has changed dramatically over the last ten years (1). Accompanying this change is the growing diversity of the United States citizenry and therefore of student populations. Cultural differences in concept attainment and learning have been recognized, with particular emphasis on science education, by Erickson (2). Erickson contends that the culture and language a student is exposed to directly affect how that student will learn. Furthermore, his study found that even students considered “illiterate” were capable of quite abstract thinking—the type of thinking necessary in studying science. The move toward learning outcomes beyond the knowledge–comprehension level makes multiple-choice methods inadequate for judging performance and application of science concepts (3). Too much emphasis is also being placed on the cognitive domain—evaluation also needs to address skill development and coordination, as well as the affective domain (4). A move toward including performance evaluation may in part respond to these concerns.

In the fall of 1991, I attended the Fall Forum of the Coalition of Essential Schools (founded by Ted Sizer) in Chicago. Quite a few suggestions were given for checklists and portfolios to evaluate student performance. I believed these would be impractical for grading a large number of students, but the idea of performance evaluation took firm root. My teaching assignment at the time was a non-college-bound, consumer-level chemistry course. I resolved that, no matter what, my tests were going to involve the students actually doing chemistry. Why should students spend two to three days each week in laboratory situations, only to take the old format of tests? Admittedly, lab practicals have been around for years, especially in the life sciences, but they have taken a back seat to pure paper-and-pencil tests.

Before going any further, let me clarify what I mean by the different terms used to mean “test”. Evaluation, testing, and assessment have been used interchangeably, and while arguably minute differences do exist, for the purposes of this article their meanings are the same. What I do wish to clarify is the difference between “performance” and “authentic” assessment/evaluation/testing. Performance assessment involves the teacher evaluating what a student actually does in an assessment situation, whereas authentic assessment addresses more true-to-life situations that a student might encounter in the real world. An example of performance assessment would be observing students lighting a Bunsen burner. An authentic assessment would be asking students to analyze food package labels and determine which product is more healthful based on nutritional content.

Sharon Rasp has taught science for 15 years: 5 years at Shorewood High School in the Shoreline School District, Seattle, Washington, and 10 years at Kirkwood High School in Kirkwood, Missouri. Courses she has taught include biology, chemistry, physics, and physical/life science. She graduated from Washington State University in 1980 with a BS in biology and a minor in physical science. Her graduate studies included coursework in physical chemistry, biochemistry, and veterinary microbiology/pathology. She obtained a Master of Education in curriculum and instruction from the University of Missouri at St. Louis, and she is currently in the doctoral program in educational research, evaluation, and measurement.

In 1985, Rasp was nominated for Outstanding Science Teacher of the Year in Washington State, and she was also selected to participate in the NSTA Teacher Honors Workshop at AT&T Bell Labs in Murray Hill, New Jersey. In 1992 she participated in the Earth and Atmospheric Science Institute at the University of Missouri at Rolla and received a Citicorp Success Fund Innovator Award for her foods unit, “You Are What You Eat”. After serving 5 years as the Kirkwood High School science department head, Rasp returned this year to her first love: teaching full time in the science classroom. When not teaching or studying, she enjoys playing tennis and spending time with her two cats, Malcolm and Martin.


Mechanics of Change

In moving toward more performance evaluation, I started very small. For my first exam I had the students do the single task of massing out specified amounts of salt. This task was in addition to the regular multiple-choice, fill-in, and short-answer sections of the test. I was amazed at the results—very few students were correct on the first try. They kept including the mass of the plastic baggy in their calculations! Also, there were only four balances available for students to use at one time. I had to set them up around the room and allow the students to walk around during the test—certainly not a sterile, quiet testing atmosphere. Beyond these problems was the time factor. I had to continue the test into the next day, and of course this meant students could go back, study, and change their answers—completely against what I had always thought a test should be.

Another reason I became so committed to continuing performance assessment was how my students reacted to the set of laboratory activities that followed their first performance assessment experience. Students asked me about the specific tasks they would be responsible for on the next test. Furthermore, students demanded that their lab partners let them practice the indicated tasks—“Let me do it, so I know how for the test!” I became an addict—every test had to have lab portions, and they grew into three- and four-day events. When I started teaching the college-bound chemistry course at our school, I reverted to the old testing method, only to find myself dying to test the new way. So I did—again, with a positive response from my students.

Performance Evaluation in My Classroom Today

The forensic unit of our school’s Practical Chemistry course illustrates what my performance evaluation looks like now. During this unit the students study document analysis (pen chromatography, paper analysis, secret messages), molds and casts, fingerprinting, and types of evidence. There is still a multiple-choice section of the test (computer-scanned) covering basic content. In addition, the students are expected to analyze paper samples, drawing a data table and making conclusions from their observations. They are given four secret messages to analyze and must determine how each message was written and what is necessary to “develop” it. For example, a hidden message written with a cotton-tipped swab dipped in sodium hydroxide can be “developed” with phenolphthalein, which turns pink in the basic solution.

Students are given a plaster cast of a tennis-shoe print and are asked to match the cast with its source. This section and its results are an excellent illustration of the value of performance evaluation. In doing the laboratory exercise during the unit, the students all seemed to understand the concept of molds and casts—that the mold is essentially the “reverse”, or mirror image, and the cast is a replica of the shoe. I discovered with the performance test that students would still match the casts to incorrect mates. When I pointed out their error, I was almost always told, “Oh, I get it now—I guess I didn’t really have it the first time.” The remainder of the test includes a fill-in section, where students identify fingerprint types, and a short-answer portion on organic chemistry nomenclature.

It is obviously not possible for the students to finish the test in one day. They are given a test folder containing all parts of the test and are permitted to work on the parts in whatever order they wish, the only limitation being the number of lab stations available for lab work (I usually have two to six available—another selling point for using microscale whenever possible). I have found that giving students the power to choose the order of test parts allows them some sense of comfort and control. The lab portions are still evaluated on paper, but the student must perform the lab to obtain results. Ideally, performance assessment is done with the teacher marking a checklist while observing the student during the procedure. In reality, there is not enough time, and therefore I grade the written response as an evaluation of the student’s lab technique. This is not as accurate as direct observation, but it is arguably more informative than a multiple-choice evaluation.

Testing Philosophy and the Impact of Change

Changing how I test has altered my philosophy radically. Because students have access to all test parts, as time went on I found myself encouraging them to look over their notes, text, worksheets, and lab write-ups from the unit as homework between test days. I give them a review sheet outlining multiple-choice, fill-in, and short-answer item content, as well as a list of lab procedures for which they will be responsible. Because of the large amount of science information, I do allow students to use notes. They may write only on a single sheet of notebook paper, in their own handwriting (computer files are too easily shared). This forces them to organize their studying, and the privilege is earned by turning in all assignments (with this incentive, I have fewer missing assignments). Use of notes is admittedly controversial to traditional test givers—after all, students do not learn the material as well as when they have to memorize it. I would argue that my students have better things to do with their time and available brain space than memorize definitions, formulas, or lists. Furthermore, because I let students use notes and study overnight during a multi-day test, very specific information and ideas are fair game on tests. Students who work hard and study are thus rewarded.

You may be asking why I bother to include the multiple-choice section. Two reasons: (i) different students learn differently, so our testing should address a variety of learning styles; and (ii) whether we like it or not, multiple-choice tests are still a fact of life for our students (SATs and driver’s tests), and therefore they need the practice.

There is one more item that many traditional teachers find unacceptable: my students are encouraged to retake any part of the test (with a small point deduction for secret messages and other items that take a significant amount of teacher preparation, discouraging an all-out “retake assault”). This allows students who finish quickly to go back over portions with which they had difficulty. Students who need more than the allotted class time are given time before and after school to finish or retake parts. This does mean that alternate versions are needed for different parts, but with word processing this is now usually a minor obstacle. If our goal in education is for students to learn, does it really matter that some students take more time to perform acceptably on a test?


How Performance Assessment Has Affected Student Grades

Test data collected over five years have yielded noteworthy, although not statistically significant, results. In general, Caucasian students’ scores have been lower on lab performance exams than on multiple-choice tests, whereas African American students’ scores tend to be higher on lab performance exams. These are the only results examined thus far. There may be many reasons for them, but I believe the most important point to be gleaned from the data is the need for a variety of ways to assess student learning.

Suggestions for Those Choosing To Boldly Go

Anyone who has taught with me will tell you that I am a very random person, and so this variety of testing actually thrills me, but it may seem like a nightmare to others. Because testing takes several days, I test this way only four or five times a year, with quizzes in between. I am in no way suggesting that you do it “just like me”. Figure out what your philosophy is, and start small. Perhaps it is easier for you to assess individual students along the way. I do this with written lab reports, but I do not think they tell me how well each student understands the concepts or applies knowledge.

One thing to try is having students demonstrate the correct use of a balance by measuring out a given amount of salt. I use zipper-type plastic bags, which are fairly consistent in mass. The student returns the bag with the measured amount, and the instructor checks the mass, taking into account the mass of the bag.

Another possibility is energy measurement. If you have had your students calculate calories by burning peanuts, cheese puffs, or marshmallows (a measured mass of food is burned under a can holding 100 mL of water, and the temperature and mass changes allow calculation of calories per gram), try having them burn a different type of food sample for the test. I have given them cone-shaped corn snacks (both the low-calorie and regular versions) and asked them to determine which is which from their lab data; a worked example of the arithmetic follows at the end of this section.

For molarity testing, supply the students with an unknown saltwater sample. The water can be boiled or evaporated off, and with the appropriate mass measurements taken, the molarity can be calculated; this calculation is also sketched below.

Starting to use performance evaluation proved formidable and scary, but I have found far too many rewards to stop now. My students pay more attention during laboratory work. Students who hate this testing the first time they go through it demand the format for the remainder of the school year. I have a more accurate view of who really understands and who needs more work. For further ideas, research findings, and information on performance assessment, I have compiled a list of suggested readings. Including performance evaluation in your science testing has merit and can be rewarding to both you and your students.
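The energy-measurement test above is a direct application of heat transfer to the water. As a minimal worked example (the numbers here are invented for illustration, not data from my classes), suppose 0.50 g of snack is burned and the 100 mL of water, taken as 100 g assuming a density of 1.00 g/mL, warms by 12.0 °C:

$$q = m_{\text{water}}\, c\, \Delta T = (100\ \mathrm{g})(1.00\ \mathrm{cal\ g^{-1}\ {}^{\circ}C^{-1}})(12.0\ {}^{\circ}\mathrm{C}) = 1.2 \times 10^{3}\ \mathrm{cal}$$

$$\frac{1.2 \times 10^{3}\ \mathrm{cal}}{0.50\ \mathrm{g\ of\ snack}} = 2.4\ \mathrm{kcal/g}$$

Because one food Calorie equals one kilocalorie, students can decide which snack is the low-calorie version simply by comparing these per-gram values.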
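The molarity test works the same way. A minimal sketch, again with invented illustrative numbers: suppose a 25.0 mL portion of the unknown saltwater leaves 1.46 g of NaCl residue after the water is boiled off. Then

$$n_{\mathrm{NaCl}} = \frac{1.46\ \mathrm{g}}{58.44\ \mathrm{g\ mol^{-1}}} = 0.0250\ \mathrm{mol}$$

$$M = \frac{n}{V} = \frac{0.0250\ \mathrm{mol}}{0.0250\ \mathrm{L}} = 1.00\ \mathrm{M}$$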


Acknowledgments

Thanks to my chemistry colleagues at Kirkwood High School—Stephanie Robert, Beth Bock, and Bob Becker—for creative ideas and feedback. Thanks also to Burak Taysi for his review and comments.

Literature Cited

1. Stiggins, R. J. Appl. Meas. Ed. 1991, 4(4), 263–273.
2. Erickson, F. Urban Rev. 1986, 18, 117–124.
3. Torrance, H. Ed. Policy Anal. 1993, 15, 81–90.
4. McComas, W. E. Ed. Urban Soc. 1989, 22, 72–82.

Suggested Readings

1. Aschbacher, P. R. Performance assessment: State activity, interest, and concerns. Appl. Meas. Ed. 1991, 4, 275–288. Discusses assessment as a potent tool, with costs, logistics, and other concerns as obstacles.
2. Brandt, R. On performance assessment: A conversation with Grant Wiggins. Ed. Leadership 1992, May. Grant Wiggins, acknowledged by many as the “assessment guru”, answers several questions regarding assessment.
3. Briscoe, C. Using cognitive referents in making sense of teaching: A chemistry teacher’s struggle to change assessment practices. J. Res. Sci. Teach. 1993, 30, 971–987. Discusses one teacher’s struggle to understand assessment, his role as assessor, and related problems of implementing change in the chemistry classroom.
4. Cheek, D. W. Evaluating learning in STS education. Theory Into Practice 1992, 31, 64–72. Describes different types of assessment with respect to science, technology, and society.
5. Doran, R. L.; et al. Alternative assessment of high school laboratory skills. J. Res. Sci. Teach. 1993, 31, 1121–1131. Focuses on developing and validating instruments to assess laboratory skills in high school science courses.
6. Roth, W.; Roychoudhury, A. The development of science process skills in authentic contexts. J. Res. Sci. Teach. 1993, 30, 127–152. Finds that students develop higher-order process skills when performing experiments in authentic contexts.
7. Shavelson, R. J.; Baxter, G. P. What we’ve learned about assessing hands-on science. Ed. Leadership 1992, May. Cautions that, unless carefully constructed, assessments alone are not likely to improve achievement.
8. Shavelson, R. J.; Baxter, G. P. Performance assessment in science. Appl. Meas. Ed. 1991, 4, 347–362. Presents and applies guidelines for developing performance assessments aligned with alternative-assessment research and reform in science.
9. Wiggins, G. Creating tests worth taking. Ed. Leadership 1992, May. Gives a list of key questions test writers can use in designing assessments.

Journal of Chemical Education • Vol. 75 No. 1 January 1998 • JChemEd.chem.wisc.edu