Education
Educators assess public knowledge of science
Current study, third to be made since NAEP program began, will probe attitudes toward interface of science, persistent societal problems
If present trends continue, the general public more and more will become involved directly or indirectly with scientific-technological issues—as with nuclear power and recombinant DNA. Just what the coming adult generation knows about such issues and what its basic level of knowledge of science is will be determined this school year by the National Assessment of Educational Progress (NAEP). And in addition to NAEP's regular large-scale assessment, a special probe to be administered this spring will measure knowledge and attitudes concerning energy.
The current assessment of public knowledge of science is the third to be made since NAEP's program began. A second assessment, in 1972-73, revealed an average decline of 2 to 3% from the first assessment in 1969-70. Identical test items were used in the first two assessments and resulted from objectives for the assessments developed earlier. For the third assessment, objectives differ considerably.
Details of the current assessment and its place in the NAEP program were provided in Denver late last month at the 143rd Annual Meeting of the American Association for the Advancement of Science.
Norris C. Harms of the Bureau of Educational Field Services at the University of Colorado notes that perhaps the audience needing to be most concerned about the general public's understanding of science is the scientific community itself. "As decisions regarding specific research and technological developments are being made to a greater and greater extent in the public arena," he says, "we must be concerned if the citizens who will be involved in making those decisions in the very near future are perhaps even less well prepared than are the citizens who are now grappling with such issues."
To determine whether or not they are less well prepared—or better prepared—is the overall goal of NAEP.
NAEP stems from initiatives taken in the early 1960's. As a result of a series of conferences organized by the U.S. Office of Education, a group of educators and concerned persons in 1964 formed the Exploratory
Committee on Assessing the Progress of Education (ECAPE). That committee, funded by Carnegie Corp. and the Ford Foundation, explored the problems of an assessment, developed a detailed plan for conducting it, and constructed the first assessment test items. In 1969—the year of the first assessments, in science, writing, and citizenship—the Education Commission of the States assumed governance of the project, which was then renamed NAEP, and the U.S. Office of Education (USOE) assumed funding responsibility. In 1971, funding and monitoring responsibility shifted to the National Center for Education Statistics. The Education Commission of the States is a compact of 43 member states and territories for political and professional cooperation in planning the future of U.S. education.
NAEP differs considerably from the usual achievement testing. The assessment, NAEP's Robert C. Larson explains, estimates the percentages of nine-, 13-, and 17-year-olds, and young adults (ages 26 to 35) who can respond to a question. The data are based on national probability samples—rather than on samples of those electing to buy a certain test or those included in some special program. The test items are readministered periodically to the same age groups, not to the same people. And, as distinguished from local or state curriculum development projects, assessment objectives and the test items measuring them are based on a national consensus of what is considered important by scholars, accepted as an educational task by schools, and considered desirable by thoughtful citizens.
So far, two assessments have been performed for science, writing, reading, citizenship, mathematics, and social studies, and one each for literature, music, career and occupational development, and art. Work on each assessment begins well before the assessment is given, with development of the objectives.
Thus, objectives developed for science in 1965 resulted in the test items used in 1969-70 and in 1972-73, which provided data on the change occurring from one assessment to the next. Even before the 1969-70 baseline data were available, NAEP began redeveloping the science objectives that would result in test items measuring change between 1972-73 and 1976-77. In 1974, science objectives were addressed again in a look forward to test items that would measure change between 1976-77 and some future year. Norris Harms notes that there are several significant differences between the objectives and test items of the current assessment and those of the first two. For one thing, whole new categories have been
developed to assess knowledge and attitudes concerning the interface of science, technology, and the persistent societal problems of population, environment, energy, and the like. There also is considerable emphasis on the affective domain, which addresses such variables as attitudes toward science classes and attitudes toward science teachers. One general trend across the three assessments, Harms says, has been toward a broader definition of science, and therefore a larger domain to be measured. The general area of process—skills such
Science assessment has nine cognitive categories . . .
Category                            Emphasis, %
Biology                             15
Physical science                    15
Earth science                       10
Multidisciplinary science           10
Processes and methods of science    18
Science and societal problems       12
Science and self                     7
Science and technology               7
Decision making                      6
. . . with content chosen on the basis of these criteria
• Contributes substantially to the understanding of the nature of a subject area.
• Entails a key concept for large bodies of knowledge.
• Has broad application beyond the curricula area.
• Is personally relevant and applicable—contributes to the survival, wellbeing, and quality of life of the individual.
• Is useful in potential career preparation.
• Is likely to show changes in achievement from emerging trends in science education.
• Contributes to an effective base for decision making at all levels of society.
• Contributes to the understanding of self.
• Contributes to understanding the nature, potential, and limitations of science.
• Produces data that have potential impact on educational policy.
March 14, 1977 C&EN
as observation, measurement, design of experiments, interpretation of data, classification, and others—received considerably more attention in the second assessment than in the first and is getting still more attention in the third. Another trend across the three assessments, Harms says, has been one toward higher cognitive levels of the test items—more emphasis on application of principles and on analytic skills.
Three distinct components make up the 1976-77 science objectives, Harms explains. One is a general matrix that defines the domain of the assessment and breaks it up into nine cognitive areas, affective areas, and an experiences area. The second is a set of criteria generally reflecting various goals and philosophies of science educators and providing the basis for development of test items. Third is several hundred objective sample test items representing the various parts of the matrix and written such that test items can be developed from them.
Describing the nine cognitive categories, Harms explains that the multidisciplinary science section includes test items measuring understanding of unifying concepts. The science-and-self section includes those topics in health and safety education that are most likely to be taught in science classrooms. The science and technology section includes test items measuring knowledge of the contributions technology has made to quality of life and of future promise and danger of various technological endeavors. The decision-making area measures students' abilities to apply formal decision-making skills, such as definition of problems, identification of constraints, identification of goals, and identification of information needed to reach a decision.
Harms points out that one of the major goals of science education, especially at the elementary level, is to provide for students' concrete experiences with the natural world and hands-on investigation of the natural world. Thus, a section is included in the assessment to ask students whether they have had various kinds of experiences—for example, whether they have ever done experiments with seeds, or carried out activities with magnets, batteries, bulbs, or laboratory balances, or taken their own temperatures or their own pulses.
Perhaps of special interest to the scientific community are the affective sections in the assessment. The eight areas, Harms explains, include:
• Attitudes toward science classes: Do students think science classes are enjoyable, useful, individualized, and do students engage in extracurricular activities?
• Vocational and educational intentions: Do students plan more science- or technology-related study, and do they plan to enter a science-related field?
• Personal involvement: Are students involved in science-related societal problems such as environment, energy, or disease prevention?
• Tools-attributes: Do students utilize
scientific skills and attributes in daily life?
• Confidence in science: Do students have faith in the ability of science and technology to solve problems?
• Support of research: Do students think scientists should be given resources to do basic and applied research?
• Controversial issues: Where do students stand on work in areas such as genetic manipulation, cloning, or space exploration?
• Awareness: Are students aware of the empirical nature of science, for example?
The working out of the plan for the assessment was an involved affair. The cognitive matrix and list of criteria were put together in the summer of 1974 by a group of 10 science educators. Included in the group were representatives of the National Science Foundation, the National Science Teachers Association, the National Association for Research in Science Teaching, the National Association of Biology Teachers, the American Association for the Advancement of Science, a curriculum developer, and a science coordinator of a large suburban school district.
Following development of the matrix and criteria, groups of science educators in each of the cognitive areas were convened to develop the specific objective samples that would lead to development of test items. Two additional meetings dealt specifically with objectives for nine-year-olds. All told, more than 50 people were involved in development of objectives, and many of them, along with about 50 others, were involved in the later development of specific test items.
NAEP's overall assessment design, Robert Larson explains, can be characterized as "a deeply stratified three-stage probability sample design with selection probabilities proportional to estimated size of the sampling units at the first two stages." At the first stage, NAEP selects a deeply stratified sample of primary sampling units (PSU's), each having geographic boundaries that correspond to one or more contiguous county boundaries.
At the second stage, schools are selected within each PSU, with the constraints that schools cannot be sampled more than once every four years and every state must be included at least once in every four years. At the last stage, students are selected with equal probability within the selected schools.
PSU's are selected through a complex procedure. Essentially, the first stratification of PSU's is by geographic region. Within regions, PSU's are classified into five size-of-community categories.
The number of PSU's, schools within PSU's, and students within schools, Larson explains, is determined by optimum sampling principles. Subject to constraints, NAEP tries to select a design that will maximize precision of estimates per unit cost. Thus, the 1976-77 science assessment requires 77 PSU's, about 1500 schools, and about 80,000 students, each taking one of 31 45-minute packages of test items.
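For readers who want to see the shape of such a design, the three stages Larson describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NAEP's procedure: the function names, the toy sampling frame, the use of enrollment as the "estimated size," and the choice of successive weighted draws without replacement are all hypothetical. The article does not say how the proportional-to-size selection is actually carried out, and the once-in-four-years constraints on schools and states are left out.

```python
import random


def pps_draw(units, size_of, k, rng):
    """Draw k distinct units; each draw is proportional to size among those left.

    Assumption: successive weighted draws stand in for whatever
    probability-proportional-to-size method is really used.
    """
    pool = list(units)
    chosen = []
    for _ in range(min(k, len(pool))):
        weights = [size_of(u) for u in pool]
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
    return chosen


def enrollment(psu):
    """Hypothetical size measure for a PSU: total students in its schools."""
    return sum(len(school["students"]) for school in psu["schools"])


def three_stage_sample(strata, psus_per_stratum, schools_per_psu,
                       students_per_school, seed=0):
    """Return (stratum, PSU, school, student) tuples for the selected students."""
    rng = random.Random(seed)
    sample = []
    for stratum, psus in strata.items():
        # Stage 1: PSUs within each stratum, proportional to estimated size.
        for psu in pps_draw(psus, enrollment, psus_per_stratum, rng):
            # Stage 2: schools within the PSU, proportional to estimated size.
            schools = pps_draw(psu["schools"], lambda s: len(s["students"]),
                               schools_per_psu, rng)
            for school in schools:
                # Stage 3: students with equal probability within the school.
                n = min(students_per_school, len(school["students"]))
                for student in rng.sample(school["students"], n):
                    sample.append((stratum, psu["name"], school["name"], student))
    return sample


def toy_frame(regions=2, size_categories=5, psus=4, schools=6):
    """Made-up frame: strata are region x size-of-community category."""
    strata = {}
    for r in range(regions):
        for c in range(size_categories):
            stratum = f"region{r}-size{c}"
            strata[stratum] = [
                {
                    "name": f"{stratum}-psu{p}",
                    "schools": [
                        {
                            "name": f"{stratum}-psu{p}-school{s}",
                            "students": [
                                f"{stratum}-psu{p}-school{s}-student{i}"
                                for i in range(20 + 10 * s)
                            ],
                        }
                        for s in range(schools)
                    ],
                }
                for p in range(psus)
            ]
    return strata


if __name__ == "__main__":
    picked = three_stage_sample(toy_frame(), psus_per_stratum=2,
                                schools_per_psu=3, students_per_school=5,
                                seed=1977)
    print(len(picked), "students selected")
```

The toy frame is far smaller than the 77 PSU's, 1500 schools, and 80,000 students of the actual assessment; it serves only to show how stratification, size-weighted selection at the first two stages, and equal-probability selection at the last stage fit together.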