What Do Chemistry Professors Think about Evaluation of Instruction?

M. E. Schaff and B. R. Siebring
University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201

The volume of published research and opinion on the evaluation of instruction attests to the considerable interest in this topic. As mentioned in the preceding paper, however, almost none of the published material is specifically about chemical education or by chemistry instructors. In order to learn something about the degree to which evaluation of instruction is going on in chemical education and to obtain the opinions of some chemical educators on the validity of the various methods of evaluation, we contacted 200 chemistry professors by questionnaire. The list of 200 consisted of 100 professors who were also department chairmen, and 100 professors who were also authors of general chemistry textbooks. Chairmen were selected because presumably they would have had experience in conducting evaluations or in attempting to utilize the resulting data. General chemistry textbook authors were selected because such authors are generally individuals with considerable experience in teaching. These two groups, it was felt, would provide the opinions both of those being evaluated and of those who might use the data for salary and promotion recommendations.

Of the 200 questionnaires sent out, 88 were returned promptly. An additional 57 replies were received after a follow-up letter was mailed. A total of 129 replies provided useful information; of these, 76 were from department chairmen and 53 were from authors. (These two groups are referred to hereafter simply as "chairmen" and "professors.")

The questionnaire provided a list of nine common methods of gathering information about instruction, requested the additional listing of other suitable methods, and then asked several questions about the methods. The nine originally listed were

1) Student ratings based upon objectives, organization, and presentation of the course, evaluation (grading) methods, friendliness of the instructor, etc.
2) Former student ratings based upon objectives, organization, and presentation of the course, evaluation (grading) methods, friendliness of the instructor, etc.
3) Ratings based upon classroom visits by colleagues and administrators
4) Comparison of pre- and post-course testing results to ascertain how much the student has learned by taking the course
5) Comparison of the performance on departmental examinations of students who have received their instruction in the same course from different instructors
6) Comparison of the performances of students in advanced courses, whose backgrounds are the same or similar, but who have had different instructors
7) Critical examination of textbooks, syllabi, examinations, and other pedagogical material used in the course
8) Critical examination of pedagogical material published by the instructor (textbooks, laboratory manuals, study guides, articles in Journal of Chemical Education, etc.)
9) Evaluation of the participation of the instructor in committee work related to instruction

An additional 13 methods were listed by respondents. These were

10) Scores on nationally standardized tests
11) Interviews with seniors and graduate students
12) Innovations in instruction
13) Attendance at institutes
14) Informal information gathering
15) How well he teaches other faculty members
16) Subsequent success of students
17) Letters from alumni (Dean requests)
18) Letters or rating forms from faculty
19) Letters from students
20) Self-evaluation by instructor
21) Evaluation by colleagues
22) Rating of instructor by teaching assistants

Many of these are essentially modifications of the nine already listed, and most were suggested only once. Only three were, in fact, mentioned more than once. They were

10) Scores on nationally standardized tests
11) Interviews with seniors and graduate students
18) Letters or rating forms from faculty

The questions which were asked of the chairmen and professors, together with summaries of answers received, follow.

Question a. Which of the above methods or combination of the above methods have you used regularly and systematically (that is, your experience amounts to more than casual observation)?

Table 1 presents the results from this question. Only 7 of the 129 replied that they had had no experience with any method of evaluation. It should be remembered, of course, that only 2 out of 3 of the persons to whom we sent questionnaires responded. The follow-up letter which was sent to those who did not respond to the first mailing said in part, "Please do not feel that if your institution has not used systematic evaluation procedures that this information would not be of use to us. This would be a very important piece of information." In spite of this it may be that many among the 35% who did not respond chose not to do so because they had had little experience with evaluation.

Eighty-seven percent replied that they had used student ratings regularly and systematically. Contrasted with this, only two other methods (former student ratings and classroom visitations) were or had been in use by one in five of the respondents. When replies and their corresponding percentages were broken into the two groups, professors and chairmen, noticeable differences were observed with respect to three methods only, namely 2, 5, and 7. Two of these, methods 2 (former student ratings) and 7 (examination of pedagogical material used), must, by their very nature, be carried on outside of the classroom, often by an administrator. Thus the fact that department chairmen indicated more experience with these methods than did the professors is understandable. A reason for the greater preference of the professors for departmental examination comparisons is not as obvious.

Table 1. Question a

Method    All   % of All Replies   Chairmen   % of Chairmen   Professors   % of Professors
Totals    129                        76                          53
None        7          5              5             7              2               4
1         112         87             64            84             48              91
2          28         22             20            26              8              15
3          35         27             24            32             11              21
4           7          5              2             3              5               9
5          19         15              8            11             11              21
6          10          8              6             8              4               8
7          22         17             17            22              5               9
8          12          9              6             8              6              11
9          19         15             11            15              8              15

Question b. Do you regard your evaluations a success?

Table 2 summarizes the answers to this question, but only for those methods with which a significant number of respondents had claimed familiarity. These were

1) Student ratings
2) Former student ratings
3) Classroom visitation
5) Performance on departmental examinations
7) Examination of material used in the course
9) Evaluation of committee work of instructor

Table 2. Question b

Method   Familiar with Method
1                112
2                 28
3                 35
5                 19
7                 22
9                 19

(The columns summarizing how many judged each method adequately, partially, or not successful are not legible in this copy.)

Approximately half of those replying claimed to have found the methods which they had employed adequately successful. It should be noted that for former student ratings and for classroom visitations, the percentage who considered the method successful after it was tried was greater than the percentage who called student evaluation successful after trying that. An opinion as to the success of a given method of evaluation is related to desired objectives. The questionnaire did not raise the question of objectives or of use of data obtained, but many of the comments received were directed to this point, suggesting that in order to place first things first, goals and objectives must be clearly understood before a method of evaluation is chosen. Breaking the replies into those of the two groups, professors and chairmen, contributed little further information. The proportion judging methods adequately, partially, or not successful in each group corresponded approximately to the proportions of all replies.

Question c. If your answer to Question b is no, what did you find unsatisfactory about your evaluation?

A total of 25 expressed various opinions as to the reasons for failure of a specific method. Since many more had had experience with student evaluations than with any other method, comments about this method predominated. Some of these were

1) Student evaluations turn out to be popularity contests. A poor teacher who becomes "buddy-buddy" with students may receive high ratings. Students are overly impressed by popularity, personality, and appearance.
2) There is some correlation of response on questionnaires to grade in the course.
3) Students cannot separate course objectives, course standards, and their own abilities.
4) A good instructor who has to teach a difficult course will not fare well with student evaluations.
5) Students are relatively uncritical.
6) Freshmen are too immature to evaluate teaching effectiveness.
7) Can freshmen distinguish between showmanship and effective teaching?
8) A student's motives for taking a course affect his evaluation. For example, a student required to take a course will rate it lower than one who takes it willingly.
9) Younger teachers are rated much higher than older teachers with essentially the same qualities of teaching effectiveness.
10) Students give remarkably similar ratings to instructors who are known to be greatly different in teaching effectiveness.

Question d. If your answer to Question a was method 1 (student ratings), who designed the questionnaire? Who administered the questionnaire?

Some 60% of the student evaluations utilized questionnaires designed by the faculty, administration, or college testing service. In 36% of the cases, questionnaires were designed by students. The administration of the evaluation questionnaires showed a parallel division, with 58% of the evaluations being administered by faculty, administration, or college testing service, and 36% by students. (The remaining cases involved shared responsibilities between students and faculty.)

Question e directed similar queries to those familiar with method 2 (rating of former students), but replies were not numerous enough to be significant.

Question f. Regardless of which of the above methods you have used, what method do you regard as the most reliable?

Table 3 presents the interesting replies to this question. Fourteen different methods were voted most reliable by someone. A striking fact came to light with these figures. While 87% of all respondents indicated that they were using or had used student evaluations, only 37% of all replies voted for it as most reliable. For comparison, 22% of the whole group replied that they had used or were using method 2 (ratings of former students), and exactly the same percentage listed it as most reliable! Fourteen, or 11%, of the group refused to select one method, recommending, instead, a combination of methods. The choice of a combination was not suggested in the questionnaire. Only through written comments were respondents able to indicate their opinion that no one method alone is adequate, but only a combination of several.

Breaking down the replies into our two categories of chairmen and professors showed that although administrators selected the same methods as did the professors, their support of the first three methods was somewhat less. Methods 4 and 5, which utilize actual achievement tests, were each selected by four professors, but by no chairmen. Nine of the fourteen who insisted that only a combination of methods will yield valid results were chairmen.

Question g. Do you regard any of the above methods as unfeasible because of the difficulty of administration?

Answers to Question g explained to us at least in part why 87% of all respondents had used student evaluations, but only about one-third regarded them as the most reliable method. Only four persons indicated that they felt student evaluations to be unfeasible in administration. None of the other methods received so complete an endorsement on this point. Each of the other methods was considered unfeasible by from 10 to 15% of those replying. Student evaluation is apparently the easiest method to use, relative to the quantity of data obtained.

Question h. If you were to be evaluated yourself, which of the above methods would you want the evaluator to use? Which would you not want him to use?

Table 3. Question f (number of replies and percentage of replies voting each method most reliable; the individual entries are not legible in this copy)


Student evaluations were chosen by 48% of those queried as the method they would prefer to have used on themselves. One might speculate that of this 48%, 33% were those who felt it was the most reliable method, and the other 15% were those who know they come off well even if the method isn't reliable! Former student ratings were chosen by 39% and visitation by colleagues by 24%. There was apparently little strong feeling against any single method, with the exceptions of method 8 (examination of pedagogical materials published) and 9 (participation in committee work), which were voted against by 19% and 27%, respectively.

Question j. Do you think teaching can be evaluated? Check one of the following: YES, and it can be done satisfactorily with the expenditure of a reasonable amount of time, money, and effort. YES, BUT the money, time, and effort required to do the job satisfactorily makes the evaluation of teaching unfeasible. NO, the evaluation of instruction cannot be done satisfactorily even with the expenditure of large amounts of time, money, and effort.

Question j was deliberately placed at the end of the questionnaire to be considered after thought had been given to such things as reliability, feasibility of administration, etc. Table 4 shows the answers and corresponding percentages. Eighty-seven percent of those who replied to this question gave an unqualified "yes." There was little difference between the replies of the two groups, chairmen and professors.

Although the chemistry educators we contacted thus agreed almost completely that teaching can be evaluated, we must bear in mind that only 65% of the recipients of the questionnaire provided these replies. It is impossible to judge what effect the replies from the non-respondents would have had on the statistics of this paper. Certainly if all the non-respondents did not reply because they felt that teaching effectiveness cannot be measured, then the statistics presented here are skewed. Replies which arrived following our follow-up letter, which urged them to reply even if they had not had experience with evaluation, contained a higher percentage of answers from those who were somewhat sceptical about the methods for the measurement of teaching effectiveness.

Interesting comments were included on many questionnaires, showing that respondents had devoted considerable effort and thought to this very complex and frustrating problem. Only a portion of these comments are repeated here.

Table 4. Question j

            All Replies   Percentage
YES             103           87
YES, BUT         11            9
NO                4            3

1) The basic problem is defining good teaching. It does not necessarily equal a smooth lecture style, friendly attitude, etc. Once we decide what we are about, then maybe we can get down to measuring teaching on a realistic basis.
2) We can recognize or identify the very good and the very bad; beyond this we are kidding ourselves.
3) The most important aspect of evaluation is that it makes teachers aware of their shortcomings and exerts pressure to improve.
4) Evaluation of teaching depends upon the enthusiastic cooperation of all three parties involved (faculty, students, and administrators), and it must be accomplished for the right reason: a real and conscientious interest in learning, and not "because everyone else is doing it," or "because we have a lousy bunch of teachers."
5) Intelligently designed evaluation programs (taken seriously, but not too seriously) tend to discourage student-oriented and student-directed evaluations, which are mixtures of a few good ideas and many bad ones.
6) If one discards the idea that student evaluations correlate with teaching effectiveness, one can obtain useful information about student response to a given course or instructor, but only by detailed analysis of the results. To do this properly student anonymity must be abandoned; to date, no one has had the courage to insist on this.
7) Occasionally (but not often) useful suggestions for improvement of a course or a method of instruction can be obtained. More frequently the need for explaining why the course or some portion of it is taught the way it is may become apparent from student comment.
8) It may take effort, but it has to be done. It is only fair to the students and faculty. We can't ignore the need. Our principal function is to educate, to teach in all of its ramifications, and if we don't do it well then we belong in some other career.

The results of our questionnaire are not presented as conclusive answers to any of the questions surrounding the evaluation of teaching effectiveness. They may be taken as an indication of some general opinions, however. We found what seems to be considerable agreement among chemists in education that teaching can be evaluated with the expenditure of reasonable time, money, and effort. But certainly no unanimity of opinion as to the methods to employ was found. Student evaluations are presently being employed for this purpose more frequently than other methods. It is possible, however, that their selection has been more because of their ease of administration than for any certainty about their reliability.

The repeated comment to the effect that no one method should be considered sufficient for any effective evaluation is significant. The lack of agreement as to the use to which the data is to be put was also evidenced through many comments. Some schools apparently found themselves literally pushed into an evaluation program to forestall a less efficient, less well-conceived, student-run program. So the "evaluations" were obtained before the goals and objectives had been defined or agreed upon. There is obviously as much need for consideration of these points as for consideration of the actual mechanics of evaluation.