Just as millions of American college students are about to rate the teaching abilities of their professors this month, a pair of University of Washington researchers say such evaluations are flawed and often misused.
Instructors who teach demanding courses, which tend to be concentrated in science, mathematics and engineering, are often penalized with undeservedly low ratings, while teachers of easier courses are often rewarded with unfairly high ratings, according to Anthony Greenwald and Gerald Gillmore.
“Instructors of science and math suffer the worst under the current evaluation system and are at the bottom of the ratings because they teach tough courses, give lower grades and demand a lot of hard work,” explains Greenwald, a psychology professor.
“Our research has confirmed what critics of student ratings have long suspected, that grading leniency affects ratings. All other things being equal, a professor can get higher ratings by giving higher grades,” adds Gillmore, director of the UW’s office of educational assessment.
The two researchers’ criticisms, which are counter to much prevailing opinion in the educational community, stem from a new study of evaluations from 600 classes representing the full spectrum of undergraduate courses offered at the UW. Their study is described in a paper being published in the December issue of the Journal of Educational Psychology and in two papers published in a special section edited by Greenwald in the November issue of the American Psychologist.
Despite shortcomings in most rating systems now being used, Greenwald and Gillmore don’t advocate abandoning student evaluations. They believe that ratings are needed and clearly have valid components. What they propose is to reform how ratings are done, and a number of their ideas already have been incorporated into the evaluation forms now used at the University of Washington.
“What we most want to know is how much students have learned. However, asking students to evaluate what they have learned is like asking hospital patients to judge medical care they’ve received,” explains Greenwald. “Hospital patients certainly know how they feel about the treatment they’ve received, but those feelings can be due more to the physician’s bedside manner than to the physician’s medical expertise. Similarly, we need to be careful not to confuse classroom manner with teaching skill.”
The UW researchers also contend that information provided by ratings can be and often is misused because college personnel committees don’t understand how student evaluations are influenced by factors such as grades and a teacher’s ability to entertain, and how ratings can overlook some important ingredients of instruction.
“The process of good teaching in science and math is more businesslike and there is a heavier workload outside the classroom, while instruction in other kinds of classes can be more entertaining,” says Gillmore. “My concern is that we are using a system that penalizes an instructor who may actually have the best approach for a particular kind of course.”
The University of Washington has a “vested” interest in student evaluations. It was, along with Harvard, a pioneer in the use of evaluations back in the 1920s under the leadership of Edwin Guthrie, a UW psychology professor and later dean of the graduate school. In the past 30 years, evaluations have become commonplace on American campuses. They are used to give instructors feedback on the quality of their teaching; have become an important consideration in matters of faculty tenure, salary increases and promotion; and also provide information to students.
The UW research provides ammunition to recent critics who have argued that teachers’ desires to obtain high ratings lead them to “dumb down” courses, producing what Greenwald and Gillmore describe as a “higher education lite” for the 1990s.
“One likely impact is that evaluations may encourage faculty to grade easier and make course workloads lighter to get higher evaluations,” says Greenwald. “The end effect to the consumer — the student — may not really serve the educational system or society.”
In their study, Greenwald and Gillmore looked at 600 classes — about 200 in each of three consecutive academic quarters — while introducing some new measurement tools to estimate the influence of lenient grading on the evaluation process.
Among these was a question that asked students how many total hours per week they spent on a course, giving them 11 choices ranging from under two hours to 22 or more hours. In addition, students were asked how the grade they expected from a class compared to their grades in other courses. These questions have now been included in the UW’s standard evaluation forms.
Greenwald and Gillmore’s research was partially funded by the National Science Foundation and the National Institute of Mental Health.
A supplementary overview of the Greenwald-Gillmore research is available via the World Wide Web at http://faculty.u.washington.edu/agg/paingain/supplement.html
Additional sources of information at the UW:
Fred Campbell, dean of undergraduate education, (206) 616-7175 or email@example.com
Debra Friedman, associate dean of undergraduate education, (206) 616-7175 or firstname.lastname@example.org
Mark McDermott, chair of the UW Faculty Senate and physics professor, (206) 543-2442 or email@example.com
Earl Hunt, psychology professor and member of a faculty senate committee studying student evaluations, (206) 543-8995 or firstname.lastname@example.org
Defenders of the current evaluation system:
William McKeachie, University of Michigan professor of psychology, (313) 763-0218 or (313) 426-8818
Philip Abrami, professor and director of the Centre for the Study of Learning and Performance, Concordia University, Montreal, Quebec, (514) 848-2000 or email@example.com
Critic of the current system:
Stephen Ceci, Cornell University professor of developmental psychology, (607) 255-0828 or firstname.lastname@example.org