Student Evaluations of Professors Should be Illegal

Repeated studies have demonstrated their bias against women and minorities. Why do we use them to make employment decisions?

Texas Tech political scientist Kristina Mitchell advances a novel argument as to why “Student Evaluations Can’t Be Used to Assess Professors.”

Imagine that you’re up for a promotion at your job, but before your superior decides whether you deserve it, you have to submit the comments section of an internet article that was written about you for assessment.

Sound a little absurd?

That’s in essence what we ask professors in higher education to do when they submit their teaching evaluations in their tenure and promotion portfolios. At the end of each semester, students are asked to fill out an evaluation of their professor. Typically, they are asked both to rate their professors on an ordinal scale (think 1-5, 5 being highest) and provide written comments about their experience in the course.

In many cases, these written evaluations end up sounding more like something out of an internet comments section than a formal assessment of a professor’s teaching. Everything from personal attacks to text-speak (“GR8T CLASS!”) to sexual objectification has been observed by faculty members who dare to read their evaluation comments at the end of the semester.

Okay, that’s not novel. We’ve known that for years. But she offers an interesting twist on something else we’ve long known:

But the fact that the evaluations can be cruel and informal to the point of uselessness isn’t even the problem. The problem is that there’s a significant and observable difference in the way teaching evaluations treat men versus women.

A new study I published with my co-author examines gender bias in student evaluations. We looked at the content of the comments in both the formal in-class student evaluations for his courses as compared to mine as well as the informal comments we received on the popular website Rate My Professors. We found that a male professor was more likely to receive comments about his qualification and competence, and that refer to him as “professor.” We also found that a female professor was more likely to receive comments that mention her personality and her appearance, and that refer to her as a “teacher.”

The comments weren’t the only part of the evaluation process we examined. We also looked at the ordinal scale ratings of a man and a woman teaching identical online courses. Even though the male professor’s identical online course had a lower average final grade than the woman’s course, the man received higher evaluation scores on almost every question and in almost every category.

Think back to that promotion that you only get if you turn in the comments section on that article someone wrote about you. If you’re a woman, your comments are going to talk about whether you’re nice or rude and whether you’re hot or ugly, while for men, the comments will talk about how qualified you are. And on a scale of 1-5, a man is going to receive ratings that are, on average, 0.4 points higher than a woman.

That women are evaluated more harshly than men on these things isn’t surprising. And, even though I frequently get comments on my sartorial choices on evaluations, I’m not the least bit surprised women get much more of that.

Here’s where Mitchell (and Jonathan Martin, her co-author on the study) go beyond what I’ve seen before:

[W]e certainly are not the first study to look at the ways that student evaluations are biased against female professors. But we might be among the first to make the case explicitly that the use of student evaluations in hiring, promotion, and tenure decisions represents a discrimination issue. The Equal Employment Opportunity Commission exists to enforce the laws that make it illegal to discriminate against a job applicant or employee based on sex. If the criteria for hiring and promoting faculty members is based on a metric that is inherently biased against women, is it not a form of discrimination?

It’s not just women who are suffering, either. My newest work looks at the relationship between race, gender, and evaluation scores (initial findings show that the only predictor of evaluations is whether a faculty member is a minority and/or a woman), and other work has looked at the relationship between those who have accented English and interpersonal evaluation scores. Repeated studies are demonstrating that evaluation scores are biased in favor of white, cisgender, American-born men.

This is not to say we should never evaluate teachers. Certainly, we can explore alternate methods of evaluating teaching effectiveness. We could use peer evaluations (though they might be subject to the same bias against women), self-evaluation, portfolios, or even simply weigh the evaluation scores given to women by 0.4 points, if that is found to be the average difference between men and women across disciplines and institutions. But until we’ve found a way to measure teaching effectiveness that isn’t biased against women, we simply cannot use teaching evaluations in any employment decisions in higher education.

Evaluating teaching effectiveness is incredibly subjective. But study after study after study has demonstrated that student evaluations are all but worthless—and, yes, biased. Given that, I find Mitchell and Martin’s argument that using them in making hiring, retention, and promotion decisions amounts to illegal discrimination compelling.

FILED UNDER: Education, Gender Issues
James Joyner
James Joyner is a Professor of Security Studies. He's a former Army officer and Desert Storm veteran. Views expressed here are his own. Follow James on Twitter @DrJJoyner.

Comments

  1. CSK says:

    A colleague of mine once remarked that student evaluations were used to justify hiring and firing decisions that were made on other grounds.

  2. Franklin says:

    Properly evaluating professors or even grade school teachers would require tedious work, so nobody does it. It would take a competent administrator (preferably one who knows the subject material well) sitting in the actual classroom, or at least watching hours of boring video.

    Interviewing students might provide *some* insight, but using anonymous student evaluations has more minuses than pluses.

  3. Hal_10000 says:

    This seems to be another iteration of, “this mechanism is poor, so let’s throw it away”. Student evaluations are a mess, but I would not throw them away completely because I’m averse to throwing away information. Sometimes, they do contain useful information (example: I dress more professionally when I teach than when I started because of comments on the evaluations; and I’ve noticed it improves student attention).

    I agree that there are biases. But, as with other things, student evaluations are not the *only* thing decisions are made on. One has to take a view that incorporates grades, retention, evaluations, etc. I think the corrective mechanism is to account for the biases; to know that evaluations of a female professor will be harsher for reasons that have nothing to do with her teaching skill. But I would be very hesitant to throw the evaluations completely into the garbage.

  4. gVOR08 says:

    It’s of a piece with the times. Every time I see my doctor I get an email with an evaluation questionnaire. Management cares deeply about the opinion of patients. They don’t give a damn about quality of care, but they care deeply about return business. Same thing when I take my car in for service, after service people visit my home, after calls to help lines, and after many purchases. It’s marketing uber alles and it isn’t going away.

  5. Grewgills says:

    I can definitely speak to the student evals being taken very seriously. Perhaps the only thing taken more seriously is retention (number of withdrawals and grades below C). If kids fail or withdraw, often they don’t come back and the school doesn’t get their money. If the school is private and doesn’t have the protection of a big endowment those concerns are magnified.
    Professors (all teachers really) have to build student good will to maintain the eval levels necessary to keep admin happy. Some professors manage this by making courses easier, giving out more As and Bs. My conscience won’t let me do that, so I end up spending considerable extra time working with the students to manage that.
