Mary-Lee Mulholland (Mount Royal University)

The Utopia of Peer Evaluations of Teaching – A Cautionary Tale

In The Utopia of Rules: On Technology, Stupidity, and the Secret Joys of Bureaucracy, David Graeber (2015) reflects on how bureaucracy, through an economy of paperwork, evaluation, and performance reviews, has spread from the corporate sector to the government, not-for-profit, and education sectors. Universities are not exempt from this bureaucratic boom; rather, they are ripe for its proliferation. The corporatization of universities through administrative bloat, performance measures, audit culture, bureaucracies of virtue, and discourses of “excellence” has been analyzed and thoroughly critiqued by many anthropologists (Graeber 2015, 2018, Krautwurst 2013, Menzies 2010, Roseman 2010, Shore and Wright 1999). One such economy currently booming is the peer evaluation of teaching used in employment, tenure, and promotion decisions at universities.

As Graeber notes, the rise of many of these bureaucratic systems stems from “good intentions run amok” (2015: 8). This is certainly the case with the bureaucratization of excellence in teaching. Since the 1990s, there has been an important and worthy shift in universities to recognize and invest in teaching as a central part of our work (Bernstein 2008, Hutchings 1995). As a result, we have witnessed the growth of teaching support centres, awards, teaching grants, the Scholarship of Teaching and Learning (SoTL), and an emphasis on student evaluations and student surveys such as the National Survey of Student Engagement (NSSE). This is even reflected within CASCA with the introduction of the CASCA Awards for Teaching Excellence and the formation of the Critical Pedagogy in the Canadian Anthropology Network.

I think the recognition of and support for good teaching are incredibly important, and I am quite passionate about both. It is vital that our teaching, like our research, is well-informed, engaged, and rigorous. After all, it is through our teaching that anthropology makes its biggest impact. However, this emphasis on teaching comes, at least in part, from the neoliberal understanding of universities as service providers (rather than knowledge producers), exemplified by the student-as-consumer phenomenon (Bunce et al. 2017). In this context, “good teaching” practices are packaged under trendy terms such as HIPs (high impact teaching practices), active learning, or service-learning (Mulholland 2016), while the art of the well-developed lecture seems to be underappreciated (Shiva 2021). This focus on good teaching at universities occurs at the same time that instructors are burdened with higher course loads, precarious employment, increased class sizes, and mounting service duties. Likewise, as our teaching and service loads increase, support and time for research are dwindling. This is despite the fact that small class sizes and ongoing engagement with developments in our field are two of the most important factors in “good teaching” (Cadez et al. 2017, Mulholland 2016).

Following the bureaucratic logic described by Graeber, as good teaching becomes more central to the conditions of our employment, it must, therefore, be evaluated. The first measure used to assess teaching was the student evaluation, introduced in the 1960s; within a few decades, universities began using student evaluations in various types of performance reviews, including tenure (Gelber 2020). Almost immediately, there was a backlash from teachers and researchers warning against the pitfalls of student evaluations, including the problems of bias, statistical insignificance, and the inability of students to assess the expertise and pedagogy of their instructors (Heffernan 2022). In fact, many unions and faculty associations today have secured the right for student evaluations to be excluded from formal assessments of teaching. It is in this context that peer evaluations of teaching emerged as a viable alternative (Cavanagh 1996, Hutchings 1995).

In the early 1990s, scholars such as Pat Hutchings with the American Association for Higher Education (1995) began researching and advocating for “peer reviews” of teaching in universities and colleges. Importantly, these early studies focused on using peer reviews of teaching for formative, not summative, purposes. This distinction is key. Formative peer observations are characterized as self-reflective, inquiry-based, experimental, and collaborative (Centra 1993, Iqbal 2014, Yiend et al. 2014). Moreover, highly effective peer observations are reciprocal, with faculty members observing each other’s classes (Yiend et al. 2014). The research on formative peer observations overwhelmingly indicates that these are worthwhile and effective strategies for growth and development in teaching. In contrast, summative peer observations of teaching are used for accountability, performance measurement, and quality assurance in the evaluation of precariously employed or pre-tenure instructors. The research is also very clear that when peer observations are used for summative purposes, many of their benefits are lost (Cavanagh 1996, Centra 2003). This is due to peer bias and subjectivity, reluctance to critique vulnerable colleagues in this context, loss of reflexive and collaborative features, redundancy, and power imbalances. In fact, many of these issues are the same as those raised with student evaluations.

Despite these concerns, summative peer evaluations of teaching are becoming increasingly common at universities across Canada. As a case in point, let me share the bureaucracy of peer evaluations of teaching at my own university. When I began at Mount Royal University in 2010, the tenure process was five years. During this time, I was required to have three peer observations annually, for a total of fifteen over the tenure process. A few years into my tenure, the university reduced this to seven observations over five years (three of these must be completed by the chair, in the first, third, and fourth years). In addition, contract faculty are required to have evaluations completed by the chair or chair-designate every three years (notice the substitution of peer with chair). The evaluations, although said to be both formative and summative, are largely summative and highly redundant: summative because they are used as criteria for employment (for the precariously employed) and tenure, and redundant because there is very little variation between the various observations, as the vast majority are positive. Moreover, at Mount Royal University peer evaluations are the responsibility of tenured faculty and chairs (who in fact do many of these evaluations), which clearly undermines the principle of peer-to-peer observation.

Quite simply, these evaluations are used to check a box indicating whether the person can or cannot teach.  These evaluations only become formative if the instructor is struggling in the classroom and requires significant development in order to meet the standard of a “good teacher.”  In addition, these peer evaluations have become a major (and I would argue unnecessary) service burden on tenured colleagues who are required to do them.  At Mount Royal University, tenured faculty must attend a training workshop on peer evaluations of teaching, attend a pre-observation meeting, observe a class, attend a post-observation meeting, and complete a multi-page form with five sections requiring written observations and analysis. As we are a small university with small departments, doing peer evaluations is a regular part of our service and a particular burden for chairs. 

This year, I chaired a committee for my faculty association that monitors the evaluation of faculty (this includes student and peer evaluations of teaching, annual reports, and tenure and promotion). And, yes, I do see the irony in a committee to evaluate evaluations. We did a quick survey of other universities in Canada and discovered that most require two to three peer evaluations of teaching for pre-tenure faculty and the majority have no requirements for precariously employed faculty. However, many of these universities are increasing the frequency of these observations for pre-tenure faculty (many seem to be moving toward annual observations), introducing them for precariously employed instructors, and building the bureaucracy that goes with them. A quick search of almost any university’s website will turn up training videos, workshops, criteria, forms, reports, and requirements for peer observations of teaching. Peer teaching evaluations are quickly becoming an industry unto themselves.

At this point, allow me to circle back to David Graeber and his article “Are you in a BS job? In academe, you’re hardly alone” (2018) where he states: 

In most universities nowadays—and this seems to be true almost everywhere—academic staff find themselves spending less and less time studying, teaching, and writing about things, and more and more time measuring, assessing, discussing, and quantifying the way in which they study, teach, and write about things (or the way in which they propose to do so in the future). 

With this I caution my colleagues who are looking to embrace peer evaluations of teaching as a counter to student evaluations or as a means to ensure and support good teaching – they are not the affirmation of good teaching you are looking for.  When used for summative purposes, these evaluations are largely bureaucratic and undermine the potential of collaborative and reflexive peer observations of teaching. In short, more summative peer evaluations of teaching will not lead to better teachers. Rather, they take our labour away from what really matters – teaching and research. 


Bernstein, D. J. (2008). Peer review and evaluation of the intellectual work of teaching. Change: The Magazine of Higher Learning, 40(2), 48-51. 

Bunce, L., Baird, A., & Jones, S. E. (2017). The student-as-consumer approach in higher education and its effects on academic performance. Studies in Higher Education, 42(11), 1958-1978.

Cadez, S., Dimovski, V., & Zaman Groff, M. (2017). Research, teaching and performance evaluation in academia: the salience of quality. Studies in Higher Education, 42(8), 1455-1473.

Cavanagh, R. R. (1996). Formative and summative evaluation in the faculty peer review of teaching. Innovative Higher Education, 20(4), 235-240.

Centra, J. A. (1993). Reflective faculty evaluation: Enhancing teaching and determining faculty effectiveness. Jossey-Bass.

Gelber, S. M. (2020). Grading the college: A history of evaluating teaching and learning. Johns Hopkins University Press.

Graeber, D. (2015). The utopia of rules: On technology, stupidity, and the secret joys of bureaucracy. Melville House.

Graeber, D. (2018). Are you in a BS job? In academe, you’re hardly alone. The Chronicle of Higher Education, 6.

Heffernan, T. (2022). Sexism, racism, prejudice, and bias: a literature review and synthesis of research surrounding student evaluations of courses and teaching. Assessment & Evaluation in Higher Education, 47(1), 144-154.

Hutchings, P. (1995). From idea to prototype: the peer review of teaching: a project workbook. AAHE Teaching Initiative, American Association for Higher Education.

Iqbal, I. (2013). Academics’ resistance to summative peer review of teaching: questionable rewards and the importance of student evaluations. Teaching in Higher Education, 18(5), 557-569.

Krautwurst, U. (2013). Why we need ethnographies in and of the academy: Reflexivity, time, and the academic anthropologist at work. Anthropologica, 261-275.

Mulholland, M. L. (2016). Do These HIPS Lie?: Neoliberalism, Academic Plans and a Budget Crisis at Mount Royal University. Culture, 10(1). https://cascacultureblog.wordpress.com/2016/05/04/these-hips-do-lie-neoliberal-rhetoric-rankings-and-the-budget-crisis-at-mount-royal-university/

Menzies, C. R. (2010). Reflections on Work and Activism in the ‘University of Excellence.’ New Proposals: Journal of Marxism and Interdisciplinary Inquiry, 3(2), 40-55.

Roseman, S. R. (2010). Introduction: New perspectives on the business university. New Proposals: Journal of Marxism and Interdisciplinary Inquiry, 3(2), 5-8.

Shore, C., & Wright, S. (1999). Audit culture and anthropology: Neo-liberalism in British higher education. Journal of the Royal Anthropological Institute, 557-575.   

Shiva, A. (2021). Remote Teaching and the Revival of Time-Tested Styles and Tools. Culture, 15(1). https://cascacultureblog.wordpress.com/2021/05/13/remote-teaching-and-the-revival-of-time-tested-styles-and-tools/
