Reconceptualisation of approaches to teaching evaluation in higher education
Nga D Tran
University of Tasmania
The ubiquitous use of Student Evaluation of Teaching (SET) in higher education is inherently controversial. Issues mostly revolve around whether the instrument is reliable and valid for the purpose for which it was intended. Controversies exist, in part, due to the lack of a theoretical framework upon which SETs can be based and tested for their content validity. In this paper, based on Biggs's 3P model of teaching and learning, three approaches to teaching evaluation were derived, namely, (i) student presage focused, (ii) teaching focused, and (iii) learning focused. Each approach adopts a particular belief about knowledge, perception of teaching, and a distinctive focus of teaching evaluation. We argue that the adoption of a learning focused approach to teaching evaluation is necessary in SET development as this will provide feedback for all parties involved in teaching and learning (the teachers, the administrators, and the students) about what each party needs to do to achieve the intended learning outcomes.
Much of the argument about the "improve-prove" function dichotomy (Barrie, 2001) of SETs has focused on teaching as an end in itself, and has assumed that a correlation exists between student ratings and student learning. However, findings from intensive investigations of SETs over the last two decades have provided contrary evidence (Carrell & West, 2010; Galbraith, Merrill, & Kline, 2012; Kember & Leung, 2009; Pounder, 2007; Schuck, Gordon, & Buchanan, 2008; Zabaleta, 2007), that is, there were low or even no correlations between SET scores and student learning. More recently, it was found that teachers receiving higher SET scores tended to excel at promoting contemporaneous student achievement (teaching to the test), but harmed the follow-on achievement of their students (Carrell & West, 2010). Carrell and West further concluded that high SET scores were actually associated with lower levels of deep learning. These empirical studies, therefore, have challenged the content validity of SETs.
Studies on SETs have found evidence both for and against the reliability and validity of SETs. On the one hand, multisection studies such as those of Marsh (1984, 1987, 2007), Cohen (1981, 1982), and McKeachie (1996, 1997) have found that there were correlations, though not significant, between SET scores and some measures of student achievement such as a common final examination. SETs, therefore, were considered to be "quite reliable" and "reasonably valid" instruments to evaluate university teaching (Marsh, 1987, p. 369). On the other hand, many studies have challenged the widely accepted validity of SETs (for a review, see Pounder, 2007). Dowell and Neal (1983, p. 462), for example, observed that "student ratings are inaccurate indicators of student learning and they are best regarded as indices of 'consumer satisfaction' rather than teaching effectiveness". The search for evidence of invalidity in the use of SETs continues to this day. Arguments have been advanced that SETs were influenced by student related factors such as their perceptions of teaching or their maturity (e.g., Aleamoni, 1981; Crumbley, Henry, & Kratchman, 2001) or by teacher related factors such as the appearance, likeability, and popularity of the teachers (Boysen, 2008; McNatt, 2010).
As research on teaching evaluation in general, and the use of SETs in particular, was the subject of continual questioning, the underlying problems in evaluation were revealed to be more complex than the simple improve-prove function of SETs or the valid-invalid/reliable-unreliable dichotomy surrounding SETs suggested (Theall, 2010). The main problem, as several researchers (Barrie, 2001; Barrie, Ginns, & Symons, 2008; Edström, 2008; Kolitch & Dean, 1999; Saroyan & Amundsen, 2001) argued, lay in the "fragile foundation" (Darwin, 2010) in the development of the instrument. Teaching evaluation systems, particularly SETs, "reflect a range of variables including implicit and explicit beliefs about what constitutes quality teaching or learning in particular contexts, and hence what is important to be measured. Beliefs about who should do the measurement and what the measurement might mean" (original emphasis) (Barrie, et al., 2008). The theoretical basis for SETs and how they relate to both the established theoretically sound models of teaching and the new ideas emerging from the research on university teaching deserve more attention (Barrie, 2001; Burden, 2008; Darling-Hammond, Wise, & Pease, 1983).
Figure 1: Biggs's 3P model of teaching and student learning
The theoretical understanding of teaching and learning in Biggs's 3P model provides a useful framework for understanding different approaches to teaching evaluation. Three approaches to teaching evaluation were derived, namely, (i) student presage focused, (ii) teaching focused, and (iii) learning focused. Each approach adopts a particular belief about knowledge, perception of teaching, and distinctive focus of teaching evaluation.
Figure 2 captures the understanding of teaching and the focus of teaching evaluation underpinning the student presage approach. The directional flow proceeded from the teaching context, identified mainly as the teacher's content knowledge, to the students and to their learning outcomes. Therefore, in this approach, an evaluation of the teaching system would exclude instruments designed to gain students' feedback on teaching since teachers were understood to have no responsibility if students did not learn. In Arreola's (2007, p. 18) words, a SET was unnecessary and invalid, because "students, by definition, would not have the teacher's content expertise and would thus not be qualified to make any sort of evaluative statements or conclusions concerning the teacher's competence".
Figure 2: The student presage focused approach to teaching evaluation
Examples of this approach came from the very limited number of student rating instruments available, one of which was the Student Instructional Rating System Form designed and used by Michigan State University in the 1960s (Marsh, 1987, p. 381). The Form was designed with 30 items: the first 24 items were concerned with the characteristics of the teachers, and the remaining six items with students' background, including items concerned with student motivation to do the course, or overall GPA (grade point average). Collecting data on students' characteristics, therefore, to some extent, was seen as a way of measuring the effectiveness of teaching.
The 1960s were a turning point with respect to the paradigm shift from the view of teaching as the transmission of information toward a view of teaching as facilitating learning (Theall, 2010). However, this shift did not fully occur until late in the twentieth century. During the transitional stage from the instruction paradigm to the learning paradigm (Barr & Tagg, 1995), SETs were increasingly seen as an important channel for teachers and administrators to gain feedback about the quality of teaching and learning.
Figure 3: Teaching focused approach to teaching evaluation
Accordingly, the teaching focused approach to evaluation defined teaching tasks in terms of the capacity to carry out detailed, and in most cases, pre-determined instructions (Biggs & Moore, 1993). Measures of teaching were associated not only with measures of a teacher's content knowledge, but were extended to measures of a repertoire of specified techniques for delivering the pre-determined content. An instrument to evaluate teaching, such as a SET, was considered valid if it accurately identified deficiencies in a teacher's characteristics and/or teaching skills. The dotted line in Figure 3 indicates a feedback loop from the students to the teaching context.
The teaching focused approach, as described above, took one of two main forms of feedback: (i) feedback from students' observation of the teaching context, and (ii) feedback that was mediated through the students' perception of such a context. The former had its roots in behaviourism, in which learning was seen as a change in observable behaviour that occurred as the result of experience (Eggen & Kauchak, 2006). To make learning happen, the teacher "tells, shows, models, demonstrates, and teaches the skill to be learned" (Palincsar, 1998, p. 347). Driven by behaviourism, SETs, especially in their early days, were usually comprised of items which asked what students thought "of their teachers and how they feel about him as a personality" (Smalzried & Remmers, 1943, p. 363). Support for treating measures of a teacher's personal traits as measures of teaching came from studies that found "statistically significant average correlations between the traits and overall evaluation" (Feldman, 1986, p. 139). As a result, "teachers' predispositions", that is, what the teacher brings to the teaching situation (Abrami, d'Apollonia, & Rosenfield, 2007) were the main focal points in the development of SETs. Students were asked to evaluate teachers' general characteristics that were not necessarily associated with teaching. In the Student Description of Teaching Questionnaire designed by Hildebrand in the 1960s and 1970s, there were items that asked students if the teacher "were friendly toward students" (Item 23) or "varies the speed and tone of his/her voice" (Item 34) (Marsh, 1987, p. 387).
The second form of student feedback, which was mediated through students' perceptions of the teaching context, came from the realisation that student perceptions would determine their approaches to learning, and affect their learning outcomes (Marton, Hounsell, & Entwistle, 1984; Ramsden, 2003; Trigwell & Prosser, 1996). Thus, from a phenomenographic perspective, collecting student feedback on teaching needed to be done through the investigation of "student's perception of the usefulness of teaching behaviours in helping them learn" (Barrie, 2001, p. 11). Barrie further argued that SET items such as "The instructor clearly stated the objectives of the course", or "The lecturer spoke clearly" (item bank, cited in Lally & Myhill, 1994, p. 80) could be paraphrased as "The objectives of the course were clear to me" and "I found the lecturer's speech easy to understand" (emphasis added), to reflect students' interpretations of what was said. Although in Barrie's proposed items students were asked to evaluate how useful their teachers' teaching was for them, the subject of the evaluation remained what the teacher does.
An approach to teaching evaluation with a focus on teaching presage factors reflected an assumption that standards of professional knowledge and practice could be developed and assessed, and that their enforcement would ensure competent teaching and subsequently lead to high quality learning outcomes (Darling-Hammond, et al., 1983). Indeed, teaching skills are necessary for teachers to be successful in teaching. However, when teaching is understood as making learning possible (Ramsden, 2003), being successful in teaching needs to take into account students' learning. As a result, an approach to student evaluation of teaching which places the onus on "surface aspects" of teaching (Pratt, 1997) needs to shift to the evaluation of the "substance" of teaching, that is, the students' learning outcomes that were informed by their approaches to learning.
Figure 4: Learning focused approach to teaching evaluation
In focusing on what the students should be able to do as a result of the teaching, the responsibility for learning did not reside in the students alone, nor in the teachers and their teaching alone, but in all involved (Biggs, 1993). Restating the need for a paradigm shift, Barr and Tagg (1995, pp. 14-15) strongly advocated "learning paradigm" institutions which took responsibility for learning in order to produce learning (original emphasis). They further argued that students, teachers and the institution all have to take responsibility for student learning, even though none is in complete control of all the variables. Evaluation of teaching, therefore, instead of focusing on the act of teaching, should focus on student learning as the "consequences of those actions" (Abrami, et al., 2007).
Examples of SETs with a focus on student learning included the Student Assessment of Teaching and Learning (SATL) (Ellett, Loup, Culross, McMullen, & Rugutt, 1997), and the National Survey of Student Engagement (Kuh & Hu, 2001) in America, or its Australasian version (ACER, 2011), in which students' approaches to learning and student learning outcomes were measured as indicators of effective teaching. The National Survey of Student Engagement, in particular, was designed with items that map into seven outcome measures, one of which is the participation in higher order forms of thinking or the development of general forms of individual and social development. Recently, Kember and Leung (2009) developed a SET which was grounded in principles of excellent teaching, and was designed to identify "relative strengths and weaknesses in teaching so that appropriate remedial action can be identified" (p. 352). Accordingly, several items in their SET reflected a change in the understanding of teaching, for example, "I found the course challenging", or "I have become more willing to consider different point of views" (Kember & Leung, 2009, p. 348). These proposed SET items suggested a reconsideration of placing the students and their learning at the centre of teaching evaluation. Although the construction of Kember and Leung's (2009) 49 item SET had not yet moved fully away from a teaching presage focused approach, it signalled a transition to evaluating teaching in a way that moved beyond "what the teacher does" to "what the student does".
Figure 4 details an approach to teaching evaluation with a focus on student learning. The dotted lines represent the flow of student feedback on teaching: students' approaches to learning and learning outcomes feed back to the student presage and the teaching presage factors. The student evaluation of teaching instrument, therefore, became the student evaluation of learning, providing feedback for teachers, administrators and students themselves about what each party needs to do to achieve the intended learning outcomes.
|Components||Student presage focused||Teaching presage focused||Student learning focused|
|Belief about knowledge||External to the students||External to the students||Internal to the students|
|Perception of teaching||Imparting information||Transmitting teachers' understanding||Facilitating critical thinking, and enabling conceptual change|
|Purpose of the evaluation||Not necessary or punitive (if used at all)||Accountability and improvement of teaching||Enhancing learning and learning outcomes|
|Focus of the evaluation||What the student is||What the teacher is; what the teacher does||What the student does; what the student has achieved|
ACER (2011). Australasian survey of student engagement. http://research.acer.edu.au/ausse/
Akerlind, G. S. (2004). A new dimension to understanding university teaching. Teaching in Higher Education, 9(3), 363-375. http://dx.doi.org/10.1080/1356251042000216679
Aleamoni, L. M. (1981). Student ratings of instruction. In J. Millman (Ed.), Handbook of teacher education. Beverly Hills, CA: SAGE Publications.
Arreola, R. A. (2007). Developing a comprehensive faculty evaluation system: A guide to designing, building and operating large-scale faculty evaluation system. San Francisco, CA: Anker Publishing.
Barr, R. B., & Tagg, J. (1995). From teaching to learning: A new paradigm for undergraduate education. Change, 27(6), 12-26. http://dx.doi.org/10.1080/00091383.1995.10544672
Barrie, S. C. (2001). Reflections on student evaluation of teaching: Alignment and congruence in a changing context. In E. Santhanam (Ed.), Student feedback on teaching: Reflections and projections (pp. 1-16). Perth: The University of Western Australia.
Barrie, S. C., Ginns, P., & Symons, R. (2008). Student surveys on teaching and learning: Final report. Sydney: Australian Teaching and Learning Council. http://www.itl.usyd.edu.au/cms/files/Student_Surveys_on_Teaching_and_Learning.pdf
Biggs, J. (1987). Student approaches to learning and studying. Melbourne, Victoria: Australian Council for Educational Research.
Biggs, J. (1993). From theory to practice: A cognitive systems approach. Higher Education Research & Development, 12(1), 73-85. http://dx.doi.org/10.1080/0729436930120107
Biggs, J., & Moore, P. J. (1993). The process of learning (3rd ed.). New York: Prentice Hall.
Biggs, J., & Tang, C. (2007). Teaching for quality learning at university (3rd ed.). London, England: SRHE and Open University Press.
Boysen, G. A. (2008). Revenge and student evaluations of teaching. Teaching of Psychology, 35(3), 218-222. http://dx.doi.org/10.1080/00986280802181533
Brown, J. S., Collins, A., & Duguid, P. (1989). Situated cognition and the culture of learning. Educational Researcher, 18(1), 32-42. http://dx.doi.org/10.3102/0013189X018001032
Burden, P. (2008). Does the use of end of semester evaluation forms represent teachers' views of teaching in a tertiary education context in Japan? Teaching and Teacher Education, 24(6), 1463-1475. http://dx.doi.org/10.1016/j.tate.2007.11.012
Carrell, S. E., & West, J. E. (2010). Does professor quality matter? Evidence from random assignment of students to professors. Journal of Political Economy, 118(3), 409-432. http://www.jstor.org/stable/full/10.1086/653808
Cohen, P. A. (1981). Student ratings of instruction and student achievement: A meta-analysis of multisection validity studies. Review of Educational Research, 51(3), 281-309. http://dx.doi.org/10.3102/00346543051003281
Cohen, P. A. (1982). Validity of student ratings in psychology courses: A research synthesis. Teaching of Psychology, 9(2), 78-82. http://dx.doi.org/10.1207/s15328023top0902_3
Crumbley, D. L., Henry, B., & Kratchman, S. (2001). Students' perceptions of the evaluation of college teaching. Quality Assurance in Education, 9(4), 197-207. http://dx.doi.org/10.1108/EUM0000000006158
Darling-Hammond, L., Wise, A. E., & Pease, S. R. (1983). Teacher evaluation in the organizational context: A review of the literature. Review of Educational Research, 53(3), 285-328. http://dx.doi.org/10.3102/00346543053003285
Darwin, S. (2010). Exploring critical conceptions of student led evaluation in Australian higher education. In M. Devlin, J. Nagy & A. Lichtenberg (Eds.), Research and Development in Higher Education: Reshaping Higher Education, 33, 203-212. Melbourne, 6-9 July. http://www.herdsa.org.au/wp-content/uploads/conference/2010/papers/HERDSA2010_Darwin_S.pdf
Dowell, D. A., & Neal, J. A. (1983). The validity and accuracy of student ratings of instruction: A reply to Peter A. Cohen. The Journal of Higher Education, 54(4), 459-463. http://www.jstor.org/stable/1981908
Dunkin, M. J., & Biddle, B. J. (1974). The study of teaching. NY: Holt, Rinehart & Winston.
Edström, K. (2008). Doing course evaluation as if learning matters most. Higher Education Research and Development, 27(2), 95-106. http://dx.doi.org/10.1080/07294360701805234
Eggen, P. D., & Kauchak, D. P. (2006). Strategies and models for teachers: Teaching content and thinking skills (5th ed.). Boston, MA: Pearson Education.
Ellett, C., Loup, K., Culross, R., McMullen, J., & Rugutt, J. (1997). Assessing enhancement of learning, personal learning environment, and student efficacy: Alternatives to traditional faculty evaluation in higher education. Journal of Personnel Evaluation in Education, 11(2), 167-192. http://dx.doi.org/10.1023/A:1007989320210
Entwistle, N., & Walker, J. (2002). Strategic alertness and expanded awareness in sophisticated conceptions of teaching. In N. Hativa & P. Goodyear (Eds.), Teacher thinking, belief and knowledge in higher education. Dordrecht: Kluwer Academic.
Feldman, K. A. (1986). The perceived instructional-effectiveness of college-teachers as related to their personality and attitudinal characteristics: A review and synthesis. Research in Higher Education, 24(2), 139-213. http://dx.doi.org/10.1007/BF00991885
Fox, D. (1983). Personal theories of teaching. Studies in Higher Education, 8(2), 151-163. http://dx.doi.org/10.1080/03075078312331379014
Galbraith, C., Merrill, G., & Kline, D. (2012). Are student evaluations of teaching effectiveness valid for measuring student learning outcomes in business related classes? A neural network and Bayesian analyses. Research in Higher Education, 53(3), 353-374. http://dx.doi.org/10.1007/s11162-011-9229-0
Kember, D. (1997). A reconceptualisation of the research into university academics' conceptions of teaching. Learning and Instruction, 7(3), 255-275. http://dx.doi.org/10.1016/S0959-4752(96)00028-X
Kember, D., & Leung, D. Y. P. (2009). Development of a questionnaire for assessing students' perceptions of the teaching and learning environment and its use in quality assurance. Learning Environments Research, 12(1), 15-29. http://dx.doi.org/10.1007/s10984-008-9050-7
Kember, D., & McNaught, C. (2007). Enhancing university teaching: Lessons from research into award-winning teachers. Abingdon, OX: Routledge.
Kolitch, E., & Dean, A. V. (1999). Student ratings of instruction in the USA: Hidden assumptions and missing conceptions about 'good' teaching. Studies in Higher Education, 24(1), 27-42. http://dx.doi.org/10.1080/03075079912331380128
Kuh, G. D., & Hu, S. (2001). The effects of student-faculty interaction in the 1990s. Review of Higher Education, 24(3), 309-332. http://dx.doi.org/10.1353/rhe.2001.0005
Lally, M., & Myhill, M. (1994). Teaching quality: The development of valid instruments of assessment. Canberra: Australian Government Publishing Service.
Marsh, H. W. (1984). Students' evaluations of university teaching: Dimensionality, reliability, validity, potential biases, and utility. Journal of Educational Psychology, 76(5), 707-754. http://psycnet.apa.org/doi/10.1037/0022-0663.76.5.707
Marsh, H. W. (1987). Students' evaluations of university teaching: Research findings, methodological issues, and directions for future research. International Journal of Educational Research, 11(3), 253-388. http://files.eric.ed.gov/fulltext/ED338629.pdf
Marsh, H. W. (2007). Students' evaluations of university teaching: Dimensionality, reliability, validity, potential biases and usefulness. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 319-383). Springer Netherlands.
Marton, F., Hounsell, D., & Entwistle, N. J. (Eds.). (1984). The experience of learning. Edinburgh: Scottish Academic Press.
McKeachie, W. J. (1996). Student ratings of teaching. In J. England, P. Hutchings & W. J. McKeachie (Eds.), The professional evaluation of teaching. American Council of Learned Societies Occasional Paper No. 33. http://archives.acls.org/op/33_Professonal_Evaluation_of_Teaching.htm
McKeachie, W. J. (1997). Student ratings: the validity of use. American Psychologist, 52(11), 1218-1225. http://psycnet.apa.org/doi/10.1037/0003-066X.52.11.1218
McNatt, D. B. (2010). Negative reputation and biased student evaluations of teaching: Longitudinal results from a naturally occurring experiment. The Academy of Management Learning and Education, 9(2), 225-242. http://amle.aom.org/content/9/2/225.short
Mortelmans, D., & Spooren, P. (2009). A revalidation of the SET 37 questionnaire for student evaluations of teaching. Educational Studies, 35(5), 547-552. http://dx.doi.org/10.1080/03055690902880299
Palincsar, A. S. (1998). Social constructivist perspectives on teaching and learning. Annual Review of Psychology, 49, 345-375. http://dx.doi.org/10.1146/annurev.psych.49.1.345
Pounder, J. S. (2007). Is student evaluation of teaching worthwhile?: An analytical framework for answering the question. Quality Assurance in Education, 15(2), 178-191. http://dx.doi.org/10.1108/09684880710748938
Pratt, D. D. (1997). Reconceptualizing the evaluation of teaching in higher education. Higher Education, 34(1), 23-44. http://dx.doi.org/10.1023/A:1003046127941
Prosser, M., & Trigwell, K. (1999). Understanding learning and teaching: The experience in higher education. The Society for Research in Higher Education & Open University Press.
Ramsden, P. (1992). Learning to teach in higher education. London: Routledge.
Ramsden, P. (2003). Learning to teach in higher education (2nd ed.). London: RoutledgeFalmer.
Saroyan, A., & Amundsen, C. (2001). Evaluating university teaching: Time to take stock. Assessment & Evaluation in Higher Education, 26(4), 341-353. http://dx.doi.org/10.1080/02602930120063493
Schuck, S., Gordon, S., & Buchanan, J. (2008). What are we missing here? Problematising wisdoms on teaching quality and professionalism in higher education. Teaching in Higher Education, 13(5), 537-547. http://dx.doi.org/10.1080/13562510802334772
Smalzried, N. T., & Remmers, H. H. (1943). A factor analysis of the Purdue rating scale for instructors. Journal of Educational Psychology, 34(6), 363-367. http://psycnet.apa.org/doi/10.1037/h0060532
Theall, M. (2010). Evaluating teaching: From reliability to accountability. New Directions for Teaching and Learning, 2010(123), 85-95. http://dx.doi.org/10.1002/tl.412
Theall, M., & Franklin, J. (2001). Looking for bias in all the wrong places: A search for truth or a witch hunt in student ratings of instruction? New Directions for Institutional Research, 2001(109), 45-56. http://dx.doi.org/10.1002/ir.3
Trigwell, K., & Prosser, M. (1996). Changing approaches to teaching: A relational perspective. Studies in Higher Education, 21(3), 275-284. http://dx.doi.org/10.1080/03075079612331381211
Zabaleta, F. (2007). The use and misuse of student evaluations of teaching. Teaching in Higher Education, 12(1), 55-76. http://dx.doi.org/10.1080/13562510601102131
Author: Dr Nga D Tran is currently working at Haiphong Private University, Vietnam. She was awarded her PhD from the University of Tasmania, Australia. Her research interests include conceptions of teaching and learning, teaching evaluation, and quality assurance.
Please cite as: Tran, N. D. (2015). Reconceptualisation of approaches to teaching evaluation in higher education. Issues in Educational Research, 25(1), 50-61. http://www.iier.org.au/iier25/tran.html