Published

2021-07-12

Bayesian Multi-Faceted TRI Models for Measuring Professor's Performance in the Classroom

Un modelo TRI de múltiples facetas bayesiano para la evaluación del desempeño docente en el aula

DOI:

https://doi.org/10.15446/rce.v44n2.89661

Keywords:

Multi-faceted IRT model, professor performance, Bayesian inference (en)
Modelo TRI de múltiples facetas, desempeño del profesor, inferencia bayesiana (es)

Authors

Karen Rosana Cordoba Perozo
Alvaro Mauricio Montenegro Diaz

Abstract

Evaluations of professor performance rest on two premises: that students learn more from highly qualified professors, and that students directly observe professor performance in the classroom. However, many studies question the methodologies used for such measurements, mainly because averages of categorical responses make little statistical sense. In this paper, we propose Bayesian multi-faceted item response theory (IRT) models to measure teaching performance. The basic model accounts for effects associated with the severity of the students responding to the survey and with the courses being evaluated. It is applied to a data set from a survey on the perception of professor performance that the Faculty of Science of the Universidad Nacional de Colombia administered to its students. The professor scores produced by the model are real-valued, so the common statistics used in professor evaluation can be computed from them in a mathematically consistent way. Some of these statistics are shown to illustrate the usefulness of the model.
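The abstract does not state the model equations, so the following is only a rough illustration of what a multi-faceted IRT model of this general kind computes. It assumes a many-facet rating scale form in which the adjacent-category log-odds are the professor's latent score minus an item difficulty, a student severity, a course effect, and a category threshold. The function names, parameter names, and numeric values are hypothetical, and the sketch covers only the response probabilities, not the Bayesian estimation.

```python
import math


def category_probabilities(theta, item_difficulty, student_severity,
                           course_effect, thresholds):
    """Return P(Y = k) for k = 0..K under an assumed many-facet rating scale model.

    theta            -- latent professor score (hypothetical name)
    item_difficulty  -- difficulty of the survey item
    student_severity -- severity of the responding student
    course_effect    -- effect of the evaluated course
    thresholds       -- K category thresholds tau_1..tau_K
    """
    # Linear predictor shared by every adjacent-category logit.
    eta = theta - item_difficulty - student_severity - course_effect

    # Unnormalised log-probabilities: category 0 is the reference (0.0),
    # category k adds (eta - tau_k) to the value for category k - 1.
    log_weights = [0.0]
    for tau in thresholds:
        log_weights.append(log_weights[-1] + eta - tau)

    # Normalise with a max-shift so the exponentials cannot overflow.
    shift = max(log_weights)
    weights = [math.exp(w - shift) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probabilities):
    """Expected response category given the category probabilities."""
    return sum(k * p for k, p in enumerate(probabilities))


if __name__ == "__main__":
    # Same professor, item, and course; only the student's severity differs.
    taus = [-1.0, 0.0, 1.0]  # illustrative thresholds for a 4-category item
    lenient = category_probabilities(1.0, 0.0, -1.0, 0.0, taus)
    severe = category_probabilities(1.0, 0.0, 1.0, 0.0, taus)
    print("lenient student:", [round(p, 3) for p in lenient])
    print("severe student: ", [round(p, 3) for p in severe])
```

Under this assumed form, two students rating the same professor on the same item can give systematically different responses purely because of their severity, which is why a raw average of the categorical responses mixes the professor's score with student and course effects.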

How to Cite

APA

Cordoba Perozo, K. R. and Montenegro Diaz, A. M. (2021). Bayesian Multi-Faceted TRI Models for Measuring Professor’s Performance in the Classroom. Revista Colombiana de Estadística, 44(2), 385–412. https://doi.org/10.15446/rce.v44n2.89661

Chicago

Cordoba Perozo, Karen Rosana, and Alvaro Mauricio Montenegro Diaz. 2021. “Bayesian Multi-Faceted TRI Models for Measuring Professor’s Performance in the Classroom”. Revista Colombiana De Estadística 44 (2):385-412. https://doi.org/10.15446/rce.v44n2.89661.
