ABSTRACT
In this paper, we evaluate the complete undergraduate coenrollment network over a decade of education at a large American public university. We describe the network's properties, showing that the coenrollment networks follow power-law degree distributions similar to those of many other large-scale networks; that they exhibit strong performance-based assortativity; and that network-based features can significantly improve GPA-based predictors of student performance. We then implement a network-based, multi-view classification model to predict students' final course grades. In particular, we adapt a structural modeling approach from [19, 34], in which we model the university-wide undergraduate coenrollment network as an undirected graph. We compare our predictor against traditional methods for grade prediction in undergraduate university courses, and show that a multi-view ensembling approach outperforms both prior "flat" and network-based models across several classification metrics. These findings demonstrate the usefulness of combining diverse approaches in models of student success, and identify specific network-based modeling strategies that are likely to be most effective for grade prediction.
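As a rough illustration of the descriptive quantities named in the abstract, the following minimal sketch builds an undirected coenrollment graph from toy enrollment records and computes a degree histogram and a GPA assortativity coefficient. It is not the authors' pipeline. The edge rule (two students are linked if they share at least one course section), the function names, and the sample data are all assumptions made for illustration; the assortativity measure is the Pearson correlation of a scalar attribute across edge endpoints.

```python
from collections import defaultdict
from itertools import combinations


def build_coenrollment_graph(enrollments):
    """Return {student: set(neighbors)} from (student, section) pairs.

    Assumed edge rule: two students are adjacent if they share a section.
    """
    roster = defaultdict(set)
    for student, section in enrollments:
        roster[section].add(student)
    graph = {}
    for students in roster.values():
        for s in students:
            graph.setdefault(s, set())  # keep students with no classmates
        for a, b in combinations(sorted(students), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_histogram(graph):
    """Map each degree to the number of students with that degree."""
    counts = defaultdict(int)
    for neighbors in graph.values():
        counts[len(neighbors)] += 1
    return dict(counts)


def attribute_assortativity(graph, value):
    """Pearson correlation of a numeric attribute across edge endpoints.

    Each undirected edge is counted once in each direction, so both
    endpoint samples share the same mean and variance.
    """
    xs, ys = [], []
    for a, neighbors in graph.items():
        for b in neighbors:
            xs.append(value[a])
            ys.append(value[b])
    n = len(xs)
    if n == 0:
        return float("nan")
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var


# Hypothetical toy data: four students, three course sections.
enrollments = [
    ("s1", "MATH101"), ("s2", "MATH101"),
    ("s2", "ENG125"), ("s3", "ENG125"),
    ("s3", "CHEM110"), ("s4", "CHEM110"),
]
gpa = {"s1": 3.8, "s2": 3.6, "s3": 2.4, "s4": 2.2}

graph = build_coenrollment_graph(enrollments)
hist = degree_histogram(graph)
r = attribute_assortativity(graph, gpa)
print(hist, round(r, 3))
```

A positive coefficient means students tend to share sections with peers of similar GPA, which is the pattern the abstract calls performance-based assortativity. On real data, the degree histogram would be plotted on log-log axes to inspect the tail of the distribution.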
REFERENCES
- Paul Adams. 2006. Exploring social constructivism: theories and practicalities. Education 3--13 34, 3 (Oct. 2006), 243--257.
- J S Antrobus, R Dobbelaer, and S Salzinger. 1988. Social networks and college success, or grade point average and the friendly connection. Social networks of children, adolescents, and college students 227 (1988), 260.
- Sara Baker, Adalbert Mayer, and Steven L Puller. 2011. Do more diverse environments increase the diversity of subsequent interaction? Evidence from random dorm assignment. Econ. Lett. 110, 2 (2011), 110--112.
- Timothy T Baldwin, Michael D Bedell, and Jonathan L Johnson. 1997. The social fabric of a team-based MBA program: Network effects on student satisfaction and performance. Acad. Manage. J. 40, 6 (1997), 1369--1397.
- L Becchetti, C Castillo, D Donato, and others. 2008. Link analysis for web spam detection. ACM Transactions on (2008).
- James Bennett, Stan Lanning, and others. 2007. The Netflix prize. In Proceedings of KDD cup and workshop, Vol. 2007. brettb.net, 35.
- Smriti Bhagat, Irina Rozenbaum, and Graham Cormode. 2007. Applying Link-based Classification to Label Blogs. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis (WebKDD/SNA-KDD '07). ACM, New York, NY, USA, 92--101.
- Susan Biancani and Daniel McFarland. 2013. Social Networks Research in Higher Education. Educ. Stud. 4 (2013), 85--126.
- Giorgio Brunello, Maria de Paola, and Vincenzo Scoppa. 2010. Peer Effects in Higher Education: Does the Field of Study Matter? Econ. Inq. 48, 3 (July 2010), 621--634.
- Soumen Chakrabarti, Byron Dom, and Piotr Indyk. 1998. Enhanced Hypertext Categorization Using Hyperlinks. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD '98). ACM, New York, NY, USA, 307--318.
- Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16). ACM, New York, NY, USA, 785--794.
- Corinna Cortes, Daryl Pregibon, and Chris Volinsky. 2001. Communities of Interest. In Advances in Intelligent Data Analysis. Springer, Berlin, Heidelberg, 105--114.
- Thomas G Dietterich. 2000. Ensemble Methods in Machine Learning. In Multiple Classifier Systems. Springer, Berlin, Heidelberg, 1--15.
- Pedro Domingos and Matt Richardson. 2001. Mining the Network Value of Customers. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '01). ACM, New York, NY, USA, 57--66.
- Tom Fawcett and Foster Provost. 1997. Adaptive Fraud Detection. Data Min. Knowl. Discov. 1, 3 (Sept. 1997), 291--316.
- Gigi Foster. 2006. It's not your peers, and it's not your friends: Some progress toward understanding the educational peer effect mechanism. J. Public Econ. 90, 8--9 (2006), 1455--1475.
- Nir Friedman, Lise Getoor, Daphne Koller, and Avi Pfeffer. 1999. Learning probabilistic relational models. In IJCAI, Vol. 99. robotics.stanford.edu, 1300--1309.
- Dragan Gašević, Amal Zouaq, and Robert Janzen. 2013. "Choose your classmates, your GPA is at stake!" The association of cross-class social ties and academic performance. Am. Behav. Sci. 57, 10 (2013), 1460--1479.
- Lise Getoor. 2005. Link-based Classification. In Advanced Methods for Knowledge Discovery from Complex Data. Springer London, 189--207.
- Lise Getoor and Christopher P Diehl. 2005. Link Mining: A Survey. SIGKDD Explor. Newsl. 7, 2 (Dec. 2005), 3--12.
- Lise Getoor and Lilyana Mihalkova. 2011. Learning Statistical Models from Relational Data. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11). ACM, New York, NY, USA, 1195--1198.
- A F Hadwin and Sanna Järvelä. 2011. Introduction to a special issue on social aspects of self-regulated learning: Where social and self meet in the strategic regulation of learning. Teach. Coll. Rec. 113, 2 (2011), 235--239.
- David J Hand and Robert J Till. 2001. A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Mach. Learn. 45, 2 (Nov. 2001), 171--186.
- Shawndra Hill, Foster Provost, and Chris Volinsky. 2006. Network-Based Marketing: Identifying Likely Adopters via Consumer Networks. Stat. Sci. 21, 2 (2006), 256--276.
- Jessica Hoel, Jeffrey Parker, and Jon Rivenburg. 2005. Peer effects: do first-year classmates, roommates, and dormmates affect students' academic success. In Higher education data sharing consortium winter conference, Santa Fe, NM. Citeseer.
- Feng Jiang and Wentao Li. 2017. Who Will Be the Next to Drop Out? Anticipating Dropouts in MOOCs with Multi-View Features. International Journal of Performability Engineering 13, 2 (2017).
- Yuheng Jiang and Lukasz Golab. n.d. On Competition for Undergraduate Co-op Placements: A Graph Mining Approach.
- David W Johnson and Roger T Johnson. 2009. An Educational Psychology Success Story: Social Interdependence Theory and Cooperative Learning. Educ. Res. 38, 5 (June 2009), 365--379.
- Daphne Koller. 1999. Probabilistic Relational Models. In Inductive Logic Programming. Springer, Berlin, Heidelberg, 3--13.
- Gueorgi Kossinets and Duncan J Watts. 2006. Empirical analysis of an evolving social network. Science 311, 5757 (Jan. 2006), 88--90.
- J Lafferty, A McCallum, and F Pereira. n.d. Conditional random fields: Probabilistic models for segmenting and labeling sequence data.
- Conrad Lee, Thomas Scherngell, and Michael J Barber. 2011. Investigating an online social network using spatial interaction models. Soc. Networks 33, 2 (2011), 129--133.
- W Li, M Gao, H Li, Q Xiong, J Wen, and Z Wu. 2016. Dropout prediction in MOOCs using behavior features and multi-view semi-supervised learning. In 2016 International Joint Conference on Neural Networks (IJCNN). ieeexplore.ieee.org, 3130--3137.
- Qing Lu and Lise Getoor. 2003. Link-based classification. In ICML, Vol. 3. www.aaai.org, 496--503.
- Q Lu and L Getoor. 2003. Link-based classification using labeled and unlabeled data. In ICML 2003 workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining.
- David S Lyle. 2007. Estimating and Interpreting Peer and Role Model Effects from Randomly Assigned Social Groups at West Point. Rev. Econ. Stat. 89, 2 (April 2007), 289--299.
- Collin F Lynch, Tiffany Barnes, Linting Xue, and Niki Gitinabard. 2017. Graph-based Educational Data Mining (G-EDM 2017). In Proceedings of the 10th International Conference on Educational Data Mining, Xiangen Hu, Tiffany Barnes, Arnon Hershkovitz, and Luc Paquette (Eds.). 472--473.
- Patrick J McEwan and Kristen A Soderberg. 2006. Roommate Effects on Grades: Evidence from First-Year Housing Assignments. Res. High. Educ. 47, 3 (May 2006), 347--370.
- M Newman. 2003. The Structure and Function of Complex Networks. SIAM Rev. 45, 2 (Jan. 2003), 167--256.
- M E Newman. 2001. The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. U. S. A. 98, 2 (Jan. 2001), 404--409.
- M E J Newman. 2003. Mixing patterns in networks. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 67, 2 Pt 2 (Feb. 2003), 026126.
- Hyo-Jung Oh, Sung Hyon Myaeng, and Mann-Ho Lee. 2000. A Practical Hypertext Catergorization Method Using Links and Incrementally Available Class Information. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '00). ACM, New York, NY, USA, 264--271.
- Shashank Pandit, Duen Horng Chau, Samuel Wang, and Christos Faloutsos. 2007. Netprobe: A Fast and Scalable System for Fraud Detection in Online Auction Networks. In Proceedings of the 16th International Conference on World Wide Web (WWW '07). ACM, New York, NY, USA, 201--210.
- Foster Provost, Brian Dalessandro, Rod Hook, Xiaohan Zhang, and Alan Murray. 2009. Audience Selection for On-line Brand Advertising: Privacy-friendly Social Network Targeting. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09). ACM, New York, NY, USA, 707--716.
- S Redner. 1998. How popular is your paper? An empirical study of the citation distribution. Eur. Phys. J. B 4, 2 (July 1998), 131--134.
- Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Why should i trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. dl.acm.org, 1135--1144.
- M Riedmiller and H Braun. 1993. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In IEEE International Conference on Neural Networks. ieeexplore.ieee.org, 586--591 vol.1.
- Bruce Sacerdote. 2000. Peer Effects with Random Assignment: Results for Dartmouth Roommates. (Jan. 2000).
- Sarah Schrire. 2006. Knowledge building in asynchronous discussion groups: Going beyond quantitative analysis. Comput. Educ. 46, 1 (Jan. 2006), 49--70.
- John J Siegfried and Michael A Gleason. 2006. Academic roommate peer effects. Unpublished manuscript, Vanderbilt Univ., Nashville (2006).
- Ralph Stinebrickner and Todd R Stinebrickner. 2006. What can be learned about peer effects using college roommates? Evidence from new survey data and students from disadvantaged backgrounds. J. Public Econ. 90, 8--9 (2006), 1435--1454.
- Shiliang Sun. 2013. A survey of multi-view machine learning. Neural Comput. Appl. 23, 7--8 (Dec. 2013), 2031--2038.
- Lei Tang and Huan Liu. 2011. Leveraging social media networks for classification. Data Min. Knowl. Discov. 23, 3 (Nov. 2011), 447--478.
- Ben Taskar, Pieter Abbeel, and Daphne Koller. 2002. Discriminative Probabilistic Models for Relational Data. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI'02). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 485--492.
- Andreas Töscher, Michael Jahrer, and Robert M Bell. 2009. The bigchaos solution to the netflix grand prize. Netflix prize documentation (2009), 1--52.
- A Traud, E Kelsic, P Mucha, and M Porter. 2011. Comparing Community Structure to Characteristics in Online Collegiate Social Networks. SIAM Rev. 53, 3 (Jan. 2011), 526--543.
- Anneleen Van Assche, Celine Vens, Hendrik Blockeel, and Sašo Dzeroski. 2004. A random forest approach to relational learning. In Workshop on Statistical Relational Learning.
- Jacob Whitehill, Kiran Mohan, Daniel Seaton, Yigal Rosen, and Dustin Tingley. 2017. Delving Deeper into MOOC Student Dropout Prediction. (Feb. 2017). arXiv:cs.AI/1702.06404
- Andreas Wimmer and Kevin Lewis. 2010. Beyond and Below Racial Homophily: ERG Models of a Friendship Network Documented on Facebook. Am. J. Sociol. 116, 2 (2010), 583--642.
- Gordon Winston and David Zimmerman. 2004. Peer effects in higher education. In College choices: The economics of where to go, when to go, and how to pay for it. University of Chicago Press, 395--424.
- David H Wolpert. 1992. Stacked generalization. Neural Netw. 5, 2 (1992), 241--259.
- Chang Xu, Dacheng Tao, and Chao Xu. 2013. A Survey on Multi-view Learning. (April 2013). arXiv:cs.LG/1304.5634
- Zhijie Xu and Shiliang Sun. 2010. An Algorithm on Multi-View Adaboost. In Neural Information Processing. Theory and Algorithms (Lecture Notes in Computer Science). Springer, Berlin, Heidelberg, 355--362.
- Tong Zhang, Alexandrin Popescul, and Byron Dom. 2006. Linear Prediction Models with Graph Regularization for Web-page Categorization. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06). ACM, New York, NY, USA, 821--826.
- David J Zimmerman. 2003. Peer Effects in Academic Outcomes: Evidence from a Natural Experiment. Rev. Econ. Stat. 85, 1 (Feb. 2003), 9--23.
Index Terms
- Coenrollment networks and their relationship to grades in undergraduate education