DOI: 10.1145/3539618.3592073
short-paper

Towards Robust Knowledge Tracing Models via k-Sparse Attention

Published: 18 July 2023

Abstract

Knowledge tracing (KT) is the problem of predicting students' future performance based on their historical interaction sequences. Because attention mechanisms capture long-term contextual dependencies well, they have become an essential component of many deep learning based KT (DLKT) models. Despite the impressive performance of these attentional DLKT models, many of them are prone to overfitting, especially on small-scale educational datasets. In this paper, we therefore propose sparseKT, a simple yet effective framework to improve the robustness and generalization of attention-based DLKT approaches. Specifically, we incorporate a k-selection module that picks only the items with the highest attention scores. We propose two sparsification heuristics: (1) soft-thresholding sparse attention and (2) top-K sparse attention. We show that sparseKT helps attentional KT models discard irrelevant student interactions and improves predictive performance when compared to 11 state-of-the-art KT models on three publicly available real-world educational datasets. To encourage reproducible research, we make our data and code publicly available at https://github.com/pykt-team/pykt-toolkit.
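
The abstract names the two sparsification heuristics without defining them, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation (the repository linked above is the reference for that). The function names, the `mass` parameter, and the example scores are hypothetical. Top-K selection is shown as masking all but the k largest raw scores before the softmax. The thresholding variant is assumed here to keep the smallest set of highest-weight items whose cumulative attention mass reaches a threshold, then renormalize; the paper's exact rule may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax; entries set to -inf get weight 0."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def top_k_sparse_attention(scores, k):
    """Keep the k largest scores in each row, then softmax over the survivors.

    Ties with the k-th largest score are all kept, so a row may retain
    more than k entries when scores repeat.
    """
    k = min(k, scores.shape[-1])
    # The k-th largest score in each row is the cut-off value.
    kth = np.sort(scores, axis=-1)[..., -k][..., None]
    masked = np.where(scores >= kth, scores, -np.inf)
    return softmax(masked, axis=-1)

def cumulative_threshold_attention(scores, mass=0.9):
    """Assumed thresholding rule: keep the highest-weight items until their
    cumulative softmax mass reaches `mass`, zero the rest, renormalize."""
    weights = softmax(scores, axis=-1)
    order = np.argsort(-weights, axis=-1)           # descending by weight
    sorted_w = np.take_along_axis(weights, order, axis=-1)
    cum = np.cumsum(sorted_w, axis=-1)
    # Keep an item if the mass accumulated *before* it is still below `mass`.
    keep_sorted = (cum - sorted_w) < mass
    keep = np.zeros_like(keep_sorted)
    np.put_along_axis(keep, order, keep_sorted, axis=-1)
    pruned = np.where(keep, weights, 0.0)
    return pruned / pruned.sum(axis=-1, keepdims=True)

# One query attending over four past interactions (made-up scores).
example_scores = np.array([[2.0, 1.0, 0.5, -1.0]])
top2_weights = top_k_sparse_attention(example_scores, k=2)
thresholded_weights = cumulative_threshold_attention(example_scores, mass=0.8)
```

In both variants the attention weights of the discarded interactions are exactly zero, which is the property the abstract relies on when it describes dropping irrelevant student interactions. In a real KT model the causal mask that hides future interactions would be applied to `scores` before either selection step.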

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2023
3567 pages
ISBN:9781450394086
DOI:10.1145/3539618

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ai in education
  2. deep learning
  3. knowledge tracing
  4. sparse attention
  5. student modeling

Qualifiers

  • Short-paper

Conference

SIGIR '23

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 234
  • Downloads (last 6 weeks): 31

Reflects downloads up to 05 Mar 2025

Cited By

  • (2025) csKT: Addressing cold-start problem in knowledge tracing via kernel bias and cone attention. Expert Systems with Applications 266 (125988). DOI: 10.1016/j.eswa.2024.125988. Online publication date: Mar-2025
  • (2025) A prompt-driven framework for multi-domain knowledge tracing. Machine Learning 114:4. DOI: 10.1007/s10994-024-06660-6. Online publication date: 17-Feb-2025
  • (2024) Enhancing length generalization for attention based knowledge tracing models with linear biases. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (5918-5926). DOI: 10.24963/ijcai.2024/654. Online publication date: 3-Aug-2024
  • (2024) Question Difficulty Consistent Knowledge Tracing. Proceedings of the ACM Web Conference 2024 (4239-4248). DOI: 10.1145/3589334.3645582. Online publication date: 13-May-2024
  • (2024) Interpretable Knowledge Tracing with Multiscale State Representation. Proceedings of the ACM Web Conference 2024 (3265-3276). DOI: 10.1145/3589334.3645373. Online publication date: 13-May-2024
  • (2024) A Survey of Knowledge Tracing: Models, Variants, and Applications. IEEE Transactions on Learning Technologies 17 (1898-1919). DOI: 10.1109/TLT.2024.3383325. Online publication date: 8-Apr-2024
  • (2024) GuessKT: Improving Knowledge Tracing via Considering Guess Behaviors. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (12811-12815). DOI: 10.1109/ICASSP48485.2024.10447277. Online publication date: 14-Apr-2024
  • (2024) A Narrative Review of Developments in Knowledge Tracing in Last Decade. 2024 3rd Edition of IEEE Delhi Section Flagship Conference (DELCON) (1-7). DOI: 10.1109/DELCON64804.2024.10866076. Online publication date: 21-Nov-2024
  • (2024) Model-agnostic counterfactual reasoning for identifying and mitigating answer bias in knowledge tracing. Neural Networks 178:C. DOI: 10.1016/j.neunet.2024.106495. Online publication date: 1-Oct-2024
  • (2024) An efficient state-aware Coarse-Fine-Grained model for Knowledge Tracing. Knowledge-Based Systems 302:C. DOI: 10.1016/j.knosys.2024.112375. Online publication date: 25-Oct-2024
