short-paper

Towards Robust Knowledge Tracing Models via k-Sparse Attention

Authors:

Jian Weng

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 2441 - 2445

https://doi.org/10.1145/3539618.3592073

Published: 18 July 2023

Abstract

Knowledge tracing (KT) is the problem of predicting students' future performance based on their historical interaction sequences. Because attention mechanisms capture long-term contextual dependencies well, they have become an essential component of many deep learning based KT (DLKT) models. Despite the impressive performance of these attentional DLKT models, many of them are prone to overfitting, especially on small-scale educational datasets. In this paper, we therefore propose sparseKT, a simple yet effective framework for improving the robustness and generalization of attention based DLKT approaches. Specifically, we incorporate a k-selection module that picks only the items with the highest attention scores. We propose two sparsification heuristics: (1) soft-thresholding sparse attention and (2) top-K sparse attention. We show that sparseKT helps attentional KT models discard irrelevant student interactions and improves predictive performance when compared to 11 state-of-the-art KT models on three publicly available real-world educational datasets. To encourage reproducible research, we make our data and code publicly available at https://github.com/pykt-team/pykt-toolkit.
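
The k-selection idea in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names are hypothetical, the example works on one row of raw attention scores, and the threshold variant reflects only one assumed reading of "soft-thresholding" (keep the largest weights until their cumulative mass reaches a threshold), since the abstract does not define it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def _keep_and_renormalize(weights, keep):
    """Zero every weight whose index is not in `keep`, then rescale to sum to 1."""
    kept = [w if i in keep else 0.0 for i, w in enumerate(weights)]
    total = sum(kept)
    return [w / total for w in kept]


def top_k_sparse_attention(scores, k):
    """Top-K selection: keep only the k largest attention weights."""
    if k < 1:
        raise ValueError("k must be at least 1")
    weights = softmax(scores)
    # Indices sorted by descending weight; ties keep their original order.
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    return _keep_and_renormalize(weights, set(order[:k]))


def cumulative_threshold_sparse_attention(scores, threshold):
    """ASSUMED reading of threshold-based selection: keep the largest
    weights until their cumulative mass reaches `threshold`."""
    weights = softmax(scores)
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += weights[i]
        if mass >= threshold:
            break
    return _keep_and_renormalize(weights, keep)


# Hypothetical raw attention scores of one query over five past interactions.
scores = [2.0, 0.1, 1.5, -1.0, 0.3]
top2 = top_k_sparse_attention(scores, k=2)
thr = cumulative_threshold_sparse_attention(scores, threshold=0.85)
print(top2)
print(thr)
```

In both variants the discarded interactions receive exactly zero weight, so they contribute nothing to the attended representation. The surviving weights are renormalized so that they still form a probability distribution.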

Index Terms

Towards Robust Knowledge Tracing Models via k-Sparse Attention
1. Applied computing
  1. Education
    1. Learning management systems
2. Social and professional topics
  1. Professional topics
    1. Computing education
      1. Student assessment

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2023

3567 pages

ISBN:9781450394086

DOI:10.1145/3539618

General Chairs: Hsin-Hsi Chen (National Taiwan University), Wei-Jou (Edward) Duh (National Taiwan University), Hen-Hsen Huang (Academia Sinica)

Program Chairs: Makoto P. Kato (Spotify), Josiane Mothe (Universite de Toulouse), Barbara Poblete (University of Chile and Amazon Visiting Academic)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '23

Sponsor:

SIGIR

SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 23 - 27, 2023

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

Bai Y, Li X, Liu Z, Huang Y, Guo T, Hou M, Xia F, Luo W (2025). csKT: Addressing cold-start problem in knowledge tracing via kernel bias and cone attention. Expert Systems with Applications 266, 125988. Online publication date: Mar-2025.
https://doi.org/10.1016/j.eswa.2024.125988
Liu Z, Huang S, Guo T, Hou M, Liang Q (2025). A prompt-driven framework for multi-domain knowledge tracing. Machine Learning 114:4. Online publication date: 17-Feb-2025.
https://doi.org/10.1007/s10994-024-06660-6
Li X, Bai Y, Guo T, Liu Z, Huang Y, Zhao X, Xia F, Luo W, Weng J, Larson K (2024). Enhancing length generalization for attention based knowledge tracing models with linear biases. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 5918-5926. Online publication date: 3-Aug-2024.
https://dl.acm.org/doi/10.24963/ijcai.2024/654
Liu G, Zhan H, Kim J, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024). Question Difficulty Consistent Knowledge Tracing. Proceedings of the ACM Web Conference 2024, pp. 4239-4248. Online publication date: 13-May-2024.
https://dl.acm.org/doi/10.1145/3589334.3645582
Sun J, Yu F, Wan Q, Li Q, Liu S, Shen X, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024). Interpretable Knowledge Tracing with Multiscale State Representation. Proceedings of the ACM Web Conference 2024, pp. 3265-3276. Online publication date: 13-May-2024.
https://dl.acm.org/doi/10.1145/3589334.3645373
Shen S, Liu Q, Huang Z, Zheng Y, Yin M, Wang M, Chen E (2024). A Survey of Knowledge Tracing: Models, Variants, and Applications. IEEE Transactions on Learning Technologies 17, pp. 1898-1919. Online publication date: 8-Apr-2024.
https://dl.acm.org/doi/10.1109/TLT.2024.3383325
Zu S, Cai S, Tang W, Wang C, Li L, Shen J (2024). GuessKT: Improving Knowledge Tracing via Considering Guess Behaviors. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 12811-12815. Online publication date: 14-Apr-2024.
https://doi.org/10.1109/ICASSP48485.2024.10447277
Jain A, Mago N (2024). A Narrative Review of Developments in Knowledge Tracing in Last Decade. 2024 3rd Edition of IEEE Delhi Section Flagship Conference (DELCON), pp. 1-7. Online publication date: 21-Nov-2024.
https://doi.org/10.1109/DELCON64804.2024.10866076
Cui C, Ma H, Dong X, Zhang C, Zhang C, Yao Y, Chen M, Ma Y (2024). Model-agnostic counterfactual reasoning for identifying and mitigating answer bias in knowledge tracing. Neural Networks 178:C. Online publication date: 1-Oct-2024.
https://dl.acm.org/doi/10.1016/j.neunet.2024.106495
Luo H, Zhang Z, Cui L, Zhang Z, Liang Y (2024). An efficient state-aware Coarse-Fine-Grained model for Knowledge Tracing. Knowledge-Based Systems 302:C. Online publication date: 25-Oct-2024.
https://dl.acm.org/doi/10.1016/j.knosys.2024.112375
