Elsevier

Information Sciences

Volume 447, June 2018, Pages 169-185
Mining temporal characteristics of behaviors from interval events in e-learning

https://doi.org/10.1016/j.ins.2018.03.018

Abstract

Much of the work in the data mining community mines temporal knowledge based primarily on the frequency of events, e.g., frequent pattern mining, ignoring their duration. This paper discusses a method that mines big learning data by taking both the frequency and duration into account. It defines a function for evaluating the importance of events, summarizing them into big uniform events (BUEs) according to the semantics, and further segmenting the BUEs using a sliding window to avoid the counting bias issue. The task of finding temporal characteristics is eventually reduced to mining complex temporally frequent patterns and association rules. To validate this method, a series of extensive experiments are conducted on both synthetic and real datasets to test the system overhead, quality of patterns, and model parameters. The results show that our mining framework is serviceable and can effectively improve the quality of patterns.

Introduction

Information technology has changed the way in which people live and work. In addition, it has a significant influence on the educational domain. Currently, e-learning plays an increasingly important role. Most e-learning systems are capable of keeping detailed logs of user interactions, including keyboard clicking, eye tracking, and video browsing. These data create new opportunities for learning how students behave.

When an operation occurs, an e-learning system instantly records the corresponding interactive event. An event corresponds to a specified event type and usually has a starting point, an end point, and a list of attributes that describe it [39]. Educators may need to find the temporal characteristics of individuals’ behaviors to gain further insight into their learning habits, preferences, and cognitive efforts over time [4]. However, this task is not easy to accomplish, as obvious cues are typically hard to obtain from massive and fragmented events. These cues include important events (IEs) and their temporal relations. Both are desired because the former represent particular preferences and habits, while the latter represent certain causal associations or temporal patterns. This paper aims to provide knowledge to system designers, teachers, leaders, and students that enables them to understand how individuals behave over time; moreover, it seeks to provide, for the first time, evidence supporting the promotion of certain IEs and temporal relations in human-computer interaction design.

The temporal characteristics characterize not only when and what type of behavior a student engages in but also cases in which behaviors change. A simple example is the case of video-viewing behavior [7], [8]. The authors used limited video clickstream events such as play and pause. Thus, the temporal characteristics were easy to obtain, as the play event indicates the start of a cognitive activity, while the stop event indicates the termination of an activity. The temporal characteristics may simply display play-stop loops or something similar to play-play-stop with different durations. One learns that a student stops watching after three seconds or is probably searching for something as he triggers multiple consecutive play events. Moreover, one learns that a student permanently stops watching a video because of disinterest when seven loops occur in succession. In a more complex example, we assume that there are many more events than those considered in the above scenario. Suppose a student browses objects (maybe a video, a PPT, or a structured site) in parallel, as in Fig. 1. He may switch between them, browse multiple times through different parts of one object intermittently, or leave for non-overlapping and uneven temporal durations due to various cognitive demands. Is it possible to address the temporal characteristics of this complex scenario? In other words, can we determine whether an event is important, what the temporal pattern is, and how to characterize it? Unfortunately, we have not found a direct and effective approach to answer these questions.

Traditional approaches treat groups of consecutive events as time-ordered sequences and discover frequent patterns and association rules using sequential pattern mining techniques [26]. They aim to find item sets whose occurrences satisfy user-defined support and confidence thresholds. However, in e-learning domains, people may be interested in discovering not only frequent events but also meaningful events, i.e., IEs. In this case, the foregoing methods may not lead to satisfactory results, as one cannot judge an event’s significance based solely on the frequency of individual operations. Rather, the duration of these operations may be useful. On the one hand, some events with a low frequency and long duration may be more valuable than those with a high frequency and short duration, as they reflect the effectiveness and significance of learning activities to some extent. For example, in the example above, B is in practice more valuable than C, even though C occurs more often than B. On the other hand, we cannot simply ignore frequent events of short duration, which we refer to as tiny-interval events, because we do not know in advance whether they are pedagogically meaningless or fragments of IEs. Take C, for example; we may also regard it as important despite its short duration.
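The contrast between frequency and duration can be made concrete with a small sketch. The event list below is invented for illustration (it is not data from the paper), and the function name `frequency_and_duration` is an assumption; the sketch only shows that ranking by occurrence count and ranking by total duration can disagree.

```python
from collections import defaultdict

# Hypothetical interval events: (event_type, start_second, end_second).
# These values are made up for illustration only.
events = [
    ("B", 0, 300),    # one long event
    ("C", 300, 305),  # several tiny-interval events
    ("C", 310, 314),
    ("C", 320, 326),
]


def frequency_and_duration(events):
    """Return {event_type: (occurrence_count, total_duration)}."""
    counts = defaultdict(int)
    durations = defaultdict(int)
    for event_type, start, end in events:
        counts[event_type] += 1
        durations[event_type] += end - start
    return {t: (counts[t], durations[t]) for t in counts}


summary = frequency_and_duration(events)
most_frequent = max(summary, key=lambda t: summary[t][0])
longest = max(summary, key=lambda t: summary[t][1])
```

Here `most_frequent` is C while `longest` is B, which is the situation the paragraph describes: a purely frequency-based threshold would favor C and could miss B.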

Therefore, the focus of this paper is to explore a method for discovering temporal characteristics of interest from interval-based events with consideration of both event frequency and duration. To the best of our knowledge, most works on temporal data mining do not take into account event frequency and duration simultaneously. It is necessary to identify important events and exclude irrelevant events. Thus, events are first summarized according to the semantics and further segmented into equal-sized and non-overlapping pieces. The task of finding temporal characteristics is addressed by mining complex temporally frequent patterns and association rules. The major contributions are as follows:

  • An evaluation method for effectively identifying events from large-scale event streams by taking both the frequency and duration into account is proposed.

  • A complete mining framework for obtaining temporal characteristics of interest is proposed, and the educational implications of the results are analyzed.

  • To evaluate the performance and practicability of the proposed methods, a series of extensive experiments are conducted on both synthetic and real datasets. The results reveal acceptable system overhead and satisfactory quality of patterns.
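The segmentation step mentioned before the list of contributions (cutting summarized events into equal-sized, non-overlapping pieces) can be pictured with a minimal sketch. This is not the paper's algorithm; the function name `segment_interval`, the half-open interval convention, and the choice to keep a shorter final piece are all assumptions made for illustration.

```python
def segment_interval(start, end, width):
    """Split the half-open interval [start, end) into consecutive,
    non-overlapping pieces of length `width`.

    Assumption: if the interval length is not a multiple of `width`,
    the last piece is kept and is simply shorter.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pieces = []
    t = start
    while t < end:
        pieces.append((t, min(t + width, end)))
        t += width
    return pieces


# A 25-unit event cut with a window of 10 units.
pieces = segment_interval(0, 25, 10)
```

With pieces of a fixed size, a long event contributes several countable units rather than one, which is one way to read the abstract's remark about avoiding counting bias.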

The remainder of the paper is organized as follows. Related works are discussed in Section 2. Section 3 introduces the preliminaries and research framework. An algorithm for efficiently processing interval-based temporal data is discussed in Section 4. In Section 5, algorithms for discovering the temporal characteristics of interest are presented. Section 6 demonstrates the performance and serviceability of the algorithms, presents the results of the experiments, and provides a discussion on interesting topics. Finally, conclusions are drawn in Section 7.

Section snippets

Educational temporal mining

Educational data mining (EDM), or learning analytics (LA), uses statistics, machine learning, and data mining techniques to analyze data generated from the interaction of teaching and learning in order to discover educational issues, better understand the states of students, and determine how students adapt to different contexts. As a new area of research, EDM has attracted increasing attention in recent years. The research can be roughly divided into three categories: 1) grouping such

Preliminaries

Definition 1

(Interactive event and interactive log). An interactive event e is caused by a user who interacts with the system. Let P = {t1, t2, …, tn} be a set of primitive time units and e = (s, ti), where s is the type of event and ti, i ∈ [1, n], is a timestamp marked when an event occurs. An interactive log is a set of interactive events logged chronologically.
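Definition 1 maps naturally onto a small data structure. The sketch below is one possible encoding, not the authors' implementation: the class name `InteractiveEvent`, the field names, and the helper `build_log` are assumptions, and timestamps are represented as integer indices into the primitive time units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractiveEvent:
    """One interactive event e = (s, t_i)."""
    event_type: str  # s, the type of the event
    timestamp: int   # t_i, index of the primitive time unit


def build_log(events):
    """Return the interactive log: the events in chronological order."""
    return sorted(events, key=lambda e: e.timestamp)


# Hypothetical events, given out of order on purpose.
log = build_log([
    InteractiveEvent("pause", 7),
    InteractiveEvent("play", 2),
    InteractiveEvent("seek", 5),
])
```

After the call, `log` lists the events as play, seek, pause, ordered by timestamp.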

Definition 2

(Event type and event sequence). An event corresponds to a specified event type and explains the meaning behind a user’s operation. An event e during a

Evaluation of events

A subsequence may contain several event types. The question is how to discriminate the final types if the events are semantically relevant. In this work, this is carried out based on the relative importance of events instead of their frequency. The importance is a function of an event’s density and the relative intensity of the semantics conveyed over its duration. We use a formalism RS = F(d, r) to model their relationships, where d represents an event’s density and r denotes the relative

Mining temporal characteristics

In this section, we discover the temporal characteristics of individuals’ behaviors based on SETI and present an effective graph model, namely, a temporal-event graph (TEG). The temporal characteristics indicate not only when and what kind of behaviors a student performed but also the cases in which the behaviors would change. Therefore, a TEG is a graph showing both the IEs and temporal information. We use nodes to denote events of interest, directed edges to denote the evolution of events
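As a rough illustration of a graph with events as nodes and directed edges between them, the sketch below counts transitions between consecutive events in a sequence. The full definition of a TEG is not given in this excerpt, so this is only an assumed simplification: the function name `build_transition_graph` and the use of transition counts as edge weights are not taken from the paper.

```python
from collections import Counter


def build_transition_graph(sequence):
    """Build a simple directed graph from an ordered event sequence.

    Nodes are the distinct event types; each edge (a, b) is weighted by
    how often event b directly follows event a. This is an assumed
    simplification, not the paper's TEG construction.
    """
    nodes = set(sequence)
    edges = Counter(zip(sequence, sequence[1:]))
    return nodes, edges


# Hypothetical sequence of event types.
nodes, edges = build_transition_graph(["A", "B", "A", "B", "C"])
```

In this toy sequence the edge from A to B has weight 2, while the edges from B to A and from B to C each have weight 1.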

Experimental evaluation

We conduct a series of extensive experiments on both synthetic and real datasets. Our goal is to evaluate the performance of the proposed framework and verify its serviceability in an e-learning environment. All the experiments are implemented in Java and performed on a 2.20 GHz machine with 4 GB of memory and the Windows 7 64-bit operating system.

Conclusions

Temporal data provide an alternative way to analyze student behaviors. Most temporal data mining algorithms address the frequencies of events rather than their durations and temporal indications. One might not find knowledge of interest when treating the times at which events occur as numerical values because the frequency of occurrence is a flawed indicator of the validity of events.

A framework was proposed in this paper to mine the temporal characteristics of individuals’ behaviors

References (51)

  • J.S. Yoo et al.

    Similarity-profiled temporal association mining

    IEEE Trans. Knowl. Data Eng.

    (2009)
  • M.J. Zaki et al.

    New algorithms for fast discovery of association rules

    KDD

    (1997)
  • J.F. Allen

    Maintaining knowledge about temporal intervals

    Commun. ACM

    (1983)
  • P.M. Ashok Kumar et al.

    Anomalous event detection in traffic video based on sequential temporal patterns of spatial interval events

    KSII Trans. Internet Inf. Syst.

    (2015)
  • R. Azevedo

    Issues in dealing with sequential and temporal characteristics of self- and socially-regulated learning

    Metacogn. Learn.

    (2014)
  • J. de Boer

    Using log files from streaming media servers for optimising the learning sequence

    Int. J. Continu. Eng. Edu. Life Long Learn.

    (2010)
  • C.G. Brinton et al.

    Mining MOOC clickstreams: video-watching behavior vs. in-video quiz performance

    IEEE Trans. Signal Process.

    (2016)
  • C.G. Brinton et al.

    MOOC performance prediction via clickstream data and social learning networks

    Computer Communications (INFOCOM), 2015 IEEE Conference on

    (2015)
  • Y.-C. Chen et al.

    An efficient algorithm for mining time interval-based patterns in large database

    Proceedings of the 19th ACM International Conference on Information and Knowledge Management

    (2010)
  • Y.C. Chen et al.

    Mining temporal patterns in time interval-based data

    IEEE Trans. Knowl. Data Eng.

    (2015)
  • Q. Cohen-Solal et al.

    An algebra of granular temporal relations for qualitative reasoning

    IJCAI

    (2015)
  • A.V. Deokar et al.

    Semantics-based event log aggregation for process mining and analytics

    Inf. Syst. Front.

    (2015)
  • J. Gao et al.

    A graph-based consensus maximization approach for combining multiple supervised and unsupervised models

    IEEE Trans. Knowl. Data Eng.

    (2013)
  • J.A. Greene et al.

    The measurement of learners' self-regulated cognitive and metacognitive processes while using computer-based learning environments

    Edu. Psychol.

    (2010)
  • J. Han et al.

    Efficient mining of partial periodic patterns in time series database

    Data Engineering, 1999. Proceedings., 15th International Conference on

    (1999)
    This research was partially supported by the MOE-China Mobile Research Fund Project No. MCM20160405, the Fundamental Research Fund for the Central Universities of MOE No. SWU118006, the MOE Innovation Research Team No. IRT13035, the Coordinator Innovation Project for the Key Lab of Shaanxi Province under Grant No. 2013SZS05-Z01, the Online Education Research Foundation of the MOE Research Center for Online Education under Grant Nos. 2016YB165 and 2016YB169, the Natural Science Basic Research Plan in Shaanxi Province of China Nos. 2016JM6027 and 2016JM6080, and the Project of China Knowledge Centre for Engineering Science and Technology.
