Elsevier

Computers & Education

Volume 50, Issue 3, April 2008, Pages 807-820

Evaluation of keyphrase extraction algorithm and tiling process for a document/resource recommender within e-learning environments

https://doi.org/10.1016/j.compedu.2006.08.012

Abstract

The research presented in this paper examines the applicability of intelligent user interface (IUI) techniques in an online e-learning environment. In particular, we make use of user modeling techniques, information retrieval and extraction mechanisms, and collaborative filtering methods. The domains of e-learning, web-based training and instruction, and intelligent tutoring systems provide a challenging environment due to the large and diverse user population they entail. The overall system concentrates on utilizing a user modeling system to filter results as part of a collaborative document recommendation system. The goal of such a system is to actively seek out and recommend documents that will either encourage users to expand their knowledge of a given topic or reinforce the knowledge they already have. The system aims to recommend these documents in a non-intrusive manner with minimal user inconvenience, and attempts to do so by utilizing the Keyphrase Extraction Algorithm (KEA) to automatically extract queries, search the web and filter the search results. Users are encouraged to provide feedback about the resources and links they have viewed.

Introduction

The motivation for the research described in this paper was to investigate the possibility of incorporating aspects of artificial intelligence (AI) and techniques from domains like intelligent user interfaces (IUI) into the development of e-learning environments. The overall goal was to develop an online learning environment that offered users additional support without restricting the system to any specific domain. The initial aim of the system was to augment the user interface of an existing e-learning environment with relevant aspects from the domain of intelligent user interface research. The initial motivation was to develop a student modeling system based on the stereotyping approach to user modeling. This student modeling component would then be used as the basis for offering each individual student unique support during their interaction with the system.

One problem with learning in an online environment is that, although a wealth of information is available to users on the world wide web, crawling the web to find it can be difficult. One solution is to use a search engine like Google (Brin & Page, 1998) to reduce the search space to a more manageable size. Not all students, though, are capable of creating appropriate queries to pass to a search engine in order to find related documents. Even for those who are, the reduced search space is still extremely large and difficult to navigate. The first system built upon the user modeling system was a document analysis and retrieval system, to help users build queries and retrieve documents related to their current lesson.

Even if the user is capable of forming an appropriate query and navigating the search results to find appropriate documents, there are no assurances as to the quality, accuracy and reliability of the information contained within. Although there is a wealth of information available to each student on the internet, much of it is unreliable or inaccurate. This is a direct result of the uncontrolled and unregulated growth of the world wide web. A collaborative recommendation system with query extraction mechanisms was built to address this problem.

User modeling systems have already been implemented in intelligent tutoring systems and e-learning environments to alter the content and presentation format of the material presented to students (Brusilovsky et al., 1996; De Bra et al., 2003). In the majority of cases this material is chosen from one of the many different versions contained within the system itself. Our system attempts to approximate and automate some of the collaborative recommendations that groups of students pass around in relation to useful books, web-sites and other resources. The overall idea of our system is to examine the content that the user is currently viewing and to automatically extract some of the keywords and phrases, which are then used as the basis for a query search. The results of this search are then filtered through any relevant user feedback from the current user and their peers. The feedback of users who are more similar to the current user is weighted more heavily than that of more distant users.
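The idea of weighting peer feedback by user similarity can be illustrated with a minimal sketch. It is not the authors' implementation: the profile representation (sparse feature dictionaries), the use of cosine similarity, the rating scale of -1 to 1, and all function names are assumptions made purely for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse profiles (dicts of feature -> weight)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_score(url, current_user, profiles, ratings):
    """Similarity-weighted mean rating for one URL, or None if no usable feedback.

    `ratings` maps user -> {url: rating}; ratings are assumed to lie in [-1, 1].
    """
    numerator = 0.0
    denominator = 0.0
    for user, user_ratings in ratings.items():
        if url not in user_ratings:
            continue
        if user == current_user:
            weight = 1.0  # the user's own feedback counts fully
        else:
            weight = cosine_similarity(profiles[current_user], profiles[user])
        numerator += weight * user_ratings[url]
        denominator += weight
    return numerator / denominator if denominator > 0 else None


def rerank(urls, current_user, profiles, ratings, neutral=0.0):
    """Sort search results by weighted feedback.

    Unrated URLs receive the `neutral` score; the sort is stable, so ties
    keep the order in which the search engine returned them.
    """
    def key(url):
        score = weighted_score(url, current_user, profiles, ratings)
        return neutral if score is None else score

    return sorted(urls, key=key, reverse=True)
```

With this scheme, a rating from a peer whose profile has nothing in common with the current user contributes zero weight, so it neither promotes nor demotes a result.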

In order to construct elements to provide personalized and individual support to each user, it was first necessary to build a system which provided the functionality to reason about users and their abilities, knowledge and preferences. Within this user modeling system (Kilbride & Mangina, 2004), the main objective was to develop a system which analyzed each document as the student views it and automatically queried a search engine such as Google for recommendations to present to the users. This was achieved by adapting the Keyphrase Extraction Algorithm (KEA) (Witten, Paynter, Frank, Gutwin, & Nevill-Manning, 1999). The next objective was to develop a system for filtering out those links returned from the search engine which were unreliable, irrelevant or unsuitable. This was achieved by developing a clustering mechanism to group similar users, and a feedback system which registers each user's rating of the documents which have been presented to them. Documents returned from Google are first filtered through the document recommender based on the ratings other users gave them and on those users' positions in relation to the class. This work is described in detail in the remainder of this paper, which is structured as follows: Section 2 provides a description of the document/resource recommender system developed for the Moodle e-learning environment, including the steps followed in the document recommendation process. Section 3 presents KEA and the experimental results found from its application. Section 4 presents the evaluation of the results and findings from the utilization of KEA within the recommender system. Section 5 details the results of a thorough analysis of the performance of the keyphrase extraction and tiling algorithms. Finally, the conclusions and future developments of this work are discussed in the last section.

The document/resource recommender

The collaborative filtering and document recommendation system is built on KEA (Witten et al., 1999; Frank et al., 1999) and the user modeling system (Kilbride & Mangina, 2004). The first stage of the extraction mechanism involves using one of the training data sets to construct a keyphrase extraction model, which is then employed to extract keyphrases from the documents and resources as students view them through Moodle (Dougiamas & Taylor, 2003). The training data consist of

The keyphrase extraction algorithm

In order to make recommendations, the system utilizes the KEA package. KEA was originally developed as a means of automatic document summarization and clustering. In this case the training data has been altered to allow KEA to extract keyphrases that are better suited to the web search domain. Witten et al. describe keyphrases as:

“Keyphrases give a high-level description of a document’s contents that is intended to make it easy for prospective readers to decide whether or not it is relevant for

Results and findings

This section shows the output for a sample iteration of the keyphrase extraction algorithm. The example illustrates the level of similarity inherent in the keyphrases extracted by KEA and demonstrates the effectiveness of iterative tiling in reducing the overlap between the various keyphrases and in promoting to the top of the list those terms which occur in several keyphrases in various morphological forms.
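The tiling step is not specified in this excerpt, so the following is only a hedged sketch of one plausible reading: phrases whose word stems overlap at their boundaries are greedily merged, and the scores of merged phrases are summed so that repeated terms rise in the ranking. The crude `stem` helper, the greedy merge order and the additive scoring are all assumptions, not the paper's algorithm.

```python
def stem(word):
    """Very crude suffix stripper standing in for a real stemmer (assumption)."""
    word = word.lower()
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def overlap(left, right):
    """Length of the longest suffix of `left` whose stems match a prefix of `right`."""
    left_stems = [stem(w) for w in left]
    right_stems = [stem(w) for w in right]
    for n in range(min(len(left_stems), len(right_stems)), 0, -1):
        if left_stems[-n:] == right_stems[:n]:
            return n
    return 0


def tile(phrases):
    """Greedily merge overlapping phrases until no pair overlaps.

    `phrases` is a list of (text, score) pairs. Merged phrases get the sum of
    their scores. Returns (text, score) pairs sorted by score, highest first.
    """
    items = [(text.split(), score) for text, score in phrases]
    merged = True
    while merged:
        merged = False
        for i in range(len(items)):
            for j in range(len(items)):
                if i == j:
                    continue
                n = overlap(items[i][0], items[j][0])
                if n:
                    # Keep the left phrase and append the non-overlapping tail.
                    words = items[i][0] + items[j][0][n:]
                    score = items[i][1] + items[j][1]
                    items = [it for k, it in enumerate(items) if k not in (i, j)]
                    items.append((words, score))
                    merged = True
                    break
            if merged:
                break
    return sorted(((" ".join(w), s) for w, s in items),
                  key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, "neural networks" and "neural network" collapse into a single entry, which can then absorb "network training" because their boundary stems match.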

One of the courses run on the Moodle server over the past year was ‘An

Analysis

This section details the results of a thorough analysis of the performance of the keyphrase extraction and tiling algorithms and how they perform over a selection of file sizes. The graphs below show the time consumption of each of the individual phases of the keyphrase extraction process. The runtime of the algorithm depends primarily on the size of the file from which the keyphrases are to be extracted and this is reflected in each of the graphs below. Once the keyphrases have been extracted

Conclusion and future work

In order to construct elements to provide personalized and individual support to users within an e-learning environment, our system analyzes each document as the student views it and automatically queries a search engine for recommendations to present to users. This work provides a detailed examination of the costs of running a document recommender system utilizing KEA and the tiling algorithm within an e-learning environment. These results confirm that the document

References (13)

  • Brin, S., et al. (1998). The anatomy of a large-scale hypertextual Web search engine. Journal of Computer Networks and ISDN Systems.
  • Brusilovsky, P., Schwarz, E., & Weber, G. (1996). A tool for developing hypermedia-based its on WWW. In Proceedings of...
  • De Bra, P., Santic, T., & Brusilovsky, P. (2003). AHA! meets Interbook, and more … In Proceedings of the world...
  • Domingos, P., et al. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Journal of Machine Learning.
  • Dougiamas, M., & Taylor, P. C. (2003). Moodle: Using learning communities to create an open source course management...
  • Dumais, S., et al. Web question answering: Is more always better?
There are more references available in the full text version of this article.

Funded by the Higher Education Authority, Ireland.