Abstract
With the significant growth in electronic education materials such as syllabus documents and lecture notes available on the Internet and intranets, there is a need to develop structured central repositories of such materials so that both educators and learners can easily share, search and access them. This paper reports on our ongoing work to develop a national repository for course syllabi in Ireland. Specifically, it describes a prototype syllabus repository system for higher education in Ireland, built using a number of information extraction and document classification techniques, including a new fully unsupervised document classification method that uses a web search engine to automatically collect the training set for the classification algorithm. Preliminary experimental results evaluating the system’s performance are presented and discussed.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Joorabchi, A., Mahdi, A.E. (2008). Development of a National Syllabus Repository for Higher Education in Ireland. In: Christensen-Dalsgaard, B., Castelli, D., Ammitzbøll Jurik, B., Lippincott, J. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2008. Lecture Notes in Computer Science, vol 5173. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87599-4_20
DOI: https://doi.org/10.1007/978-3-540-87599-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87598-7
Online ISBN: 978-3-540-87599-4
eBook Packages: Computer Science (R0)