Knowledge discovery in digital libraries of electronic theses and dissertations: an NDLTD case study

  • Regular Paper
International Journal on Digital Libraries

Abstract

Many scholarly writings today are available in electronic formats. With universities around the world choosing to make digital versions of their dissertations, theses, project reports, and related files and data sets available online, an overwhelming amount of information is becoming available on almost any topic. How will users decide which dissertation, or subsection of a dissertation, to read to get the information they need on a particular topic? What kinds of services can such digital libraries provide to make knowledge discovery easier? In this paper, we investigate these issues, using as a case study the Networked Digital Library of Theses and Dissertations (NDLTD), a rapidly growing collection that already has about 800,000 Electronic Theses and Dissertations (ETDs) from universities around the world. We propose the design for a scalable, Web Services-based tool, KDWebS (Knowledge Discovery System based on Web Services), to facilitate automated knowledge discovery in NDLTD. We also provide some preliminary proof-of-concept results to demonstrate the efficacy of the approach.



Author information

Corresponding author

Correspondence to Edward A. Fox.

About this article

Cite this article

Richardson, W.R., Srinivasan, V. & Fox, E.A. Knowledge discovery in digital libraries of electronic theses and dissertations: an NDLTD case study. Int J Digit Libr 9, 163–171 (2008). https://doi.org/10.1007/s00799-008-0046-9

  • DOI: https://doi.org/10.1007/s00799-008-0046-9
