Ontology Construction Based on Latent Topic Extraction in a Digital Library

Yeh, Jian-hua; Yang, Naomi

doi:10.1007/978-3-540-89533-6_10

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5362)

Included in the following conference series:

International Conference on Asian Digital Libraries


Abstract

This paper discusses the automatic ontology construction process in a digital library. Traditional automatic ontology construction uses hierarchical clustering to group similar terms, and the resulting hierarchy is usually unsatisfactory for human comprehension. Human-provided knowledge networks present strong semantic features, but building them is labor-intensive and inconsistent at large scale. The method proposed in this paper combines statistical correction and latent topic extraction of textual data in a digital library to produce a semantically oriented, OWL-based ontology. The experimental document collection used here is the Chinese Recorder, which served as a link between the various missions that were part of the rise and heyday of the Western effort to Christianize the Far East. The ontology construction process is described, and a final ontology in OWL format is shown in our results.
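The last step named in the abstract, turning extracted topics into an OWL ontology, can be pictured with a small sketch. It is illustrative only and is not the authors' implementation: it assumes topic extraction has already run and produced a mapping from topic labels to terms, and it assumes each term is modeled as a subclass of its topic. The function name `topics_to_owl`, the `example.org` namespace, and the placeholder labels are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard RDF, RDFS, and OWL namespace URIs.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical ontology URI, chosen only for this sketch.
ONTOLOGY_URI = "http://example.org/ontology"
BASE = ONTOLOGY_URI + "#"


def topics_to_owl(topics):
    """Serialize {topic_label: [term, ...]} as RDF/XML with OWL classes.

    Assumption: every topic becomes an owl:Class and every term becomes
    an owl:Class that is an rdfs:subClassOf its topic.
    """
    for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
        ET.register_namespace(prefix, uri)

    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": ONTOLOGY_URI})

    for topic, terms in topics.items():
        # One class per latent topic.
        ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + topic})
        for term in terms:
            # One class per term, linked to its topic.
            cls = ET.SubElement(
                root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + term}
            )
            ET.SubElement(
                cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": BASE + topic}
            )

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder labels, not results from the paper.
    print(topics_to_owl({"TopicA": ["TermOne", "TermTwo"]}))
```

The output is a flat RDF/XML document that an ontology editor should be able to load. A real pipeline would also need to sanitize labels into valid URI fragments, which this sketch does not attempt.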

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, Aletheia University, Taiwan
Jian-hua Yeh
Graduate Institute of Library and Information Studies, National Taiwan Normal University, Taiwan
Naomi Yang

Editor information

Editors and Affiliations

Department of Computer Science, Swansea University, Singleton Park, SA2 8PP, Swansea, UK
George Buchanan
Computer Science Department, The University of Waikato, Hamilton, New Zealand
Masood Masoodian
Computer Science Department, University of Waikato, Hamilton, New Zealand
Sally Jo Cunningham

About this paper

Cite this paper

Yeh, Jh., Yang, N. (2008). Ontology Construction Based on Latent Topic Extraction in a Digital Library. In: Buchanan, G., Masoodian, M., Cunningham, S.J. (eds) Digital Libraries: Universal and Ubiquitous Access to Information. ICADL 2008. Lecture Notes in Computer Science, vol 5362. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89533-6_10

DOI: https://doi.org/10.1007/978-3-540-89533-6_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89532-9
Online ISBN: 978-3-540-89533-6
eBook Packages: Computer Science, Computer Science (R0)
