Article (DOI: 10.1145/1255175.1255230)

A dynamic ontology for a dynamic reference work

Published: 18 June 2007

Abstract

The successful deployment of digital technologies by humanities scholars presents computer scientists with a number of unique scientific and technological challenges. The task seems particularly daunting because issues in the humanities are presented in abstract language demanding the kind of subtle interpretation often thought to be beyond the scope of artificial intelligence, and humanities scholars themselves often disagree about the structure of their disciplines. The future of humanities computing depends on having tools for automatically discovering complex semantic relationships among different parts of a corpus. Digital library tools for the humanities will need to be capable of dynamically tracking the introduction of new ideas and interpretations and applying them to older texts in ways that support the needs of scholars and students.
This paper describes the design of new algorithms and the adjustment of existing algorithms to support the automated and semi-automated management of domain-rich metadata for an established digital humanities project, the Stanford Encyclopedia of Philosophy. Our approach starts with a "hand-built" formal ontology that is modified and extended by a combination of automated and semi-automated methods, thus becoming a "dynamic ontology". We assess the suitability of current information retrieval and information extraction methods for the task of automatically maintaining the ontology. We describe a novel measure of term-relatedness that appears to be particularly helpful for predicting hierarchical relationships in the ontology. We believe that our project makes a further contribution to information science by being the first to harness the collaboration inherent in an expert-maintained dynamic reference work to the task of maintaining and verifying a formal ontology. We place special emphasis on the task of bringing domain expertise to bear on all phases of the development and deployment of the system, from the initial design of the software and ontology to its dynamic use in a fully operational digital reference work.


Published In

JCDL '07: Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
June 2007
534 pages
ISBN: 9781595936448
DOI: 10.1145/1255175


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. digital humanities
  2. dynamic ontology
  3. formal ontology
  4. information extraction
  5. information retrieval
  6. link mining
  7. metadata

Qualifiers

  • Article

Conference

JCDL07: Joint Conference on Digital Libraries
June 18-23, 2007
Vancouver, BC, Canada

Acceptance Rates

Overall Acceptance Rate 415 of 1,482 submissions, 28%


Cited By

  • (2018) Deep Web Information Retrieval Process. The Dark Web. DOI: 10.4018/978-1-5225-3163-0.ch007 (114-137). Online publication date: 2018
  • (2014) "Supertagger" behavior in building folksonomies. Proceedings of the 2014 ACM conference on Web science. DOI: 10.1145/2615569.2615686 (129-138). Online publication date: 23-Jun-2014
  • (2013) Cross-Cutting Categorization Schemes in the Digital Humanities. Isis 104:3. DOI: 10.1086/673276 (573-583). Online publication date: Sep-2013
  • (2013) Evaluating Dynamic Ontologies. Knowledge Discovery, Knowledge Engineering and Knowledge Management. DOI: 10.1007/978-3-642-29764-9_18 (258-275). Online publication date: 2013
  • (2013) Knowledge Engineering via Human Computation. Handbook of Human Computation. DOI: 10.1007/978-1-4614-8806-4_13 (131-151). Online publication date: 20-Nov-2013
  • (2012) Deep Web Information Retrieval Process. Models for Capitalizing on Web Engineering Advancements. DOI: 10.4018/978-1-4666-0023-2.ch005 (75-96). Online publication date: 2012
  • (2011) Ontology based information extraction from text. Knowledge-driven multimedia information extraction and ontology evolution. DOI: 10.5555/2001069.2001073 (89-109). Online publication date: 1-Jan-2011
  • (2011) Recent Developments in Computing and Philosophy. Journal for General Philosophy of Science 42:2. DOI: 10.1007/s10838-011-9164-y (385-397). Online publication date: 25-Aug-2011
  • (2011) Ontology Based Information Extraction from Text. Knowledge-Driven Multimedia Information Extraction and Ontology Evolution. DOI: 10.1007/978-3-642-20795-2_4 (89-109). Online publication date: 2011
  • (2010) Thesaurus extension using web search engines. Proceedings of the role of digital libraries in a time of global change, and 12th international conference on Asia-Pacific digital libraries. DOI: 10.5555/1875689.1875720 (198-207). Online publication date: 21-Jun-2010
