A Framework for Effective Annotation of Information from Closed Captions Using Ontologies

Journal of Intelligent Information Systems

Abstract

To improve the accuracy, in terms of precision and recall, of an audio information retrieval system, we have created a domain-specific ontology (a collection of key concepts and their interrelationships) as well as a novel pruning algorithm. Given the shortcomings of keyword-based techniques, we have opted to employ a concept-based technique utilizing this ontology. Achieving both high precision and high recall is the key problem in the retrieval of audio information. In traditional approaches, high recall is typically achieved at the expense of precision, and vice versa. Through the use of a domain-specific ontology, appropriate concepts can be identified during metadata generation (description of audio) or query generation, thus improving precision.
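As a reminder of how the two measures trade off, the following is a minimal sketch of the standard set-based definitions of precision and recall. The function name and the clip identifiers are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical audio clip identifiers (assumed data, for illustration only).
relevant = {"clip1", "clip2", "clip3", "clip4"}
narrow = {"clip1", "clip2"}  # a cautious result set
broad = {"clip1", "clip2", "clip3", "clip4",
         "clip5", "clip6", "clip7", "clip8"}  # a permissive result set

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```

The narrow result set is fully precise but misses half of the relevant clips, while the broad one finds everything at the cost of halving precision.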

When irrelevant concepts are associated with queries or documents, precision is lost. Conversely, if relevant concepts are discarded, recall is lost. In conjunction with the domain-specific ontology, we have therefore proposed a novel, automatic pruning algorithm that prunes as many irrelevant concepts as possible during both the description and identification of documents and query generation. To improve recall, a controlled and correct query expansion mechanism is proposed that guarantees that precision will not be lost.
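The abstract does not spell out either mechanism, so the following is only a toy sketch of the general idea, not the authors' algorithm. It assumes a hypothetical ontology stored as a mapping from each concept to its more specific sub-concepts, expands a query concept only to its descendants, and uses a stand-in pruning heuristic that drops any candidate concept with no ancestor or descendant relation to another candidate. All concept names and function names are invented for illustration.

```python
# Toy ontology: concept -> more specific sub-concepts (all names hypothetical).
ONTOLOGY = {
    "sports": ["basketball", "football"],
    "basketball": ["NBA"],
    "football": ["NFL"],
    "NBA": [],
    "NFL": [],
    "finance": ["stocks"],
    "stocks": [],
}


def expand(concept, ontology=ONTOLOGY):
    """Controlled expansion: the concept itself plus all of its descendants."""
    seen, stack = set(), [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(ontology.get(node, []))
    return seen


def related(a, b, ontology=ONTOLOGY):
    """True if one concept is an ancestor or descendant of the other."""
    return b in expand(a, ontology) or a in expand(b, ontology)


def prune(candidates, ontology=ONTOLOGY):
    """Stand-in heuristic (an assumption): keep a candidate only if it is
    related to at least one other candidate in the same set."""
    return {
        c for c in candidates
        if any(related(c, other, ontology) for other in candidates if other != c)
    }


print(sorted(expand("sports")))
# ['NBA', 'NFL', 'basketball', 'football', 'sports']
print(sorted(prune({"basketball", "NBA", "stocks"})))
# ['NBA', 'basketball']
```

Expanding only along specialization links is one way to keep expansion "controlled": a query for sports also matches NBA clips, but never drifts into unrelated branches such as finance. In the pruning example, "stocks" is discarded because nothing else among the candidates supports it.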

We have constructed a demonstration prototype and have shown, both experimentally and analytically, that our model achieves significantly higher precision and recall than keyword search.

Author information

Corresponding author

Correspondence to Latifur Khan.

Additional information

This research has been funded [or funded in part] by the Integrated Media Systems Center, a National Science Foundation Engineering Research Center, Cooperative Agreement No. EEC-9529152.

About this article

Cite this article

Khan, L., McLeod, D. & Hovy, E. A Framework for Effective Annotation of Information from Closed Captions Using Ontologies. J Intell Inf Syst 25, 181–205 (2005). https://doi.org/10.1007/s10844-005-0188-9

  • DOI: https://doi.org/10.1007/s10844-005-0188-9
