Authoring Technical Documents for Effective Retrieval

Butters, Jonathan; Ciravegna, Fabio

doi:10.1007/978-3-642-16438-5_20


Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6317)

Abstract

In this paper we outline the design considerations and application of a methodology for authoring technical documents so as to improve retrieval. Our approach is aimed at large organizations, where variations in terminology at personal, national and international scales often impede the retrieval of relevant knowledge. We first present the difficulties of performing entity extraction in technical domains and the role that variation in terminology plays in the information extraction task, and then outline and evaluate a methodology that allows for effective retrieval.

Author information

Authors and Affiliations

Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP, Sheffield, UK
Jonathan Butters & Fabio Ciravegna

Editor information

Editors and Affiliations

Cognitive Interaction Technology Excellence Center (CITEC), Universität Bielefeld, Universitätsstraße 21-23, 33615, Bielefeld, Germany
Philipp Cimiano
IST/DEI, INESC-ID, Rua Alves Redol 9, 1000-029, Lisboa, Portugal
H. Sofia Pinto

About this paper

Cite this paper

Butters, J., Ciravegna, F. (2010). Authoring Technical Documents for Effective Retrieval. In: Cimiano, P., Pinto, H.S. (eds) Knowledge Engineering and Management by the Masses. EKAW 2010. Lecture Notes in Computer Science, vol 6317. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16438-5_20

DOI: https://doi.org/10.1007/978-3-642-16438-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16437-8
Online ISBN: 978-3-642-16438-5
eBook Packages: Computer Science, Computer Science (R0)
