Abstract
The majority of knowledge on the Web is encoded in unstructured text and is not linked to formalized knowledge, such as ontologies and rules. A potential solution to this problem is to acquire this knowledge through natural language processing and text mining methods. Prior work has focused on automatically extracting RDF- or OWL-based ontologies from text; however, the type of knowledge acquired is generally restricted to simple term hierarchies. This paper presents a general-purpose framework for acquiring more complex relationships from text and then encoding this knowledge as rules. Our approach starts with existing domain knowledge in the form of OWL ontologies and Semantic Web Rule Language (SWRL) rules and applies natural language processing and text matching techniques to deduce classes and properties. It then captures deductive knowledge in the form of new rules. We have evaluated our framework by applying it to web-based text on car rental requirements. We show that our approach can automatically and accurately generate rules for requirements of car rental companies not in the knowledge base. Our framework thus rapidly acquires complex knowledge from free text sources. We are expanding it to handle richer domains, such as medical science.
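To make the idea of matching text against existing ontology terms and emitting a rule concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written vocabulary (ONTOLOGY_TERMS), a hypothetical consequent class EligibleDriver, and plain string similarity from Python's difflib in place of the NLP and text matching pipeline summarized above. All class, property, and function names are illustrative assumptions; only swrlb:greaterThanOrEqual is a standard SWRL built-in.

```python
import re
from difflib import SequenceMatcher

# Hypothetical ontology vocabulary: label -> SWRL atom template.
# These names are illustrative assumptions, not taken from the paper.
ONTOLOGY_TERMS = {
    "driver": "Driver(?d)",
    "age": "hasAge(?d, ?a)",
    "rental": "Rental(?r)",
}


def best_match(token, labels, threshold=0.8):
    """Return the label most similar to token, or None if below threshold."""
    scored = [(SequenceMatcher(None, token, label).ratio(), label)
              for label in labels]
    score, label = max(scored)
    return label if score >= threshold else None


def sentence_to_rule(sentence, consequent):
    """Build a SWRL-style rule string from one requirement sentence."""
    text = sentence.lower()
    atoms = []
    # Map each alphabetic token to a known ontology term, if any.
    for token in re.findall(r"[a-z]+", text):
        label = best_match(token, ONTOLOGY_TERMS)
        if label is not None and ONTOLOGY_TERMS[label] not in atoms:
            atoms.append(ONTOLOGY_TERMS[label])
    # Turn an "at least N" phrase into a numeric constraint on ?a.
    bound = re.search(r"at least (\d+)", text)
    if bound:
        atoms.append(f"swrlb:greaterThanOrEqual(?a, {bound.group(1)})")
    return " ^ ".join(atoms) + " -> " + consequent


if __name__ == "__main__":
    print(sentence_to_rule("Drivers must be at least 25 years of age.",
                           "EligibleDriver(?d)"))
```

For the sentence "Drivers must be at least 25 years of age.", this sketch yields the antecedent atoms Driver(?d), hasAge(?d, ?a), and swrlb:greaterThanOrEqual(?a, 25). A realistic system would replace the similarity threshold with linguistic analysis and would validate candidate atoms against the ontology before accepting a rule.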
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hassanpour, S., O’Connor, M.J., Das, A.K. (2011). A Framework for the Automatic Extraction of Rules from Online Text. In: Bassiliades, N., Governatori, G., Paschke, A. (eds) Rule-Based Reasoning, Programming, and Applications. RuleML 2011. Lecture Notes in Computer Science, vol 6826. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22546-8_21
DOI: https://doi.org/10.1007/978-3-642-22546-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22545-1
Online ISBN: 978-3-642-22546-8
eBook Packages: Computer Science (R0)