Extraction of Structured Rules from Web Pages and Maintenance of Mutual Consistency: XRML Approach

Kang, Juyoung; Lee, Jae Kyu

doi:10.1007/978-3-540-39715-1_11


Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2876))

Included in the following conference series:

International Workshop on Rules and Rule Markup Languages for the Semantic Web


Abstract

Web pages convey valuable knowledge for human comprehension through text, tables, and mathematical notation. However, extracting structured rules from Web pages and keeping them up to date are not easy tasks. To tackle these problems, we adopt the eXtensible Rule Markup Language (XRML) framework. RIML (Rule Identification Markup Language) and RSML (Rule Structure Markup Language) are two compliant representations in XRML for this purpose. RIML identifies the implicit rules in Web pages, possibly drawing on multiple pages to make up a rule or rule group. RSML specifies the complete rule structure to be processed by software agents or expert systems.

In this study, we cover natural text, tables, and implicit numeric functions embedded in text. To fulfill the research goal, we define the tags necessary for rule extraction and maintenance in XRML. Typical ones include tags for rule grouping, tabular rules, numeric operators, and functions. The rule acquisition process consists of rule base design, rule identification with RIML, and rule structuring with RSML. The maintenance process for revisions that may occur in either the Web pages or the structured rules is also described. The approach is demonstrated with a shipping cost comparison across electronic book stores.
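
To make the idea of a structured rule with numeric operators more concrete, the following is a minimal Python sketch, not XRML itself and not the authors' implementation. The class names, the operator names (`GE`, `LT`, `EQ`), the variable names, and the shipping thresholds and prices are all hypothetical and chosen only for illustration. It shows a rule base that can be evaluated by a program, plus a simple check that the terms identified in a page and the terms used in the structured rules still match, which is one way to think about mutual consistency.

```python
from dataclasses import dataclass
import operator

# Hypothetical operator names; these are NOT the XRML tag vocabulary.
OPS = {"LT": operator.lt, "GE": operator.ge, "EQ": operator.eq}


@dataclass(frozen=True)
class Condition:
    variable: str
    op: str
    value: float


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: tuple   # tuple of Condition
    conclusion: tuple   # (variable, value)


def fire(rules, facts):
    """Return the conclusion of the first rule whose conditions all hold."""
    for rule in rules:
        if all(OPS[c.op](facts[c.variable], c.value) for c in rule.conditions):
            return rule.conclusion
    return None


def inconsistent_terms(identified_terms, rules):
    """Compare terms tagged in the page with terms used in structured rules.

    Returns (tagged but unused in rules, used in rules but not tagged).
    """
    in_rules = set()
    for rule in rules:
        in_rules.update(c.variable for c in rule.conditions)
        in_rules.add(rule.conclusion[0])
    tagged = set(identified_terms)
    return tagged - in_rules, in_rules - tagged


# Illustrative rule base for a made-up book store; all numbers are invented.
rules = [
    Rule("free_shipping",
         (Condition("order_total", "GE", 25.0),),
         ("shipping_cost", 0.0)),
    Rule("flat_shipping",
         (Condition("order_total", "LT", 25.0),),
         ("shipping_cost", 3.99)),
]

print(fire(rules, {"order_total": 30.0}))
print(inconsistent_terms(["order_total", "shipping_cost", "delivery_days"], rules))
```

In this sketch, a term that is tagged in the page but absent from every rule (here `delivery_days`) is reported as a candidate for review, mirroring the situation in which a page has been revised but the structured rules have not.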



Author information

Authors and Affiliations

Graduate School of Management, Korea Advanced Institute of Science and Technology, 207-43 Cheongryang, Seoul, 130-012, Korea
Juyoung Kang & Jae Kyu Lee


Editor information

Editors and Affiliations

Fraunhofer FIRST IDA group, Kekuléstr. 7, 12489, Berlin,
Michael Schröder
Cottbus University of Technology,
Gerd Wagner


Cite this paper

Kang, J., Lee, J.K. (2003). Extraction of Structured Rules from Web Pages and Maintenance of Mutual Consistency: XRML Approach. In: Schröder, M., Wagner, G. (eds) Rules and Rule Markup Languages for the Semantic Web. RuleML 2003. Lecture Notes in Computer Science, vol 2876. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39715-1_11


DOI: https://doi.org/10.1007/978-3-540-39715-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20361-2
Online ISBN: 978-3-540-39715-1
eBook Packages: Springer Book Archive
