
An Extraction Method to Get a Municipality Event Information

  • Conference paper
Computational Science and Its Applications – ICCSA 2010 (ICCSA 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6019)


Abstract

The aim of this study is to acquire the information published on event information pages of municipality websites in a machine-processable form. In this paper, we propose a dictionary-based method for extracting that information from an HTML document. The HTML tags are first deleted from the document to convert it into plain text. A target character string is then extracted by comparing the text with a collection of words prepared beforehand. An evaluation experiment was conducted on municipalities in the 23 wards of Tokyo and on 56 municipalities in Chiba Prefecture, Japan. Overall, the proposed method extracted event information at a rate of 73%, compared with 52% for the LR-Wrapper, 55% for the Tree-Wrapper, and 32% for the PLR-Wrapper. These results confirm that combining a simple algorithm with a collection of words extracts event information at a higher rate than the existing methods.
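The two steps described in the abstract (deleting HTML tags to obtain text, then comparing the text with a prepared collection of words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `TextExtractor`, the functions `html_to_lines` and `extract_events`, the sample page, and the plain substring comparison are all assumptions made for illustration, and the paper's actual matching procedure and dictionary are not given in the abstract.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text fragments while discarding tags (hypothetical helper)."""

    SKIPPED = {"script", "style"}  # content of these elements is not page text

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_lines(html):
    """Step 1: delete the HTML tags and return the remaining text fragments."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.parts


def extract_events(html, dictionary):
    """Step 2: keep fragments that contain at least one dictionary word.

    Assumption: simple substring comparison stands in for the paper's
    comparison against its collection of words.
    """
    hits = []
    for line in html_to_lines(html):
        matched = [word for word in dictionary if word in line]
        if matched:
            hits.append((line, matched))
    return hits


if __name__ == "__main__":
    page = (
        "<html><head><style>p{}</style></head><body>"
        "<h1>City News</h1>"
        "<p>Summer festival on August 1</p>"
        "<p>Contact us</p>"
        "</body></html>"
    )
    for text, words in extract_events(page, ["festival", "concert"]):
        print(text, words)
```

Because this kind of matching depends only on the word collection and not on the tag structure of a particular page, the same dictionary can be applied to differently laid-out municipality sites, which is the property the abstract contrasts with wrapper-based approaches.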




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ushioda, T., Fujita, S. (2010). An Extraction Method to Get a Municipality Event Information. In: Taniar, D., Gervasi, O., Murgante, B., Pardede, E., Apduhan, B.O. (eds) Computational Science and Its Applications – ICCSA 2010. ICCSA 2010. Lecture Notes in Computer Science, vol 6019. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12189-0_12


  • DOI: https://doi.org/10.1007/978-3-642-12189-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12188-3

  • Online ISBN: 978-3-642-12189-0

  • eBook Packages: Computer Science, Computer Science (R0)
