Browsing Semi-structured Web texts using Formal Concept Analysis

  • Conference paper
Conceptual Structures: Broadening the Base (ICCS 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2120)

Abstract

Query-directed browsing of unstructured Web texts using Formal Concept Analysis (FCA) confronts two problems. Firstly, online Web data is sometimes unstructured, so any FCA system must include additional mechanisms to structure its input sources. Secondly, many online collections are large and dynamic, so a Web robot must be used to extract data automatically. This paper addresses both issues. We report on the construction of a Web-based FCA system for browsing classified advertisements for real-estate properties.¹ Real-estate advertisements were chosen because they are typical of the semi-structured textual information sources accessible on the Web. Furthermore, the analysis of real-estate data using FCA is a classic example used in introductory courses on FCA. However, unlike the classic FCA real-estate example, whose input is a structured relational database, we automatically mine Web-based texts for their structure.

¹ In FCA, the word property has a meaning similar to attribute. In this paper, property is used only with the meaning of real-estate property, e.g. a house or apartment.
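
To make the two steps named in the abstract concrete (recovering structure from advertisement text, then applying FCA), here is a minimal sketch in Python. It is an illustration only and not the system described in the paper: the advertisement texts, the attribute names, and the regular-expression patterns are all hypothetical, and the concept enumeration is a brute-force closure over object subsets that is practical only for very small contexts.

```python
import re
from itertools import combinations

# Hypothetical advertisement texts (invented for illustration).
ADS = {
    "ad1": "2 br apartment, furnished, close to beach, $180 pw",
    "ad2": "3 br house, garage, close to beach, $250 pw",
    "ad3": "2 br apartment, garage, $170 pw",
}

# Hypothetical attribute patterns; the paper's own extraction method
# is not specified in this abstract.
PATTERNS = {
    "apartment": r"\bapartment\b",
    "house": r"\bhouse\b",
    "garage": r"\bgarage\b",
    "furnished": r"\bfurnished\b",
    "near beach": r"\bclose to beach\b",
}


def build_context(ads, patterns):
    """Map each ad (object) to the set of attributes found in its text."""
    return {
        name: frozenset(
            attr for attr, pat in patterns.items()
            if re.search(pat, text, re.IGNORECASE)
        )
        for name, text in ads.items()
    }


def extent(context, attrs):
    """All objects that have every attribute in attrs."""
    return frozenset(obj for obj, has in context.items() if attrs <= has)


def intent(context, objs, all_attrs):
    """All attributes shared by every object in objs."""
    shared = frozenset(all_attrs)
    for obj in objs:
        shared &= context[obj]
    return shared


def formal_concepts(context, all_attrs):
    """Enumerate (extent, intent) pairs by closing every object subset."""
    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            b = intent(context, subset, all_attrs)
            a = extent(context, b)
            concepts.add((a, b))
    return concepts


if __name__ == "__main__":
    ctx = build_context(ADS, PATTERNS)
    for a, b in sorted(formal_concepts(ctx, PATTERNS),
                       key=lambda c: (len(c[0]), sorted(c[0]))):
        print(sorted(a), sorted(b))
```

Each printed pair is a formal concept: a set of advertisements together with exactly the attributes they share. Browsing then amounts to moving between such concepts, for example from all advertisements near the beach to the subset that is also furnished.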

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cole, R., Eklund, P. (2001). Browsing Semi-structured Web texts using Formal Concept Analysis. In: Delugach, H.S., Stumme, G. (eds) Conceptual Structures: Broadening the Base. ICCS 2001. Lecture Notes in Computer Science, vol 2120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44583-8_23

  • DOI: https://doi.org/10.1007/3-540-44583-8_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42344-7

  • Online ISBN: 978-3-540-44583-8

  • eBook Packages: Springer Book Archive
