
Searching for Complex Patterns over Large Stored Information Repositories

  • Conference paper
Advances in Databases (BNCOD 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7051)

Abstract

Although Information Retrieval (IR) systems, including search engines, have been effective in locating documents that contain specified patterns from large repositories, they support only keyword searches and queries/patterns that use Boolean operators. Expressive search for complex text patterns is important in many domains such as patent search, search on incoming news, and web repositories. In this paper, we first present the operators and their semantics for specifying an expressive search. We then investigate the detection of complex patterns – currently not supported by search engines – using a pre-computed index, and the type of information needed as part of the index to efficiently detect such complex patterns. We use an expressive pattern specification language and a pattern detection graph mechanism that allows sharing of common sub-patterns. Algorithms have been developed for all the pattern operators using the index to detect complex patterns efficiently. Experiments have been performed to illustrate the scalability of the proposed approach, and its efficiency as compared to a streaming approach.
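
To make the role of the pre-computed index concrete, the following is a minimal sketch of a positional inverted index and one proximity-style operator evaluated against it. It is an illustration under stated assumptions, not the paper's algorithm or operator set: the function names `build_index` and `near`, the whitespace tokenization, and the `max_gap` parameter are all hypothetical and do not come from the source.

```python
from collections import defaultdict


def build_index(docs):
    """Build a positional inverted index: term -> {doc_id: [word positions]}.

    Tokenization is a plain lowercase whitespace split (an assumption made
    for brevity; a real system would normalize text more carefully).
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def near(index, a, b, max_gap):
    """Hypothetical proximity operator.

    Returns {doc_id: [(pos_a, pos_b), ...]} for documents in which term `a`
    occurs before term `b` with at most `max_gap` word positions between
    their offsets. Only the index is consulted; documents are not rescanned.
    """
    postings_a = index.get(a, {})
    postings_b = index.get(b, {})
    hits = {}
    # Only documents containing both terms can match.
    for doc_id in postings_a.keys() & postings_b.keys():
        pairs = [
            (pa, pb)
            for pa in postings_a[doc_id]
            for pb in postings_b[doc_id]
            if 0 < pb - pa <= max_gap
        ]
        if pairs:
            hits[doc_id] = pairs
    return hits


if __name__ == "__main__":
    docs = {
        1: "patent search over large repositories",
        2: "search engines index large web repositories",
        3: "large patent",
    }
    idx = build_index(docs)
    print(near(idx, "large", "repositories", 2))
```

The point of the sketch is that a document-level Boolean index could only report that both terms co-occur; storing word positions in the postings is one example of the extra information an index would need for an ordered or proximity pattern to be answered without rereading the documents.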

This work was supported, in part, by the following NSF grants: IIS-0326505, EIA 0216500, and IIS 0534611.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Deshpande, N., Chakravarthy, S., Adaikkalavan, R. (2011). Searching for Complex Patterns over Large Stored Information Repositories. In: Fernandes, A.A.A., Gray, A.J.G., Belhajjame, K. (eds) Advances in Databases. BNCOD 2011. Lecture Notes in Computer Science, vol 7051. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24577-0_8

  • DOI: https://doi.org/10.1007/978-3-642-24577-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24576-3

  • Online ISBN: 978-3-642-24577-0

  • eBook Packages: Computer Science, Computer Science (R0)
