DOI: 10.1145/1947940.1948023
research-article

Cube index for unstructured text analysis and mining

Published: 12 February 2011

ABSTRACT

A cube index model for multidimensional text databases, together with Online Analytical Processing (OLAP) over such data, has been tested experimentally and found to give good results. We previously proposed a cube index model for unstructured text databases, derived from text index structures. The index carries three kinds of hierarchies: the term hierarchy, the dimensional hierarchy, and the document hierarchy, the last of which is proposed in this paper. Two new operations, scroll up and scroll down, are discussed exclusively for the cube index. The implementation, OLAP execution, and query processing on the index are studied. The performance study indicates that the model is suitable for use on unstructured text databases.

Published in

ICCCS '11: Proceedings of the 2011 International Conference on Communication, Computing & Security
February 2011
656 pages
ISBN: 9781450304641
DOI: 10.1145/1947940

Copyright © 2011 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
