ABSTRACT
A cube index model for multidimensional text databases, and Online Analytical Processing (OLAP) over such data, have been evaluated experimentally and found to give good results. We previously proposed a cube index model for unstructured text databases, derived from text index structures. The model has three kinds of hierarchy: the term hierarchy, the dimensional hierarchy, and the document hierarchy, the last of which is proposed in this paper. Two new operations, scroll up and scroll down, are discussed exclusively for the cube index. The implementation, OLAP execution, and query processing on the index are studied. The performance study suggests that the model is well suited to unstructured text databases.
Index Terms
- Cube index for unstructured text analysis and mining