Abstract
This paper proposes a new method for representing document collections as conceptual multidimensional spaces inferred from their contents. These spaces are built from a set of interesting word co-occurrences, which are arranged into taxonomies that define orthogonal hierarchical dimensions. As a result, users can explore and analyze the contents of large document collections by applying well-known OLAP (On-Line Analytical Processing) operators over these spaces.
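To make the idea of hierarchical dimensions and OLAP operators concrete, the following is a minimal sketch, not the authors' method. It assumes each document has already been assigned one leaf concept per dimension; the dimensions (`topic`, `place`), the two-level taxonomies, the sample documents, and the helper names `build_cube`, `roll_up`, and `slice_cube` are all hypothetical and chosen only for illustration. How the paper actually infers dimensions from word co-occurrences is not shown here.

```python
from collections import Counter

# Hypothetical two-level taxonomies (leaf concept -> parent concept).
# These are illustrative assumptions, not data from the paper.
TAXONOMY = {
    "topic": {"soccer": "sport", "tennis": "sport", "election": "politics"},
    "place": {"madrid": "spain", "valencia": "spain", "paris": "france"},
}

# Hypothetical documents, each tagged with one leaf concept per dimension.
DOCS = [
    {"topic": "soccer", "place": "madrid"},
    {"topic": "tennis", "place": "paris"},
    {"topic": "election", "place": "valencia"},
    {"topic": "soccer", "place": "valencia"},
]

DIMS = ("topic", "place")


def build_cube(docs, dims):
    """Count documents per cell; a cell has one coordinate per dimension."""
    return Counter(tuple(doc[dim] for dim in dims) for doc in docs)


def roll_up(cube, dims, dim, taxonomy):
    """Aggregate one dimension from leaf level to its parent level."""
    i = dims.index(dim)
    rolled = Counter()
    for cell, count in cube.items():
        parent = taxonomy[dim][cell[i]]
        rolled[cell[:i] + (parent,) + cell[i + 1:]] += count
    return rolled


def slice_cube(cube, dims, dim, value):
    """Keep only the cells whose coordinate on `dim` equals `value`."""
    i = dims.index(dim)
    return Counter({cell: n for cell, n in cube.items() if cell[i] == value})


cube = build_cube(DOCS, DIMS)
by_topic = roll_up(cube, DIMS, "topic", TAXONOMY)      # topic at parent level
by_both = roll_up(by_topic, DIMS, "place", TAXONOMY)   # both at parent level
spain_only = slice_cube(by_both, DIMS, "place", "spain")

for cell, count in sorted(by_both.items()):
    print(cell, count)
```

Because the dimensions are orthogonal, rolling up one of them leaves the coordinates on the other untouched, and the total document count is preserved at every aggregation level.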
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Danger, R., Berlanga, R. (2006). Inferring Multidimensional Cubes for Representing Conceptual Document Spaces. In: Marín, R., Onaindía, E., Bugarín, A., Santos, J. (eds) Current Topics in Artificial Intelligence. CAEPIA 2005. Lecture Notes in Computer Science, vol 4177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11881216_30
DOI: https://doi.org/10.1007/11881216_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45914-9
Online ISBN: 978-3-540-45915-6
eBook Packages: Computer Science, Computer Science (R0)