ABSTRACT
Structuring information in databases, OLAP cubes, and XML is a crucial part of managing data today. However, this process has brought new usability challenges. Users find it difficult to switch from communicating in natural language to working with data models (e.g., database schemas) that are hard to understand and use, especially for occasional users. This important issue is under intense scrutiny in the database community (e.g., keyword search over databases and query relaxation techniques) and in the information extraction community (e.g., linking structured and unstructured data). However, there is still no comprehensive solution that, with high precision, automatically generates an OLAP (Online Analytical Processing) query and chooses a visualization based on textual content. We present such a method. We discuss how to dynamically generate interpretations of textual content as an OLAP query, select the best visualization, and retrieve the corresponding data on the fly from a data warehouse. To provide the most relevant aggregation results, we consider the user's actual context, as described by a document's content. Moreover, we provide a prototypical implementation of our method, the Text-To-Query system (T2Q), and show how T2Q can be successfully applied to an enterprise scenario as an extension for an office application.
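The pipeline outlined in the abstract (interpret text as an OLAP query, then pick a visualization) can be illustrated with a deliberately small sketch. This is not the T2Q implementation: plain keyword matching stands in for the interpretation step, and the schema, the table name `sales_fact`, and the chart heuristics are all hypothetical assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical toy warehouse schema; every name below is an illustrative assumption.
MEASURES = {"revenue": "SUM(revenue)", "sales": "SUM(units_sold)"}
DIMENSIONS = {"country": "country", "year": "year", "product": "product_name"}
TIME_DIMENSIONS = {"year"}


@dataclass
class Interpretation:
    """One candidate reading of a text as an aggregation query."""
    measures: list
    dimensions: list

    def to_sql(self, fact_table="sales_fact"):
        # Group by every matched dimension and aggregate every matched measure.
        group_cols = [DIMENSIONS[d] for d in self.dimensions]
        select_cols = group_cols + [MEASURES[m] for m in self.measures]
        sql = f"SELECT {', '.join(select_cols)} FROM {fact_table}"
        if group_cols:
            sql += f" GROUP BY {', '.join(group_cols)}"
        return sql


def interpret(text):
    """Match words in the text against schema names (a stand-in for real interpretation)."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    measures = [m for m in MEASURES if m in tokens]
    dimensions = [d for d in DIMENSIONS if d in tokens]
    if not measures:
        return None  # nothing to aggregate, so no query is produced
    return Interpretation(measures, dimensions)


def choose_chart(interp):
    """Pick a chart type from the query shape (simple assumed heuristics)."""
    if any(d in TIME_DIMENSIONS for d in interp.dimensions):
        return "line"
    if len(interp.dimensions) == 1:
        return "bar"
    if not interp.dimensions:
        return "single value"
    return "table"
```

For example, `interpret("Revenue grew in every country last quarter")` matches the measure `revenue` and the dimension `country`, so `to_sql()` yields a grouped aggregate over `country` and `choose_chart` suggests a bar chart. A realistic system would additionally need entity matching against warehouse values and ranking of competing interpretations, both of which this sketch omits.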
Index Terms
- Text-to-query: dynamically building structured analytics to illustrate textual content