DOI: 10.1145/1754239.1754255
research-article

Text-to-query: dynamically building structured analytics to illustrate textual content

Published: 22 March 2010

ABSTRACT

Successfully structuring information in databases, OLAP cubes, and XML is a crucial element of managing data today. However, this process has introduced new usability challenges. It is difficult for users, especially occasional ones, to switch from communicating in natural language to working with data models (e.g., database schemas) that are hard to understand and use. This important issue is under intense scrutiny in the database community (e.g., keyword search over databases and query relaxation techniques) and the information extraction community (e.g., linking structured and unstructured data). However, there is still no comprehensive solution that automatically generates an OLAP (Online Analytical Processing) query and chooses a visualization based on textual content with high precision. We present such a method. We discuss how to dynamically generate interpretations of textual content as an OLAP query, select the best visualization, and retrieve the corresponding data from a data warehouse on the fly. To provide the most relevant aggregation results, we consider the user's actual context, as described by a document's content. Moreover, we provide a prototypical implementation of our method, the Text-To-Query system (T2Q), and show how T2Q can be successfully applied to an enterprise scenario as an extension for an office application.


Published in

EDBT '10: Proceedings of the 2010 EDBT/ICDT Workshops
March 2010
290 pages
ISBN: 9781605589909
DOI: 10.1145/1754239

Copyright © 2010 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 7 of 10 submissions, 70%
