
Exploratory search framework for Web data sources

  • Special Issue Paper
The VLDB Journal

Abstract

Exploratory search is an information-seeking behavior in which users progressively learn about one or more topics of interest. It departs quite radically from traditional keyword-based query paradigms: it combines querying and browsing of resources, and covers activities such as investigating, evaluating, comparing, and synthesizing retrieved information. In most cases, such activities are enabled by a conceptual description of information in terms of entities and their semantic relationships. Customized Web applications, in which a few application entities and their relationships are embedded in the application logic, typically provide some support for exploratory search, but that support is specific to a given domain. In this paper, we describe a general-purpose exploratory search framework, i.e., one that is neutral with respect to the application logic. Our contribution is a formalization of the exploratory search paradigm over Web data sources accessed by means of services; extracted information is described by an entity-relationship schema, which masks the service implementations. Exploratory interaction is supported by a general-purpose user interface that includes a set of widgets for data exploration, from big tables to atomic tables, visual diagrams, and geographic maps; user interaction is translated into queries defined in SeCoQL, a SQL-like language and protocol specifically designed for supporting exploratory search over data sources. We illustrate the software architecture of our prototype, which relies on the interplay of a query and result management system with an orchestrator capable of incrementally building queries and of walking through the past navigation history. The distinctive feature of the framework is its ability to extract top solutions, which combine top-ranked entity instances.
We evaluate exploratory search from the end-user perspective in the context of a cognitive model for search, by studying the user’s behavior and the effectiveness of exploratory search in terms of quality of results produced by the search process; we also compare the effectiveness of interaction in using our multi-domain search system with the use of various replicas of the system, each acting upon a single domain, and with the use of conventional search engines.
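
As a rough illustration of what "top solutions, which combine top-ranked entity instances" could look like, the Python sketch below joins two ranked lists of entity instances and keeps the best-scoring combinations. It is not the framework's algorithm and does not use SeCoQL: the function `top_combinations`, the entity names, the sample data, the same-city join condition, and the additive scoring are all assumptions made for illustration, and the brute-force enumeration stands in for whatever join strategy the actual system uses.

```python
import heapq
from itertools import product


def top_combinations(ranked_lists, joinable, k):
    """Return the k best-scoring combinations, one instance per entity.

    ranked_lists: dict mapping an entity name to a list of (instance, score).
    joinable:     predicate over a dict {entity name: instance}; it plays the
                  role of a relationship between entities (hypothetical).
    k:            number of combinations to keep.
    """
    names = list(ranked_lists)
    candidates = []
    # Brute-force enumeration of every pairing (fine for a tiny example).
    for combo in product(*(ranked_lists[name] for name in names)):
        instances = {name: inst for name, (inst, _) in zip(names, combo)}
        if joinable(instances):
            # Assumed scoring: sum of the individual instance scores.
            total = sum(score for _, score in combo)
            candidates.append((total, instances))
    return heapq.nlargest(k, candidates, key=lambda c: c[0])


# Hypothetical ranked results from two separate data services.
hotels = [
    ({"name": "Hotel A", "city": "Milan"}, 0.9),
    ({"name": "Hotel B", "city": "Rome"}, 0.8),
]
restaurants = [
    ({"name": "Trattoria X", "city": "Rome"}, 0.95),
    ({"name": "Osteria Y", "city": "Milan"}, 0.7),
]


def same_city(instances):
    # Assumed relationship: a hotel and a restaurant in the same city.
    return instances["hotel"]["city"] == instances["restaurant"]["city"]


best = top_combinations(
    {"hotel": hotels, "restaurant": restaurants}, same_city, k=2
)
for total, instances in best:
    print(instances["hotel"]["name"], "+", instances["restaurant"]["name"])
```

In this toy data the individually top-ranked hotel does not appear in the best combination, because the combined score depends on both instances and on the join condition; that is the intuition behind ranking combinations rather than single entities.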


Notes

  1. Although some recent versions of such engines recognize the importance of entity-based search exploration, e.g., see Google Knowledge Graph.

  2. The software described in this section is available for download at www.search-computing.org, both as executable code and as open-source code.


Author information

Corresponding author

Correspondence to Marco Brambilla.

Additional information

This work has been done in the context of the Search Computing (SeCo) research project funded by the European Research Council (ERC) IDEAS Advanced Grants.

About this article

Cite this article

Bozzon, A., Brambilla, M., Ceri, S. et al. Exploratory search framework for Web data sources. The VLDB Journal 22, 641–663 (2013). https://doi.org/10.1007/s00778-013-0326-x

  • DOI: https://doi.org/10.1007/s00778-013-0326-x
