The Layer-Seeds Term Clustering Method: Enabling Proactive Situation-Aware Product Recommendations in E-Commerce Dialogues

Information Systems Frontiers

Abstract

In e-commerce it is often crucial to provide customers with a large choice of relevant offers. Users, however, seldom provide complete and comprehensive descriptions of their desires. User interfaces are therefore needed that can automatically generate expanded queries to the product database and proactively enrich the ongoing dialogue with recommendations of suitable products. Automatic query expansion is mostly based on thesauri and/or user profiles. In e-commerce applications, specific thesauri reflecting the webstore's product categories are desirable. This work describes a method for the automatic construction of a thesaurus based on existing categories of documents. A clustering algorithm, the “Layer-Seeds method”, is introduced, which facilitates the automatic generation of a thesaurus reflecting the specific vocabulary occurring in a given collection of documents. The clustering works on terms extracted from the documents in a certain category and organizes them in a tree-like hierarchical structure, that is, a thesaurus. The thesaurus is then employed for automatic query expansion in an e-commerce application in order to obtain better results for product searching. Experiments provide evidence of a significant increase in user satisfaction.
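
To make the idea of thesaurus-based query expansion concrete, the following is a minimal sketch in Python. It is not the Layer-Seeds method, whose details are not given in the abstract; it only assumes that a tree-like thesaurus is already available and shows one plausible way to expand a query with each term's broader and narrower terms. The class `Thesaurus`, the function `expand_query`, and the sample vocabulary are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


class Thesaurus:
    """Hypothetical tree-like thesaurus: each term has at most one broader term."""

    def __init__(self):
        self.parent = {}                   # term -> broader term
        self.children = defaultdict(list)  # term -> narrower terms

    def add(self, term, broader):
        """Register `term` as a narrower term of `broader`."""
        self.parent[term] = broader
        self.children[broader].append(term)

    def related(self, term):
        """Return the broader term (if any) followed by the narrower terms."""
        out = []
        if term in self.parent:
            out.append(self.parent[term])
        out.extend(self.children.get(term, []))
        return out


def expand_query(query, thesaurus):
    """Append related thesaurus terms to the query terms, without duplicates."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for candidate in thesaurus.related(term):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Illustrative vocabulary (assumed, not taken from the paper).
thesaurus = Thesaurus()
thesaurus.add("sneakers", broader="shoes")
thesaurus.add("boots", broader="shoes")

print(expand_query("red sneakers", thesaurus))
```

In this sketch a term outside the thesaurus (here "red") is passed through unchanged, so the expanded query is always a superset of the original one. A real system would presumably also weight or rank the added terms, which the sketch leaves out.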

Author information

Corresponding author

Correspondence to Libo Chen.

Additional information

This paper is a revision and expansion of the paper “Increasing the Customer's Choice: Query Expansion based on the Layer-Seeds Method and its Application in E-Commerce” for the IEEE International Conference on e-Technology, e-Commerce and e-Service (EEE-04).

Libo Chen is a computer scientist at the FhG-IPSI (Integrated Publication and Information Systems Institute) in Darmstadt, Germany. Within FhG-IPSI, he is now working for the ORION division (Digital Asset Management), which is concerned with knowledge management, thesaurus/ontology construction, information retrieval and intelligent agents. His main research interests include automatic thesaurus construction, query expansion, information extraction and knowledge management. Mr. Libo Chen received his bachelor's degree in Management Information Systems from Tsinghua University in Peking, China in 1996 and his master's degree in Economic Computer Science from the Darmstadt University of Technology, Germany in 2001.

Marcello L'Abbate is a computer scientist who graduated from the Department of Computer Science at the Darmstadt University of Technology in October 2000. His master's thesis, written at the FhG-IPSI (Integrated Publication and Information Systems Institute), dealt with the development of a Query Construction Tool for supporting the generation of structured queries as a basis for a metaphor-based interactive information visualisation system. Within FhG-IPSI, he is now working for the ORION division (Digital Asset Management), which is concerned with knowledge systems, intelligent agents, cooperative user interfaces and dialogue planning. His main research interests include component-based software engineering, XML-based content retrieval and the development of “intelligent” user interfaces.

Ulrich Thiel is a senior researcher at the Fraunhofer Integrated Publication and Information Systems Institute (IPSI) in Darmstadt. Within FhG IPSI (formerly GMD IPSI) he is in the ORION division, which is concerned with metadata applications in publishing processes (e.g. content syndication, single source publishing), intelligent information retrieval, pro-active agent-based user interfaces, and collaboration systems. Currently, he is in charge of the division's competence group “Search and Retrieval Technologies”. Dr. Thiel received his diploma in Computer Science from the University of Dortmund, and his Ph.D. in Information Science from the University of Konstanz. Until 1988 he was a researcher and lecturer at the Information Science department at the University of Konstanz. Since 1990 he has been a researcher and manager of several projects and research groups within Fraunhofer IPSI. His primary research interests are in intelligent multimedia information retrieval, logic-based retrieval mechanisms, intelligent user interfaces, and dialogue and user modelling. He has been coordinator of several European R&D projects (e.g. COGITO) and program committee member of numerous national and international conferences (e.g. ACM SIGIR, ECIR, CIKM, ADL, ECDL).

Erich Neuhold received his M.S. in Electronics and his Ph.D. degree in Mathematics and Computer Science at the Technical University of Vienna, Austria, in 1963 and 1967, respectively. Since 1986 he has been Director of the Institute for Integrated Publication and Information Systems (IPSI) in Darmstadt, Germany (a former Institute of the German National Research Center for Information Technology - GMD, since July 2001 a Fraunhofer Institute). He is a member of many professional societies, an IEEE senior member, and currently chairs the IEEE-CS Technical Committee on Data Engineering. Dr. Neuhold is also Professor of Computer Science, Integrated Publication and Information Systems, at the Darmstadt University of Technology, Germany. His primary research and development interests are in heterogeneous multimedia database systems in Peer-to-Peer and GRID environments, Web technologies and persistent information and knowledge repositories and content engineering. In content engineering special emphasis is given to all technological aspects of the publishing value chain that arise for digital products in the Web context. Search, access and delivery include semantic-based retrieval of multimedia documents. He also guides research and development in user interfaces including virtual reality concepts for information visualization, computer supported cooperative work, ambient intelligence, mobile and wireless technology, security in the Web and applications like e-learning, e-commerce, e-culture and e-government.

About this article

Cite this article

Chen, L., L'Abbate, M., Thiel, U. et al. The Layer-Seeds Term Clustering Method: Enabling Proactive Situation-Aware Product Recommendations in E-Commerce Dialogues. Inf Syst Front 7, 405–419 (2005). https://doi.org/10.1007/s10796-005-4811-7

  • DOI: https://doi.org/10.1007/s10796-005-4811-7