Elsevier

Computers in Industry

Volume 131, October 2021, 103497
An evaluation and annotation methodology for product category matching in e-commerce

https://doi.org/10.1016/j.compind.2021.103497

Highlights

  • Product category matching is an important task in digital marketplaces and e-commerce.

  • This paper motivates, describes and formalizes the problem of product category matching.

  • The paper also presents a rigorously designed methodology and guidelines for acquiring reliable and cost-effective annotations for this task.

  • The utility of all methods presented is validated on three real-world e-commerce taxonomies.

Abstract

Product category matching is an important task in digital marketplaces and e-commerce, helping to power better search and recommendations in an online context. While variants of the problem have received some attention in academia, there is no documented guidance on how to efficiently acquire annotations for evaluating multiple (current and future) models, many of which rely on modern machine learning techniques such as neural representation learning. In this paper, we motivate and formalize the problem of product category matching in e-commerce, and present a rigorously designed set of guidelines and methodology for acquiring annotations in a cost-effective and reliable manner. We also present a methodology for using the annotations to compare solutions of two or more product category matching methods, including comparing models both before and after annotation. Three widely used e-commerce product category taxonomies, and multiple metrics, are used to demonstrate the utility of our proposals.

Introduction

The last decade has witnessed the rapid rise of e-commerce, including not only e-commerce marketplaces and platforms (such as eBay and Amazon) but also the adoption of e-commerce technologies by traditional retailers like Walmart and Target (Krishnamurthy, 2004; Chaffey, 2007; Hänninen et al., 2018; Mandel, 2017). In online marketplaces, e-commerce platforms, and even related media (e.g., product reviews and influencer blogs), product category matching between two independent webpages or platforms is a practical problem for users, advertisers and aggregators of information. While we define the problem formally in Section 2, Fig. 1(a) illustrates an intuitive example. One website (Walmart) may be talking about ‘runner rugs’, while another (Target) refers to the same concept as ‘runners’. The burden is on the user either to find different mentions of the same product category through more intensive search (e.g., by posing different keywords in a search engine) or to be limited to the results that show up as relevant for a specific search phrase, even though a better product (described using a different phrase) or a better price may be available elsewhere.

In our own experience, we have found that there are several practical reasons why product category matching is an important problem, especially for media companies relying on advertising dollars. One reason is that when users are browsing media websites, including blog and product review sites, they may be exposed to a particular product that they would either like to immediately purchase, or research further for a future purchase. Linking the product category mentioned in the media post to retailers’ product webpages, many of whom may not refer to it in the same way (as illustrated above in the Walmart-Target example), is clearly valuable.

A key aspect of the problem is its domain-specific nature. In the e-commerce domain, product categories are arranged in a taxonomy, and along with the product category label, the ‘path’ leading to it from the root of the taxonomy is also an important structured attribute. A fragment of this taxonomy is illustrated in Fig. 1(b) for both websites. We formally define a taxonomy in Section 2. Because of this structure, classic ‘unstructured’ solutions such as ‘string matching’ were found to be too noisy to be useful even in preliminary experiments (Ukkonen, 1985; Navarro, 2001). Instead, we hypothesize that techniques that take both the label and the path into account may be more successful in determining when two concepts match, compared with solutions that rely only on the label. Through experimental results, we show that a method that takes the structure of the taxonomy into account when matching concepts between two taxonomies indeed performs better than one that only takes labels into account. Specific contributions are enumerated below:

  • First, we formalize and define the problem of product category matching, especially as it applies to the structured version of the problem that we intuitively described through the Walmart-Target example above.

  • Second, we present a rigorous set of guidelines for acquiring annotations for the product category matching problem in an efficient, cost-effective and reliable manner. Following an experimental study, we also present feedback from actual annotators that may allow further customization and task-specific refinement of these guidelines in other enterprises.

  • Third, we present a methodology for using the acquired annotations to evaluate two or more candidate solutions for product category matching. Within e-commerce, such evaluations have been conducted either behind closed doors, or using task-specific measures and datasets that may not have validity beyond that particular enterprise. In contrast, we present a clear and replicable description of our experimental findings and evaluation methodology.

  • Finally, using three widely used e-commerce product category taxonomies, we conduct an experimental study to evaluate two candidate solutions inspired by recent progress in representation learning. We also use the acquired annotations to evaluate models that may be developed and proposed after annotations have been collected.
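The hypothesis stated before this list, that taking the path into account helps where the label alone does not, can be illustrated with a small sketch. This is not the method evaluated in the paper: simple token overlap (Jaccard similarity) stands in for learned representations, and the `Concept` class, the `path_weight` parameter and the category paths are hypothetical, chosen only to mirror the ‘runner rugs’ versus ‘runners’ example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A product category: its label plus the path from the taxonomy root."""
    label: str
    path: tuple  # ancestor labels, root first (hypothetical example data)


def tokens(text):
    """Lower-cased whitespace tokens of a string, as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_score(query, candidate):
    """Similarity computed from the labels only."""
    return jaccard(tokens(query.label), tokens(candidate.label))


def path_aware_score(query, candidate, path_weight=0.5):
    """Weighted mix of label similarity and path similarity.

    path_weight is an illustrative assumption, not a value from the paper.
    """
    q_path = tokens(" ".join(query.path))
    c_path = tokens(" ".join(candidate.path))
    path_sim = jaccard(q_path, c_path)
    return (1 - path_weight) * label_score(query, candidate) + path_weight * path_sim


query = Concept("Runner Rugs", ("Home", "Rugs"))
candidates = [
    Concept("Runners", ("Home", "Rugs")),      # the intended match
    Concept("Runners", ("Sports", "Shoes")),   # same label, different concept
]

label_only = [label_score(query, c) for c in candidates]
path_aware = [path_aware_score(query, c) for c in candidates]
```

With these toy inputs, the label-only score cannot separate the two candidates because their labels are identical, whereas the path-aware score ranks the candidate under the rug-related path higher.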

Section snippets

Problem definition and research goals

The most fine-grained unit under consideration in this article is that of a concept. Concepts are fundamental components of ontologies (Fensel, 2001), and are equivalently described as types or collections of instances. However, because of the domain-specific nature of this article, we assume a less abstract definition of concepts as product categories, defined below.

Definition (Product Category): A product category is defined as an attribute of a product, representing its type. Every product

E-commerce taxonomies

We consider three taxonomies as the primary materials in this paper: Google Product Taxonomy (GPT), PriceGrabber and Walmart. Key statistics are provided in Table 1. The GPT is a list of thousands of product categories designed by Google to uniformly categorize products in a shopping feed. It is publicly available and has undergone some updates in recent years. We use the latest version for the experiments.
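As a rough illustration of how such a taxonomy can be turned into (label, path) pairs, the sketch below assumes that each entry is written as a full path whose levels are separated by ` > `, as in the plain-text export of the GPT. The function name `parse_taxonomy` and the sample lines are illustrative assumptions; other taxonomies may use a different layout.

```python
def parse_taxonomy(lines, sep=" > "):
    """Split each full category path into (label, ancestors).

    Assumes one category per line, written from the root downwards and
    separated by `sep`. Blank lines and lines starting with '#' are skipped.
    """
    concepts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = [part.strip() for part in line.split(sep)]
        # The last element is the category label; the rest is its path.
        concepts.append((parts[-1], tuple(parts[:-1])))
    return concepts


# Illustrative sample lines, not an excerpt checked against a specific version.
sample = [
    "# header comment",
    "Animals & Pet Supplies",
    "Animals & Pet Supplies > Pet Supplies",
    "Animals & Pet Supplies > Pet Supplies > Bird Supplies",
]

concepts = parse_taxonomy(sample)
```

Keeping the ancestors as a tuple preserves their order from the root, so the same structure can feed either a label-only or a path-aware matcher.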

Annotation task construction and guidelines

In this section, we describe both the annotation guidelines and how the pre-annotation models are used to generate a set of candidate concept-pairs that are then annotated using a 4-point scale (excellent, good, fair and bad). In Section 3.1 we mentioned that there are nine pairs of query-response datasets and two pre-annotation models: L1 (Pre.) and L2 (Retro.). In keeping with the terminology introduced in Section 2, we refer to the ‘source’ taxonomy from which the queries are issued

Findings

Using the proposed methodology, data and language representation models, we obtained a total of 4101 annotations from 25 unique editors, with each annotation comprising a ‘label’ from the 4-point scale expressed in the guidelines. For each label, we compute a standard set of statistics, expressing it as a box plot in Fig. 2. We find that the range broadens as the quality implied by the label worsens. As we will discuss shortly, the occurrence of ‘bad’ labels primarily stems from Pre. best-match

Annotator feedback and discussion

Earlier, we detailed the annotation guidelines and the methodology for constructing the nine annotation tasks (each representing a ‘dataset pair’). Following the annotations, we sent a survey to each annotator to obtain feedback on the task, since, as an annotation exercise, it is relatively novel compared to other such exercises in the Web and AI literature (such as rating a webpage as relevant in response to a query).

In response to the post-annotation question, what did

Related work

The product category matching problem is related to several strands of research, as described below.

Product Recommendations and E-Commerce. The primary application domain in this paper was e-commerce. Recently, there has been an enormous growth in the e-commerce research literature in several computational communities (Park and Chu, 2009; Xiao and Benbasat, 2007; Goy et al., 2007; Ito et al., 2002). Unsurprisingly, the HCI community is no stranger to this domain, and even beyond the research

Conclusion and future work

Product category matching is an important problem that shows up in various guises in online marketplaces and digital commerce. In this paper, we described and formalized the problem, while presenting rigorous methodological solutions for addressing the problem in different contexts. We presented and described a key set of annotation guidelines both for acquiring annotations efficiently, and for using the annotations to conduct evaluations and analyses along multiple dimensions. Using three

Author statement

Nicolas Torzec: Conceptualization, Methodology, Supervision.

Chien-Chun Ni: Data curation, Methodology.

Ke Shen: Software, Visualization, Investigation, Validation, Writing – Reviewing and Editing.

Mayank Kejriwal: Writing – Original draft preparation, Conceptualization, Methodology, Supervision, Writing – Reviewing and Editing.

Conflicts of interest

The authors declare no conflicts of interest.

Declaration of Competing Interest

The authors report no declarations of interest.

References (59)

  • L. Chen et al.

    Recommender systems based on user reviews: the state of the art

    User Model. User-Adapt. Interact.

    (2015)
  • J. Devlin et al.

    Bert: Pre-Training of Deep Bidirectional Transformers for Language Understanding

    (2018)
  • X.L. Dong

    Challenges and innovations in building a product knowledge graph

    Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

    (2018)
  • M. Faruqui et al.

    Retrofitting Word Vectors to Semantic Lexicons

    (2014)
  • D. Fensel

    Ontologies

    Ontologies

    (2001)
  • T. Fountain et al.

    Taxonomy induction using hierarchical random graphs

    Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

    (2012)
  • A. Goy et al.

    Personalization in e-commerce applications

    The Adaptive Web

    (2007)
  • N.N. Group

    Ecommerce User Experience

    (2020)
  • A. Gupta et al.

    Taxonomy induction using hypernym subsequences

    Proceedings of the 2017 ACM on Conference on Information and Knowledge Management

    (2017)
  • M. Hänninen et al.

    Digitalization in retailing: multi-sided platforms as drivers of industry transformation

    Balt. J. Manag.

    (2018)
  • D. Harman

    Information retrieval evaluation

    Synth. Lect. Inf. Concepts Retr. Serv.

    (2011)
  • J. Hollander, M. Schlesinger, Shared annotation system and method, US Patent App. 10/936,788 (May 24...
  • T. Ito et al.

    A group-buy protocol based on coalition formation for agent-mediated e-commerce

    IJCIS

    (2002)
  • A. Joulin et al.

    Bag of Tricks for Efficient Text Classification

    (2016)
  • N. Kim et al.

    A study on the law2vec model for searching related law

    J. Digit. Contents Soc.

    (2017)
  • Y.S. Kim

    Recommender system based on product taxonomy in e-commerce sites

    J. Inf. Sci. Eng.

    (2013)
  • B.P. Knijnenburg et al.

    Explaining the user experience of recommender systems

    User Model. User-Adapt. Interact.

    (2012)
  • S. Krishnamurthy

    A comparative analysis of ebay and amazon

    Intelligent Enterprises of the 21st Century

    (2004)
  • X.N. Lam et al.

    Addressing cold-start problem in recommendation systems

    Proceedings of the 2nd International Conference on Ubiquitous Information Management and Communication

    (2008)