ABSTRACT
In e-commerce, product classification is widely used for various purposes. Misclassifying products can cause compliance issues and hurt the company's reputation. To address this problem, we propose an automated system that proactively detects product misclassifications. A large e-commerce retailer can sell billions of distinct products, on which many thousands of classification tasks are performed. At this massive scale, misclassifications must be detected quickly and under a limited budget. In this talk, we point out these challenges and show how we design our system to handle them. When evaluated on a set of Amazon's product classification data, at an overhead of less than 10% of the classification cost, our system automatically identified and corrected many misclassifications; without it, a human would need many thousands of years to find them manually and 14.6 years to review and correct them.
Index Terms
- Proactive and Automatic Detection of Product Misclassifications at Massive Scale