Improving Co-Cluster Quality with Application to Product Recommendations

Authors: Michail Vlachos, Francesco Fusco, Charalambos Mavroforakis, Anastasios Kyrillidis, Vassilios G. Vassiliadis

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management

Pages 679 - 688

https://doi.org/10.1145/2661829.2661980

Published: 03 November 2014

Abstract

Businesses store an ever-increasing amount of historical customer sales data. Given the availability of such information, it is advantageous to analyze past sales, both to reveal dominant buying patterns and to provide more targeted recommendations to clients. In this context, co-clustering has proved to be an important data-modeling primitive for revealing latent connections between two sets of entities, such as customers and products.

In this work, we introduce a new algorithm for co-clustering that is both scalable and highly resilient to noise. Our method is inspired by k-Means and agglomerative hierarchical clustering approaches: it (i) first searches for elementary co-clustering structures and (ii) then combines them into a better, more compact solution. The algorithm is flexible, as it does not require an explicit number of co-clusters as input, and it is directly applicable to large data graphs. We apply our methodology to real sales data to analyze and visualize the connections between clients and products. We showcase a real deployment of the system and show how it has been used to drive a recommendation engine. Finally, we demonstrate that the new methodology can discover co-clusters of better quality and relevance than state-of-the-art co-clustering techniques.
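The two-phase idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give; it is a minimal illustration under stated assumptions. Phase (i) is assumed to be plain k-means run separately on the rows and columns of a binary customer-by-product matrix, with deterministic farthest-first seeding. Phase (ii) is assumed to be a greedy agglomerative merge that joins two co-clusters whenever the merged block stays dense. The function names, the density criterion, and the threshold `tau` are all hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means; returns one cluster label per point.
    Seeding is deterministic farthest-first (an assumption of this sketch)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def density(matrix, rows, cols):
    """Fraction of ones in the block rows x cols."""
    return sum(matrix[r][c] for r in rows for c in cols) / (len(rows) * len(cols))

def elementary_coclusters(matrix, k_rows, k_cols, tau):
    """Phase (i), assumed: over-segment rows and columns, keep dense blocks."""
    row_labels = kmeans(matrix, k_rows)
    col_labels = kmeans([list(col) for col in zip(*matrix)], k_cols)
    blocks = []
    for a in range(k_rows):
        rows = frozenset(i for i, l in enumerate(row_labels) if l == a)
        for b in range(k_cols):
            cols = frozenset(j for j, l in enumerate(col_labels) if l == b)
            if rows and cols and density(matrix, rows, cols) >= tau:
                blocks.append((rows, cols))
    return blocks

def merge_coclusters(matrix, blocks, tau):
    """Phase (ii), assumed: repeatedly merge the pair whose union is densest,
    stopping when no merged block reaches density tau."""
    blocks = list(blocks)
    while True:
        best, best_pair = tau, None
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                rows = blocks[i][0] | blocks[j][0]
                cols = blocks[i][1] | blocks[j][1]
                d = density(matrix, rows, cols)
                if d >= best:
                    best, best_pair = d, (i, j, (rows, cols))
        if best_pair is None:
            return blocks
        i, j, merged = best_pair
        blocks = [b for n, b in enumerate(blocks) if n not in (i, j)] + [merged]

# Toy customer-by-product matrix: one 4x4 block with two missing entries
# (rows 2 and 3, column 3) and one clean 2x2 block.
matrix = [
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
coclusters = merge_coclusters(matrix, elementary_coclusters(matrix, 3, 3, 0.8), 0.8)
```

In this sketch the final number of co-clusters is not specified in advance: `k_rows` and `k_cols` only control how finely the first phase over-segments, and the merge phase decides how many co-clusters survive. On the toy matrix, the first phase yields four elementary blocks, which the merge phase combines into two co-clusters. The zero cells left inside a merged dense block are one plausible source of recommendation candidates, though the abstract does not say how the deployed system derives them.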

Index Terms

Improving Co-Cluster Quality with Application to Product Recommendations
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Clustering and classification
  2. Information systems applications
    1. Data mining
      1. Clustering

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management

November 2014, 2152 pages

ISBN: 9781450325981

DOI: 10.1145/2661829

General Chairs: Jianzhong Li (Harbin Inst. of Technology), X. Sean Wang (Fudan University)

Program Chairs: Minos Garofalakis (Technical University of Crete, Greece), Ian Soboroff (National Institute of Standards, USA), Torsten Suel (New York University, USA), Min Wang (Google Research, USA)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Seventh Framework Programme

Conference

CIKM '14: 2014 ACM Conference on Information and Knowledge Management

November 3 - 7, 2014

Shanghai, China

Acceptance Rates

CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

Melchiorre A, Rekabsaz N, Ganhör C, Schedl M (2022). ProtoMF: Prototype-based Matrix Factorization for Effective and Explainable Recommendations. Proceedings of the 16th ACM Conference on Recommender Systems, pages 246-256. Online publication date: 12-Sep-2022.
https://dl.acm.org/doi/10.1145/3523227.3546756
Zhu Y, Li B, Segarra S (2021). Co-clustering Vertices and Hyperedges via Spectral Hypergraph Partitioning. 2021 29th European Signal Processing Conference (EUSIPCO), pages 1416-1420. Online publication date: 23-Aug-2021.
https://doi.org/10.23919/EUSIPCO54536.2021.9616223
Ali W, Kumar R, Deng Z, Wang Y, Shao J (2021). A Federated Learning Approach for Privacy Protection in Context-Aware Recommender Systems. The Computer Journal, 64:7, pages 1016-1027. Online publication date: 30-Apr-2021.
https://doi.org/10.1093/comjnl/bxab025
Costa A, D'Addio R, Fressato E, Manzato M, dos Santos J, Muchaluat Saade D, da Graça C. Pimentel M, Macedo A (2019). A personalized clustering-based approach using open linked data for search space reduction in recommender systems. Proceedings of the 25th Brazilian Symposium on Multimedia and the Web, pages 409-416. Online publication date: 29-Oct-2019.
https://dl.acm.org/doi/10.1145/3323503.3349543
Vlachos M, Dunner C, Heckel R, Vassiliadis V, Parnell T, Atasu K (2019). Addressing Interpretability and Cold-Start in Matrix Factorization for Recommender Systems. IEEE Transactions on Knowledge and Data Engineering, 31:7, pages 1253-1266. Online publication date: 1-Jul-2019.
https://doi.org/10.1109/TKDE.2018.2829521
Heckel R, Vlachos M, Parnell T, Duenner C (2017). Scalable and Interpretable Product Recommendations via Overlapping Co-Clustering. 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pages 1033-1044. Online publication date: Apr-2017.
https://doi.org/10.1109/ICDE.2017.149
Nasiri M, Minaei B, Kiani A (2016). Dynamic Recommendation: Disease Prediction and Prevention Using Recommender System. International journal of basic science in medicine, 1:1, pages 13-17. Online publication date: 29-Jun-2016.
https://doi.org/10.15171/ijbsm.2016.04
Araujo M, Ribeiro P, Faloutsos C (2016). FastStep: Scalable Boolean Matrix Decomposition. Advances in Knowledge Discovery and Data Mining, pages 461-473. Online publication date: 12-Apr-2016.
https://doi.org/10.1007/978-3-319-31753-3_37
Nasiri M, Sharifi Z, Minaei B (2015). Alleviate sparsity problem using hybrid model based on spectral co-clustering and tensor factorization. 2015 5th International Conference on Computer and Knowledge Engineering (ICCKE), pages 285-289. Online publication date: Oct-2015.
https://doi.org/10.1109/ICCKE.2015.7365843
