ABSTRACT
Similarity search is an important problem for the payment industry, which holds large volumes of user-merchant interaction data. It identifies merchants similar to a given merchant and supports tasks such as peer-set generation, recommendation, community detection, and anomaly detection. Recent works have shown that, by leveraging interaction data, Graph Neural Networks (GNNs) can generate node embeddings for entities such as merchants, and that these embeddings can then be used for similarity-search tasks. However, most real-world financial data come with high-cardinality categorical features such as city, industry, and super-industry, which are fed to GNNs as one-hot encodings. Current GNN algorithms are not designed for such sparse features, which makes it difficult for them to learn embeddings that preserve this information. In this work, we propose CaPE, a Category Preserving Embedding generation method that preserves high-cardinality feature information in the embeddings. We have also designed CaPE to preserve other important numerical feature information. We compare CaPE with recent GNN-based embedding generation methods to showcase its superiority in peer-set generation tasks on real-world datasets, both external and internal (synthetically generated). We also compare our method on a downstream task, link prediction.
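To make the similarity-search setting concrete, the following is a minimal sketch, not the CaPE method itself. It assumes merchant embeddings are already available as plain vectors and builds a peer set by ranking other merchants by cosine similarity. It also shows how a categorical value such as city becomes a one-hot vector, which grows sparser as the number of categories increases. All names (`one_hot`, `cosine`, `peer_set`) and the toy data are hypothetical and chosen only for illustration.

```python
import math


def one_hot(value, vocabulary):
    """Encode `value` as a one-hot list over an ordered `vocabulary`."""
    return [1.0 if v == value else 0.0 for v in vocabulary]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def peer_set(query, embeddings, k):
    """Return the k merchants most similar to `query`, excluding `query` itself."""
    scored = [
        (cosine(embeddings[query], vec), name)
        for name, vec in embeddings.items()
        if name != query
    ]
    # Highest similarity first; ties are broken by merchant name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical toy data: four cities and four merchant embeddings.
cities = ["austin", "boston", "chicago", "denver"]
embeddings = {
    "m1": [0.9, 0.1, 0.0],
    "m2": [0.8, 0.2, 0.1],
    "m3": [0.0, 0.1, 0.9],
    "m4": [0.1, 0.9, 0.2],
}

print(one_hot("chicago", cities))
print(peer_set("m1", embeddings, 2))
```

With thousands of cities or industries, each one-hot vector would contain a single non-zero entry among thousands of zeros, which is the sparsity the abstract refers to. At production scale, an exhaustive scan like `peer_set` would typically be replaced by an approximate nearest-neighbor index.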
Index Terms
- CaPE: Category Preserving Embeddings for Similarity-Search in Financial Graphs