DOI: 10.1145/3534678.3539118

Design Domain Specific Neural Network via Symbolic Testing

Published: 14 August 2022

Abstract

Deep sequence networks such as multi-head self-attention networks provide a promising way to extract effective representations from raw sequence data in an end-to-end fashion and have shown great success in various domains such as natural language processing and computer vision. However, in domains such as financial risk management and anti-fraud, where expert-derived features are heavily relied on, deep sequence models struggle to dominate the game. In this paper, we introduce a simple framework called symbolic testing to verify the learnability of certain expert-derived features over sequence data. A systematic investigation over simulated data reveals that the self-attention architecture fails to learn some standard symbolic expressions, such as the count distinct operation. To overcome this deficiency, we propose a novel architecture named SHORING, which contains two components: an event network and a sequence network. The event network efficiently learns arbitrary high-order event-level conditional embeddings via a reparameterization trick, while the sequence network integrates domain-specific aggregations into the sequence-level representation, thereby providing richer inductive biases compared to standard sequence architectures such as self-attention. We conduct comprehensive experiments and ablation studies on synthetic datasets that mimic sequence data commonly seen in the anti-fraud domain, as well as on three real-world datasets. The results show that SHORING learns commonly used symbolic features well and experimentally outperforms state-of-the-art methods by a significant margin on real-world online transaction datasets. The symbolic testing framework and SHORING have been applied in anti-fraud model development at Alipay and have improved the performance of real-time fraud-detection models.
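To illustrate the symbolic-testing setup described above, the sketch below generates simulated event sequences whose label is a symbolic expression (here, count distinct) and would let one check whether a given sequence architecture can regress that label. This is a hypothetical toy reconstruction, not the authors' code; the function name and parameters are invented for illustration.

```python
import random

def make_count_distinct_dataset(n_seqs=1000, seq_len=20, vocab=50, seed=0):
    """Toy symbolic-testing task: each input is a sequence of categorical
    event types; the label is the number of distinct types it contains
    (the 'count distinct' aggregation the abstract reports self-attention
    fails to learn)."""
    rng = random.Random(seed)
    X = [[rng.randrange(vocab) for _ in range(seq_len)] for _ in range(n_seqs)]
    y = [len(set(seq)) for seq in X]
    return X, y

X, y = make_count_distinct_dataset()
```

An architecture passes this symbolic test if, after training on (X, y), its held-out error on the count-distinct label is near zero; the paper's investigation applies the same recipe to other expert-derived aggregations.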

Supplemental Material

MP4 File
Presentation video of the paper "Design Domain Specific Neural Network via Symbolic Testing"



Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN:9781450393850
DOI:10.1145/3534678

Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. anti-fraud
  2. conditional sequence model
  3. high-order interaction
  4. inductive bias
  5. neural networks
  6. symbolic learning

Qualifiers

  • Research-article

Conference

KDD '22

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)


