Abstract
Can a deep neural network memorize a database? Deep artificial neural networks are known for a large memory capacity that makes it possible to fit any dataset, yet memorizing a database is a novel learning task: unlike other popular tasks, which intrinsically model mappings, it requires the network to “memorize” information internally. We answer the question positively by showing that, through training with maximal/minimal and frequent/infrequent patterns of a transactional database, a dynamically constructed deep net can answer random itemset support queries with relatively high precision relative to the data compression ratio. Because the memorization is compressive, the query time cost of our method does not depend on the number of transactions in the database. We further discuss a potential interpretation of the learnt database representation by analyzing the corresponding statistical features of the database and the activation patterns of the neural network.
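To make the terminology concrete, the following minimal sketch shows what an exact itemset support query and a maximal frequent pattern look like on a toy database. It is not the method of this paper: the database `transactions`, the helper names `support`, `frequent_itemsets`, and `maximal`, and the threshold are all hypothetical, and the brute-force scan only illustrates the baseline whose query cost grows with the number of transactions.

```python
from itertools import combinations

# Hypothetical toy transactional database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]


def support(itemset, db):
    """Exact support: fraction of transactions that contain every item.

    This scans the whole database, so its cost grows with len(db).
    """
    items = frozenset(itemset)
    return sum(1 for t in db if items <= t) / len(db)


def frequent_itemsets(db, min_support):
    """Brute-force enumeration of all itemsets with support >= min_support."""
    universe = sorted(set().union(*db))
    result = {}
    for k in range(1, len(universe) + 1):
        for combo in combinations(universe, k):
            s = support(combo, db)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result


def maximal(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [x for x in frequent if not any(x < y for y in frequent)]


if __name__ == "__main__":
    print(support({"a", "b"}, transactions))
    print(maximal(frequent_itemsets(transactions, min_support=0.5)))
```

The exponential enumeration in `frequent_itemsets` is acceptable only for a toy example. A learned, compressed representation as described in the abstract would replace the scan in `support` with a forward pass whose cost does not depend on the number of transactions.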
Notes
- 1. Zero-support (1-infrequent) itemsets are truncated due to excessive cardinality.
Acknowledgments
This work was supported by JST CREST Grant Number JPMJCR1304 and JSPS KAKENHI Grant Numbers JP16H01836 and JP16K12428.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ji, Y., Ohsawa, Y. (2017). Memorizing Transactional Databases Compressively in Deep Neural Networks for Efficient Itemset Support Queries. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10635. Springer, Cham. https://doi.org/10.1007/978-3-319-70096-0_54
DOI: https://doi.org/10.1007/978-3-319-70096-0_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70095-3
Online ISBN: 978-3-319-70096-0
eBook Packages: Computer Science (R0)