DOI: 10.1145/3529399.3529410
research-article

Transferring Learnt Features from Deep Neural Networks trained on Structured Data

Published: 10 June 2022

ABSTRACT

Structured data is a widely used type of data with numerous applications in training machine learning models. However, training deep learning models requires a large amount of data, which may not be available for every use case, and training these models can become expensive as the data grows. Transfer learning can address both problems: it reuses features from models trained on the same or similar tasks, but it has not yet been explored much for structured data. In this paper, an approach is proposed to transfer learnt features from the embedding layers commonly present in deep neural networks for structured data, along with a format for effective portability of these trained embeddings. Experimentally, it is observed that the proposed method results in faster training, and the model parameters start at a better point than those of a randomly initialized model, reducing training costs as well.

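As a rough illustration of the idea, the following is a minimal PyTorch sketch (not the authors' code) of transferring learnt embedding tables for categorical columns: the source model's embeddings are exported together with the category vocabularies they were trained on, and matching rows are copied into a freshly initialized target model so that training starts from the transferred features rather than from random weights. The class and function names (`TabularNet`, `export_embeddings`, `load_embeddings`) and the dictionary-based export layout are assumptions made here for illustration, not the portability format proposed in the paper.

```python
# Hypothetical sketch of embedding transfer for structured data (not the authors' code).
# Assumes the source and target models use the same embedding dimension per column.
import torch
import torch.nn as nn


class TabularNet(nn.Module):
    """A typical deep net for structured data: one embedding table per categorical column."""

    def __init__(self, emb_specs, n_continuous, n_classes):
        # emb_specs: {column_name: (num_categories, embedding_dim)}
        super().__init__()
        self.embeddings = nn.ModuleDict(
            {col: nn.Embedding(card, dim) for col, (card, dim) in emb_specs.items()}
        )
        emb_out = sum(dim for _, dim in emb_specs.values())
        self.mlp = nn.Sequential(
            nn.Linear(emb_out + n_continuous, 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, x_cat, x_cont):
        # x_cat: {column_name: LongTensor of category indices}, x_cont: FloatTensor
        embs = [self.embeddings[col](x_cat[col]) for col in self.embeddings]
        return self.mlp(torch.cat(embs + [x_cont], dim=1))


def export_embeddings(model, vocabularies):
    """Package trained embeddings with each column's category vocabulary,
    so a target model can match rows by category value rather than by index."""
    return {
        col: {"vocab": list(vocabularies[col]), "weight": emb.weight.detach().cpu().clone()}
        for col, emb in model.embeddings.items()
    }


def load_embeddings(model, exported, vocabularies):
    """Copy pretrained rows into a new model wherever category values overlap;
    categories unseen by the source model keep their random initialization."""
    with torch.no_grad():
        for col, pack in exported.items():
            if col not in model.embeddings:
                continue
            src_index = {cat: i for i, cat in enumerate(pack["vocab"])}
            for j, cat in enumerate(vocabularies[col]):
                if cat in src_index:
                    model.embeddings[col].weight[j] = pack["weight"][src_index[cat]]
```

Under this sketch, the exported dictionary could be saved with torch.save and shipped alongside the column vocabularies, which is roughly what a portable embedding format needs to capture: the mapping from category values to learnt vectors, independent of the index order used during training.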

Published in

ICMLT '22: Proceedings of the 2022 7th International Conference on Machine Learning Technologies
March 2022, 291 pages
ISBN: 9781450395748
DOI: 10.1145/3529399
Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

