DOI: 10.1145/3383923.3383942
research-article

Automated Machine Learning with Genetic Programming on Real Dataset of Tax Avoidance Classification Problem

Published: 23 April 2020

ABSTRACT

Real application datasets are often a stumbling block for machine learning algorithms, making it hard to produce good results on prediction or classification problems. Dataset imbalance is the major reason for this problem, along with missing values, small data size, and highly skewed data distributions. This paper presents an empirical study that used an Automated Machine Learning (AML) tool based on Genetic Programming (GP), named AML TPOT. It is a very recent AML tool developed as an open source Python library and reported as a promising model by the few researchers who have tested the algorithm. Nevertheless, most work on AML TPOT has been conducted on common or benchmark machine learning datasets. In this paper, the focus is on a real and deviant dataset collected on tax avoidance among Government-Linked Companies in Malaysia. A comparison of AML performance on the dataset under different GP parameter settings is provided. The paper thus offers fundamental knowledge on the experimental design and findings that will be useful for future improvement of GP-based AML.
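
As a rough illustration of what comparing GP parameter settings could look like in practice, the sketch below enumerates a small grid of settings and fits one TPOT classifier per setting. It is not the authors' experimental design: the helper names `gp_settings` and `compare_settings`, every parameter value, the 75/25 split, and the use of balanced accuracy as an imbalance-aware score are assumptions made for illustration. It also assumes the classic TPOT 0.x `TPOTClassifier` interface and scikit-learn are installed.

```python
from itertools import product


def gp_settings(generations=(5, 10), population_sizes=(20, 50),
                rate_pairs=((0.9, 0.1), (0.5, 0.5))):
    """Enumerate hypothetical GP settings (all values are illustrative)."""
    settings = []
    for gens, pop, (mutation, crossover) in product(
            generations, population_sizes, rate_pairs):
        settings.append({
            "generations": gens,
            "population_size": pop,
            # TPOT rejects settings where the two rates sum to more than 1.0.
            "mutation_rate": mutation,
            "crossover_rate": crossover,
        })
    return settings


def compare_settings(X, y, settings, seed=42):
    """Fit one TPOTClassifier per setting; return (setting, held-out score)."""
    # Imported lazily so gp_settings() works without TPOT installed.
    from sklearn.metrics import balanced_accuracy_score
    from sklearn.model_selection import train_test_split
    from tpot import TPOTClassifier

    # Stratify so the minority class appears in both splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    results = []
    for setting in settings:
        model = TPOTClassifier(scoring="balanced_accuracy", cv=5,
                               random_state=seed, verbosity=0, **setting)
        model.fit(X_train, y_train)
        score = balanced_accuracy_score(y_test, model.predict(X_test))
        results.append((setting, score))
    return results
```

Calling `compare_settings(X, y, gp_settings())` on a feature matrix and label vector would return one score per setting. Balanced accuracy is used here only because plain accuracy can look misleadingly high on an imbalanced dataset; the metric reported in the paper may differ.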


    • Published in

      ICEIT 2020: Proceedings of the 2020 9th International Conference on Educational and Information Technology
      February 2020
      268 pages
      ISBN: 9781450375085
      DOI: 10.1145/3383923

      Copyright © 2020 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited