Tax Underreporting Detection Using an Unsupervised Learning Approach

  • Conference paper
Advances in Soft Computing (MICAI 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 15247)

Abstract

Governmental administrative domains can benefit from a wide variety of currently available big data analysis methods. Tax administration is one such area: it requires massive data processing to identify hidden patterns and trends that may indicate tax evasion. Supervised methods can be effective in these cases, but the lack of labeled data limits their practical application in real-world scenarios. An alternative is to use unsupervised methods, which do not require labeled data and are therefore considered feasible as a decision support tool in tax evasion risk management systems. This paper proposes an unsupervised approach to identify signs of tax evasion by detecting possible tax underreporting. The proposed strategy is evaluated on a data set of individual income tax statistics from the United States. The results are considered useful for decision-making and for preventive action on cases flagged as suspicious.

Author information

Corresponding author

Correspondence to Vitali Herrera-Semenets.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Herrera-Semenets, V., Bustio-Martínez, L., González-Ordiano, J.Á., van den Berg, J. (2025). Tax Underreporting Detection Using an Unsupervised Learning Approach. In: Martínez-Villaseñor, L., Ochoa-Ruiz, G. (eds) Advances in Soft Computing. MICAI 2024. Lecture Notes in Computer Science, vol 15247. Springer, Cham. https://doi.org/10.1007/978-3-031-75543-9_2

  • DOI: https://doi.org/10.1007/978-3-031-75543-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-75542-2

  • Online ISBN: 978-3-031-75543-9

  • eBook Packages: Computer Science, Computer Science (R0)
