Autism spectrum disorder detection with kNN imputer and machine learning classifiers via questionnaire mode of screening

  • Research
Health Information Science and Systems

Abstract

Autism spectrum disorder (ASD) is a neurodevelopmental disorder. ASD cannot be fully cured, but early diagnosis followed by therapy and rehabilitation helps an autistic person live a quality life. Clinical diagnosis of ASD symptoms via questionnaires and screening tests such as the Autism Spectrum Quotient-10 (AQ-10) and the Quantitative Checklist for Autism in Toddlers (Q-CHAT) is expensive, inaccessible, and time-consuming. Machine learning (ML) techniques can help predict ASD easily at the initial stage of diagnosis. The main aim of this work is to classify ASD and typically developed (TD) class data using ML classifiers. We use ASD datasets covering all age groups (toddlers, children, adolescents, and adults) to classify ASD and TD cases. During preprocessing, we apply one-hot encoding to convert categorical data into numerical data. We then use a kNN imputer with MinMaxScaler feature transformation to handle missing values and normalize the data. ASD and TD class data are classified using support vector machine (SVM), k-nearest-neighbor (KNN), random forest (RF), and artificial neural network (ANN) classifiers. RF gives the best performance, with an accuracy of 100% across different training and testing data splits for all four datasets, and shows no over-fitting. We also compare our results with previously published work, including recent methods such as deep neural networks (DNN) and convolutional neural networks (CNN). Our proposed low-complexity models provide the best results even when compared with complex architectures such as DNN and CNN; existing methods have shown accuracy of up to 98% with log-loss of up to 15%. Our proposed methodology demonstrates improved generalization for real-time ASD detection during clinical trials.
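
The preprocessing steps named in the abstract (one-hot encoding, kNN imputation of missing values, min-max normalization) can be illustrated with a small, dependency-free sketch. This is not the authors' code: the toy records, the feature names (`age`, `score`, `sex`), the choice `k=2`, and the order of imputation before scaling are all assumptions made for illustration. The distance used here skips missing coordinates and rescales by the share of observed features, which is one common way to compare incomplete rows.

```python
import math

def one_hot(values):
    """Encode category labels as one-hot rows; columns follow sorted label order."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return rows, categories

def nan_distance(a, b):
    """Euclidean distance over features observed in both rows, rescaled to the
    full feature count. Returns infinity when no feature is shared."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def knn_impute(rows, k=2):
    """Fill each missing entry (None) with the mean of that feature over the
    k nearest rows in which the feature is observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (nan_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:  # leave the gap if no row has this feature observed
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled

def min_max_scale(rows):
    """Rescale every column to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical toy records (not taken from the paper's datasets).
records = [
    {"age": 4.0, "score": 7.0, "sex": "m"},
    {"age": 5.0, "score": None, "sex": "f"},
    {"age": None, "score": 3.0, "sex": "f"},
    {"age": 12.0, "score": 2.0, "sex": "m"},
]

encoded, categories = one_hot([r["sex"] for r in records])
rows = [[r["age"], r["score"]] + e for r, e in zip(records, encoded)]
filled = knn_impute(rows, k=2)   # missing values replaced by neighbor means
scaled = min_max_scale(filled)   # every feature now lies in [0, 1]

for row in scaled:
    print(row)
```

In practice, scikit-learn's `OneHotEncoder`, `KNNImputer`, and `MinMaxScaler` cover these steps, and the scaled matrix would then be passed to a classifier such as a random forest. Because kNN distances are sensitive to feature ranges, whether to scale before or after imputation is a design choice that the abstract does not specify.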

Data availability

The datasets used in this paper are available in references [14,15,16,17] and in the Questionnaire Datasets section.

References

  1. World Health Organization. Autism spectrum disorders. World Health Organization; 2022. Last checked on 26 July 2022.

  2. Heinsfeld AS, Franco AR, Craddock RC, Buchweitz A, Meneguzzi F. Identification of autism spectrum disorder using deep learning and the ABIDE dataset. NeuroImage Clin. 2018;17:16–23. https://doi.org/10.1016/j.nicl.2017.08.017.

  3. Azer SA, Bokhari RA, AlSaleh GS, Alabdulaaly MM, Ateeq KI, Guerrero APS, Azer S. Experience of parents of children with autism on YouTube: are there educationally useful videos? Inform Health Soc Care. 2018;43(3):219–33. https://doi.org/10.1080/17538157.2018.1431238.

  4. Franz L, Adewumi K, Chambers N, Viljoen M, Baumgartner JN, De Vries PJ. Providing early detection and early intervention for autism spectrum disorder in South Africa: stakeholder perspectives from the Western Cape province. J Child Adolesc Mental Health. 2018;30(3):149–65.

  5. Pagnozzi AM, Conti E, Calderoni S, Fripp J, Rose SE. A systematic review of structural MRI biomarkers in autism spectrum disorder: a machine learning perspective. Int J Dev Neurosci. 2018;71:68–82. https://doi.org/10.1016/j.ijdevneu.2018.08.010.

  6. Kosmicki J, Sochat V, Duda M, Wall D. Searching for a minimal set of behaviors for autism detection through feature selection-based machine learning. Transl Psychiatry. 2015;5(2):514–514. https://doi.org/10.1038/tp.2015.7.

  7. Bone D, Bishop SL, Black MP, Goodwin MS, Lord C, Narayanan SS. Use of machine learning to improve autism screening and diagnostic instruments: effectiveness, efficiency, and multi-instrument fusion. J Child Psychol Psychiatry. 2016;57(8):927–37. https://doi.org/10.1111/jcpp.12559.

  8. Satu MS, Sathi FF, Arifen MS, Ali MH, Moni MA. Early detection of autism by extracting features: a case study in Bangladesh. In: 2019 International conference on robotics, electrical and signal processing techniques (ICREST), pp. 400–405 (2019). IEEE. https://doi.org/10.1109/ICREST.2019.8644357

  9. Jumaa N, Salman A, Al-Hamdani D. The autism spectrum disorder diagnosis based on machine learning techniques. J Xian Univ Architect Technol. 2020;12:575–83.

  10. Mujeeb Rahman K, Monica Subashini M. A deep neural network-based model for screening autism spectrum disorder using the quantitative checklist for autism in toddlers (Q-CHAT). J Autism Dev Disord. 2022;52(6):2732–46. https://doi.org/10.1007/s10803-021-05141-2.

  11. Thabtah F, Spencer R, Abdelhamid N, Kamalov F, Wentzel C, Ye Y, Dayara T. Autism screening: an unsupervised machine learning approach. Health Inform Sci Syst. 2022;10(1):26.

  12. Allison C, Auyeung B, Baron-Cohen S. Toward brief red flags for autism screening: the short autism spectrum quotient and the short quantitative checklist in 1,000 cases and 3,000 controls. J Am Acad Child Adolesc Psychiatry. 2012;51(2):202–12. https://doi.org/10.1016/j.jaac.2011.11.003.

  13. Thabtah F, Kamalov F, Rajab K. A new computational intelligence approach to detect autistic features for autism screening. Int J Med Inform. 2018;117:112–24. https://doi.org/10.1016/j.ijmedinf.2018.06.009.

  14. Thabtah F. Autism screening data for toddlers. Kaggle; 2018. Last checked on 26 July 2022.

  15. Thabtah F. Autistic spectrum disorder screening data for children data set. University of California, Irvine, School of Information and Computer Sciences; 2017. Last checked on 26 July 2022.

  16. Thabtah FF. Autistic spectrum disorder screening data for adolescent data set. University of California, Irvine, School of Information and Computer Sciences; 2017. Last checked on 26 July 2022.

  17. Thabtah FF. Autism screening adult data set. University of California, Irvine, School of Information and Computer Sciences; 2017. Last checked on 26 July 2022.

  18. Kumar CJ, Das PR. The diagnosis of ASD using multiple machine learning techniques. Int J Dev Disabil. 2021. https://doi.org/10.1080/20473869.2021.1933730.

  19. Musa RA, Manaa ME, Abdul-Majeed G. Predicting autism spectrum disorder (ASD) for toddlers and children using data mining techniques. J Phys: Conf Ser. 2021;1804:012089. https://doi.org/10.1088/1742-6596/1804/1/012089.

  20. Erkan U, Thanh DN. Autism spectrum disorder detection with machine learning methods. Curr Psychiatry Res Rev Form: Curr Psychiatry Rev. 2019;15(4):297–308. https://doi.org/10.2174/2666082215666191111121115.

  21. Vaishali R, Sasikala R. A machine learning based approach to classify autism with optimum behaviour sets. Int J Eng Technol. 2018;7(4):18.

  22. Raj S, Masood S. Analysis and detection of autism spectrum disorder using machine learning techniques. Procedia Comput Sci. 2020;167:994–1004. https://doi.org/10.1016/j.procs.2020.03.399.

  23. Mohan P, Paramasivam I. Feature reduction using SVM-RFE technique to detect autism spectrum disorder. Evol Intel. 2021;14(2):989–97. https://doi.org/10.1007/s12065-020-00498-2.

  24. Vakadkar K, Purkayastha D, Krishnan D. Detection of autism spectrum disorder in children using machine learning techniques. SN Comput Sci. 2021;2(5):1–9. https://doi.org/10.1007/s42979-021-00776-5.

  25. Omar KS, Mondal P, Khan NS, Rizvi MRK, Islam MN. A machine learning approach to predict autism spectrum disorder. In: 2019 International conference on electrical, computer and communication engineering (ECCE), pp. 1–6 (2019). IEEE. https://doi.org/10.1109/ECACE.2019.8679454

  26. Thabtah F, Peebles D. A new machine learning model based on induction of rules for autism detection. Health Inform J. 2020;26(1):264–86. https://doi.org/10.1177/1460458218824711.

  27. Akter T, Satu MS, Khan MI, Ali MH, Uddin S, Lio P, Quinn JM, Moni MA. Machine learning-based models for early stage detection of autism spectrum disorders. IEEE Access. 2019;7:166509–27. https://doi.org/10.1109/ACCESS.2019.2952609.

  28. Mohanty AS, Parida P, Patra K. Identification of autism spectrum disorder using deep neural network. J Phys: Conf Ser. 2021;1921:012006. https://doi.org/10.1088/1742-6596/1921/1/012006.

  29. Biessmann F, Rukat T, Schmidt P, Naidu P, Schelter S, Taptunov A, Lange D, Salinas D. Datawig: missing value imputation for tables. J Mach Learn Res. 2019;20(175):1–6.

  30. Zhang S. Nearest neighbor selection for iteratively kNN imputation. J Syst Softw. 2012;85(11):2541–52. https://doi.org/10.1016/j.jss.2012.05.073.

  31. Breiman L. Bagging predictors. Mach Learn. 1996;24(2):123–40. https://doi.org/10.1007/BF00058655.

  32. Altman NS. An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat. 1992;46(3):175–85. https://doi.org/10.1080/00031305.1992.10475879.

  33. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97. https://doi.org/10.1007/BF00994018.

  34. Anggoro DA, Novitaningrum D. Comparison of accuracy level of support vector machine (SVM) and artificial neural network (ANN) algorithms in predicting diabetes mellitus disease. ICIC Express Lett. 2021;15(1):9–18.

Acknowledgements

The authors are grateful to the Ministry of Education and Indian Institute of Information Technology, Allahabad for supplying the necessary materials required for completing this work.

Author information

Corresponding author

Correspondence to Trapti Shrivastava.

Ethics declarations

Conflict of interest

The authors have no competing interests relevant to the content of this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

See Tables 13, 14, 15, 16.

Table 13 Dataset features and missing values
Table 14 Features description of toddler dataset
Table 15 Training and testing data size with various splits
Table 16 Common features description of the toddler, children, adolescent and adult datasets

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shrivastava, T., Singh, V. & Agrawal, A. Autism spectrum disorder detection with kNN imputer and machine learning classifiers via questionnaire mode of screening. Health Inf Sci Syst 12, 18 (2024). https://doi.org/10.1007/s13755-024-00277-8

  • DOI: https://doi.org/10.1007/s13755-024-00277-8
