
An effective text mining framework using adaptive principle component analysis

Multimedia Tools and Applications

Abstract

In data mining, medical health records can serve as a rich source of knowledge. Because large volumes of medical data are available, useful facts can be extracted from them with a range of data mining approaches. Much research has been conducted in medical data mining, but the field still faces issues such as long computation times, inaccurate results and high computational complexity. To overcome these issues, a new approach is proposed in this work. Raw medical records are taken as the input and preprocessed by parsing, stemming, stop word removal and part-of-speech (POS) tagging; the POS tagging helps identify the medical terms in each sentence. From the preprocessed data, useful information is extracted by applying association rule mining with a genetic Principal Component Analysis algorithm. The best association rules are then selected using Adaptive Principal Component Analysis and returned as the result. The performance of the proposed methodology is evaluated and compared with existing techniques. For the AWFR, JRR and EHR datasets, the precision obtained is 0.94, 0.948 and 0.95 respectively, compared with 0.90 and 0.94 for the integral and SDM-3NC approaches respectively. This indicates that the proposed approach outperforms the other techniques.
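The preprocessing and rule extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the genetic and adaptive Principal Component Analysis stages are not reproduced, and rules are filtered by plain support and confidence thresholds instead. The stop word list, the `crude_stem` suffix stripper (a stand-in for a real stemmer), the sample records and all function names are assumptions made for illustration. POS tagging is omitted because it would need an external tagger.

```python
import re
from itertools import combinations

# Hypothetical stop word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "with", "was", "is", "has", "for", "to", "in"}


def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(record):
    """Parse a record into lowercase tokens, drop stop words, stem, and deduplicate."""
    tokens = re.findall(r"[a-z]+", record.lower())
    return {crude_stem(t) for t in tokens if t not in STOP_WORDS}


def mine_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return single-antecedent rules x -> y as (x, y, support, confidence) tuples."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        # Support: fraction of records that contain both terms.
        both = sum(1 for t in transactions if a in t and b in t)
        if both / n < min_support:
            continue
        # Confidence: of the records containing x, the fraction also containing y.
        for x, y in ((a, b), (b, a)):
            count_x = sum(1 for t in transactions if x in t)
            confidence = both / count_x
            if confidence >= min_confidence:
                rules.append((x, y, both / n, confidence))
    return rules


# Made-up sample records, for illustration only.
records = [
    "Patient has diabetes and hypertension",
    "Diabetes with hypertension was treated",
    "Patient has asthma",
    "Hypertension and diabetes in the patient",
]
transactions = [preprocess(r) for r in records]
rules = mine_rules(transactions)
for x, y, support, confidence in rules:
    print(f"{x} -> {y} (support={support:.2f}, confidence={confidence:.2f})")
```

Each record is reduced to a set of stemmed terms, so a rule such as `diabet -> hypertension` states how often the two terms co-occur across records. A rule selection stage like the one the abstract describes would then rank or prune this candidate list.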

Data availability

• Not Applicable.

Code availability

• Not Applicable.

Funding

• This research work was not funded by any organization/institute/agency.

Author information

Contributions

I, Dr. K. Kala, hereby state that the manuscript entitled “An Effective Text Mining Framework Using Adaptive Principle Component Analysis”, submitted to Multimedia Tools and Applications, is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere. I am Head and Associate Professor in the Department of Computer Science at Nachiappa Swamigal Arts and Science College, Karaikudi, Koviloor, India.

Corresponding author

Correspondence to K. Kala.

Ethics declarations

Conflict of interest

• I confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere.

Competing interests

• None of the authors have any competing interests in the manuscript.

Consent to participate

I confirm that any participants (or their guardians if unable to give informed consent, or next of kin, if deceased) who may be identifiable through the manuscript (such as a case report), have been given an opportunity to review the final manuscript and have provided written consent to publish.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kala, K. An effective text mining framework using adaptive principle component analysis. Multimed Tools Appl 81, 44467–44485 (2022). https://doi.org/10.1007/s11042-022-13285-1

  • DOI: https://doi.org/10.1007/s11042-022-13285-1
