Coal mining accident causes classification using voting-based hybrid classifier (VHC)

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Labor safety in the workplace is a critical human rights concern in every industry around the world. Coal mines are considered among the most dangerous workplaces, and every year thousands of miners worldwide die or are severely injured in mining accidents. To design effective technology-based accident mitigation plans for such work environments, an analysis of the causes of these accidents is of great value. This study contributes to the coal mining domain by proposing a machine learning approach to identify the causes of accidents. The approach uses a dataset of textual descriptions of the causes of past coal mine accidents. We preprocess the text to clean it and then extract features with the term frequency-inverse document frequency (TF-IDF) technique to train the machine learning models. The study proposes the voting-based hybrid classifier (VHC), which combines three individual machine learning models (random forest, support vector classifier, and logistic regression) using the soft voting criterion. The model is evaluated in terms of accuracy, precision, recall, and F1 score. VHC outperforms all other state-of-the-art models, achieving the highest accuracy score of 0.96.
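The two techniques named in the abstract, TF-IDF weighting and soft voting, can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the sample narratives, class labels, and per-model probabilities are hypothetical, and the unsmoothed idf formula ln(N / df) is one common variant chosen here as an assumption.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: relative term frequency times idf = ln(N / df).
    Library implementations usually add smoothing and normalization.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def soft_vote(probabilities, labels):
    """Average class probabilities across models and return the top class."""
    n_models = len(probabilities)
    averaged = [
        sum(p[i] for p in probabilities) / n_models for i in range(len(labels))
    ]
    best = max(range(len(labels)), key=lambda i: averaged[i])
    return labels[best], averaged


# Hypothetical accident narratives, already lowercased and tokenized.
docs = [
    ["roof", "fall", "in", "tunnel"],
    ["gas", "explosion", "in", "tunnel"],
    ["roof", "collapse"],
]
weights = tfidf(docs)

# Hypothetical class probabilities for one report from the three base models.
labels = ["roof fall", "gas explosion"]
model_probs = [
    [0.60, 0.40],  # random forest
    [0.45, 0.55],  # support vector classifier
    [0.70, 0.30],  # logistic regression
]
predicted, averaged = soft_vote(model_probs, labels)
print(predicted, [round(p, 3) for p in averaged])
```

In this sketch the support vector classifier alone would pick "gas explosion", but averaging the probabilities lets the two more confident models decide. That is the difference between soft voting and a simple majority vote over predicted labels.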

Data Availability

The dataset used in the experiments is publicly available on Kaggle at https://www.kaggle.com/furqanrustam118/coal-minin-datase.

Acknowledgements

This research was supported by the Florida Center for Advanced Analytics and Data Science, funded by Ernesto.Net (under the Algorithms for Good Grant).

Author information

Corresponding author

Correspondence to Furqan Rustam.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Javaid, A., Siddique, M.A., Reshi, A.A. et al. Coal mining accident causes classification using voting-based hybrid classifier (VHC). J Ambient Intell Human Comput 14, 13211–13221 (2023). https://doi.org/10.1007/s12652-022-03779-z

  • DOI: https://doi.org/10.1007/s12652-022-03779-z
