Genetic algorithm based neural classifiers for factor subset extraction

Soft Computing

Abstract

A clear understanding of risk factors is important for developing appropriate prevention and control strategies for infections caused by pathogens such as Salmonella Typhimurium. This study considers 91 risk factors that contribute nonlinearly to Salmonella Typhimurium infection, many of which are not significant. It is therefore important to extract automatically a subset containing only the important risk factors. This paper proposes a genetic algorithm for factor subset extraction, used in conjunction with neural and statistical classifiers, to classify case and control status in Salmonella Typhimurium infection. The results show that the proposed approach finds an appropriate factor subset and that the proposed neural classifiers outperform the traditional statistical classifiers. A statistical analysis, conducted by varying the parameters of the genetic algorithm based neural classifier, is used to minimise the prediction error and determine the optimal system configuration.
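The general idea of genetic algorithm based factor subset extraction can be illustrated with a minimal sketch. The abstract does not give the encoding, operators, or parameter values used in the paper, so everything below is an assumption made for illustration: each chromosome is a bit mask over the factors, fitness is hold-out accuracy minus a small penalty per selected factor, and a nearest-centroid classifier stands in for the neural classifiers to keep the example short. The data are synthetic and unrelated to the study's 91 risk factors; the names `make_data`, `centroid_accuracy`, and `ga_select` are hypothetical.

```python
import random

def make_data(n_samples=120, n_factors=12, informative=(0, 1, 2), seed=0):
    """Synthetic case/control data: only the `informative` factors depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate control (0) and case (1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_factors)]
        for j in informative:
            row[j] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Hold-out accuracy of a nearest-centroid classifier on the factors selected by `mask`."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_te, y_te):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c])) for c in (0, 1)}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_te)

def ga_select(X, y, pop_size=20, generations=15, p_mut=0.05, penalty=0.01, seed=1):
    """Search for a factor subset with a simple generational GA (assumed settings)."""
    rng = random.Random(seed)
    n = len(X[0])
    split = len(X) // 2
    X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]

    def fitness(mask):
        # Reward accuracy, lightly penalise large subsets.
        return centroid_accuracy(X_tr, y_tr, X_te, y_te, mask) - penalty * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = [pop[0][:], pop[1][:]]  # elitism: carry over the two best masks
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)                 # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    X, y = make_data()
    best, score = ga_select(X, y)
    print("selected factors:", [j for j, bit in enumerate(best) if bit])
    print("penalised hold-out accuracy:", round(score, 3))
```

Because the same hold-out split both guides the search and scores the result, the reported accuracy is optimistic; a separate test set or cross-validation would be needed for an honest error estimate.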

Author information

Corresponding author

Correspondence to Simon X. Yang.

About this article

Cite this article

Qin, L., Yang, S.X., Pollari, F. et al. Genetic algorithm based neural classifiers for factor subset extraction. Soft Comput 12, 623–632 (2008). https://doi.org/10.1007/s00500-007-0248-x

  • DOI: https://doi.org/10.1007/s00500-007-0248-x
