Abstract
Most current machine learning applications in cheminformatics are black boxes. Support vector machines and neural networks are the most widely used classification techniques for predicting the mutagenic activity of compounds. The problem with these techniques is that the rules or reasons behind a prediction are unknown. Such rules could reveal the most important features (descriptors) of the compounds and the relations among them. This article proposes a model that uses rough set theory to generate the rules that govern prediction. These rules, which are based on two levels of selection of the features with high discriminating power, are visualized as a lattice generated using formal concept analysis. In this way, a better understanding of the reasons that lead to mutagenic activity can be obtained. The resulting lattice shows that lipophilicity, number of nitrogen atoms, and electronegativity are the most important parameters in mutagenicity detection. Moreover, the experimental results are compared against previous research to validate the proposed model.
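To make the lattice idea concrete, the following is a minimal sketch of how formal concepts can be enumerated from a binary object-attribute table. It is not the authors' implementation: the compound identifiers (`c1` to `c4`), the attribute names (`high_logP`, `many_N`, `mutagenic`), and the table itself are hypothetical toy values chosen only for illustration, and the brute-force enumeration over object subsets is practical only for very small contexts.

```python
from itertools import combinations

# Hypothetical toy context (NOT data from the paper):
# each compound maps to the set of binary attributes it has.
context = {
    "c1": {"high_logP", "many_N", "mutagenic"},
    "c2": {"high_logP", "mutagenic"},
    "c3": {"many_N"},
    "c4": {"high_logP", "many_N", "mutagenic"},
}


def common_attributes(objects, context, all_attrs):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(all_attrs)
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attrs, context):
    """Objects that possess every attribute in `attrs`."""
    return {obj for obj, owned in context.items() if attrs <= owned}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Brute force: exponential in the number of objects, so only suitable
    for tiny illustrative contexts.
    """
    all_attrs = set().union(*context.values())
    objs = sorted(context)
    concepts = set()
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context, all_attrs)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair is one node of the concept lattice: a maximal group of compounds together with exactly the attributes they all share. In this toy table, every compound with `high_logP` is also `mutagenic`, which is the kind of attribute relation a lattice makes visible.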
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Salama, M.A., Fouad, M.M.M., El-Bendary, N., Hassanien, A.E.O. (2014). Mutagenicity Analysis Based on Rough Set Theory and Formal Concept Analysis. In: Thampi, S., Abraham, A., Pal, S., Rodriguez, J. (eds) Recent Advances in Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 235. Springer, Cham. https://doi.org/10.1007/978-3-319-01778-5_27
DOI: https://doi.org/10.1007/978-3-319-01778-5_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01777-8
Online ISBN: 978-3-319-01778-5
eBook Packages: Engineering, Engineering (R0)