ABSTRACT
The number of breast cancer cases worldwide has risen sharply over the past 20 years. In 2000 there were 1.05 million cases; by 2018 the number had reached 2.09 million, an increase of 99.05 percent, with an average annual growth rate of more than 5 percent. Breast cancer has gradually become the most common cancer among women, so the development of anti-breast cancer drugs has become a hot topic in the medical field. In this paper, a series of compound descriptors and the corresponding biological activity data were collected for estrogen receptor α (ERα), a target associated with breast cancer, and a model was established to predict compound activity. The more important features were first selected from the compound descriptors. A quantitative structure-activity relationship (QSAR) model of the compounds was then constructed, with the Gradient Boosting Decision Tree (GBDT) algorithm used to make the predictions. Finally, a comparative analysis of the results showed that the method is fast and accurate. Such a model can play a key role in drug research.
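To illustrate the general idea behind gradient boosting with decision trees, here is a minimal sketch for regression. It is not the authors' implementation: the data are synthetic stand-ins for descriptors and activity values, the trees are depth-1 stumps, and every name and parameter value (`fit_stump`, `fit_gbdt`, `n_rounds`, `learning_rate`) is an assumption made for illustration.

```python
import random

def fit_stump(X, residuals):
    """Fit a depth-1 regression tree: the one split that minimizes squared error."""
    best = None  # (sse, feature index, threshold, left mean, right mean)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def fit_gbdt(X, y, n_rounds=40, learning_rate=0.1):
    """Squared-loss gradient boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)  # initial prediction is the mean target
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate

def predict_gbdt(model, row):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)

def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic stand-in data (hypothetical): 4 "descriptors", 80 "compounds".
# Only descriptors 0 and 2 actually influence the target.
random.seed(0)
X = [[random.random() for _ in range(4)] for _ in range(80)]
y = [3 * row[0] - 2 * row[2] + random.gauss(0, 0.1) for row in X]

model = fit_gbdt(X, y)
baseline_mse = mean_squared_error(y, [sum(y) / len(y)] * len(y))
boosted_mse = mean_squared_error(y, [predict_gbdt(model, row) for row in X])

# Crude feature ranking: how often each descriptor was chosen for a split.
split_counts = [0] * 4
for j, _, _, _ in model[1]:
    split_counts[j] += 1

print("baseline MSE:", baseline_mse)
print("boosted MSE:", boosted_mse)
print("split counts per descriptor:", split_counts)
```

Both errors here are measured on the training data, so the comparison only shows that boosting reduces the fitting error; a real study would evaluate on held-out compounds and would normally use an established library rather than hand-written stumps.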
Index Terms
- Prediction of anti-breast cancer compound activity based on Gradient Boosting Decision Tree ensemble learning