Differential Feature Recognition of Breast Cancer Patients Based on Minimum Spanning Tree Clustering and F-statistics

Xie, Juanying; Li, Ying; Zhou, Ying; Wang, Mingzhao

doi:10.1007/978-3-319-48335-1_21


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10038)

Included in the following conference series:

International Conference on Health Information Science

Abstract

This paper presents a differential feature recognition algorithm for breast cancer patients based on minimum spanning tree (MST) clustering and F-statistics. The algorithm clusters the features of breast cancer data with an MST clustering algorithm and uses F-statistics to determine the proper number of feature clusters. The features most relevant to the class labels are selected from each feature cluster to form the differential features. The samples, described by the recognized features, are then clustered with the MST clustering algorithm. The algorithm is validated by its clustering accuracy on the WDBC breast cancer dataset. In the experiments, correlations between features and class labels and similarities between features are measured by the cosine similarity and the Pearson correlation coefficient. Similarities between samples are measured by the cosine similarity, the Euclidean distance and the Pearson correlation coefficient. The experimental results show that the highest clustering accuracy is obtained when the cosine similarity measures correlations between features and class labels and similarities between features, while the Euclidean distance measures similarities between samples. The recognized features are mean radius, mean fractal dimension and standard error of fractal dimension.
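
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine_similarity`, `mst_edges`, `mst_clusters`, `select_features`), the conversion of similarity to distance as one minus the cosine similarity, the use of the absolute cosine similarity as the relevance score, and the choice of exactly one feature per cluster are all assumptions made for illustration. The F-statistic step is not reproduced, so the number of clusters `k` is simply passed in, and the data are a toy example rather than WDBC.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mst_edges(dist):
    """Prim's algorithm on a dense symmetric distance matrix.

    Returns the n - 1 tree edges as (weight, i, j) tuples.
    """
    n = len(dist)
    in_tree = [False] * n
    in_tree[0] = True
    # best[x] = (cheapest known weight from x to the tree, tree endpoint)
    best = [(dist[0][x], 0) for x in range(n)]
    edges = []
    for _ in range(n - 1):
        j = min((x for x in range(n) if not in_tree[x]), key=lambda x: best[x][0])
        edges.append((best[j][0], best[j][1], j))
        in_tree[j] = True
        for x in range(n):
            if not in_tree[x] and dist[j][x] < best[x][0]:
                best[x] = (dist[j][x], j)
    return edges


def mst_clusters(dist, k):
    """Cut the k - 1 heaviest MST edges; return one cluster label per node."""
    n = len(dist)
    kept = sorted(mst_edges(dist))[: n - k]  # the n - k lightest tree edges
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(x), len(roots)) for x in range(n)]


def select_features(columns, labels, k):
    """Assumed rule: keep, per feature cluster, the feature most similar to the labels."""
    n = len(columns)
    # Assumed distance: 1 - cosine similarity between feature columns.
    dist = [[1.0 - cosine_similarity(columns[i], columns[j]) for j in range(n)]
            for i in range(n)]
    chosen = {}
    for f, c in enumerate(mst_clusters(dist, k)):
        score = abs(cosine_similarity(columns[f], labels))
        if c not in chosen or score > chosen[c][0]:
            chosen[c] = (score, f)
    return sorted(f for _, f in chosen.values())


# Toy data (hypothetical): four feature columns over four samples.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.1, 8.0],  # nearly proportional to feature 0
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.1, 2.0],  # nearly proportional to feature 2
]
labels = [0.0, 0.0, 1.0, 1.0]
selected = select_features(columns, labels, k=2)
print(selected)
```

Because `mst_clusters` accepts any symmetric distance matrix, the same function could be reused for the sample clustering step, for example with a Euclidean distance matrix computed over the selected features.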

Acknowledgements

We are much obliged to those who share the datasets in the UCI Machine Learning Repository. This work is supported in part by the National Natural Science Foundation of China under Grant No. 61673251, by the Key Science and Technology Program of Shaanxi Province of China under Grant No. 2013K12-03-24, by the Fundamental Research Funds for the Central Universities under Grant Nos. GK201503067 and 2016CSY009, and by the Innovation Funds of Graduate Programs at Shaanxi Normal University under Grant No. 2015CXS028.

Author information

Authors and Affiliations

School of Computer Science, Shaanxi Normal University, Xi’an, 710062, People’s Republic of China
Juanying Xie, Ying Li, Ying Zhou & Mingzhao Wang

Corresponding author

Correspondence to Juanying Xie.

Editor information

Editors and Affiliations

Victoria University, Melbourne, Australia
Xiaoxia Yin
Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA
James Geller
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Ye Li
Victoria University, Melbourne, Australia
Rui Zhou
Centre for Applied Informatics, Victoria University, Melbourne, Australia
Hua Wang
Centre for Applied Informatics, Victoria University, Melbourne, Australia
Yanchun Zhang

About this paper

Cite this paper

Xie, J., Li, Y., Zhou, Y., Wang, M. (2016). Differential Feature Recognition of Breast Cancer Patients Based on Minimum Spanning Tree Clustering and F-statistics. In: Yin, X., Geller, J., Li, Y., Zhou, R., Wang, H., Zhang, Y. (eds) Health Information Science. HIS 2016. Lecture Notes in Computer Science, vol 10038. Springer, Cham. https://doi.org/10.1007/978-3-319-48335-1_21

DOI: https://doi.org/10.1007/978-3-319-48335-1_21
Published: 15 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48334-4
Online ISBN: 978-3-319-48335-1
eBook Packages: Computer Science; Computer Science (R0)
