Abstract
This paper proposes a Fast Outlier Sample Detection (FOSD) algorithm for recognizing mislabeled or abnormal samples in microarray datasets. The algorithm uses the CL-stability algorithm as its basic operator and a machine learning method as its classifier. Outlier samples are detected according to the global stability of the samples. Experimental results show that FOSD not only outperforms other existing algorithms but is also robust in detecting outlier samples in microarray datasets.
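The abstract does not spell out how FOSD computes stability, so the following is only a minimal sketch of the general idea behind perturbation-based stability checks: flip the label of one other sample at a time, classify the held-out sample, and count how often the prediction agrees with its given label. Everything here is an assumption made for illustration, including the nearest-centroid classifier, the binary 0/1 labels, the function names `stability_scores` and `flag_outliers`, and the `threshold` value. It is not the authors' implementation.

```python
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train_X, train_y, x):
    """Nearest-centroid prediction for binary labels 0/1 (illustrative classifier)."""
    cents = {}
    for label in (0, 1):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        if rows:  # skip a class that has no training samples
            cents[label] = centroid(rows)
    return min(cents, key=lambda l: sq_dist(cents[l], x))


def stability_scores(X, y):
    """For each sample i, the fraction of single-label perturbations under which
    a classifier trained without i still predicts i's given label."""
    n = len(X)
    scores = []
    for i in range(n):
        hits = 0
        for j in range(n):
            if j == i:
                continue
            labels = list(y)
            labels[j] = 1 - labels[j]  # perturb: flip the label of one other sample
            train_X = [X[k] for k in range(n) if k != i]
            train_y = [labels[k] for k in range(n) if k != i]
            if predict(train_X, train_y, X[i]) == y[i]:
                hits += 1
        scores.append(hits / (n - 1))
    return scores


def flag_outliers(X, y, threshold=0.5):
    """Indices whose stability score falls below a hypothetical threshold."""
    return [i for i, s in enumerate(stability_scores(X, y)) if s < threshold]


# Toy data: two well-separated groups, with the last sample carrying the wrong label.
X = [(0, 0), (0, 1), (1, 0), (1, 1),
     (10, 10), (10, 11), (11, 10), (11, 11),
     (0.5, 0.5)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
print(flag_outliers(X, y))  # prints [8]
```

On this toy data the deliberately mislabeled sample is never classified as its given label under any perturbation, so its score is 0, while every other sample scores 1. Real microarray data has far more features than samples, so the classifier and the threshold would need to be chosen with that in mind.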
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhou, Y., Xing, C., Shen, W., Sun, Y., Wu, J., Zhou, X. (2011). A Fast Algorithm for Outlier Detection in Microarray. In: Lin, S., Huang, X. (eds) Advances in Computer Science, Environment, Ecoinformatics, and Education. CSEE 2011. Communications in Computer and Information Science, vol 215. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23324-1_83
DOI: https://doi.org/10.1007/978-3-642-23324-1_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23323-4
Online ISBN: 978-3-642-23324-1
eBook Packages: Computer Science, Computer Science (R0)