Class Aware Exemplar Discovery from Microarray Gene Expression Data

Sharma, Shivani; Agrawal, Abhinna; Patel, Dhaval

doi:10.1007/978-3-319-27057-9_17


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9498)

Included in the following conference series:

International Conference on Big Data Analytics


Abstract

Given a dataset, exemplars are a subset of data points that can represent the full set of data points without significant loss of information. Affinity propagation is an exemplar discovery technique that, unlike k-centres clustering, gives uniform preference to all data points. The data points iteratively exchange real-valued messages until clusters and their representative exemplars become apparent.
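The message exchange described above can be sketched as follows. This is a minimal NumPy illustration of plain affinity propagation as commonly formulated (responsibility and availability updates with damping), not the authors' implementation. The function name, the damping value, the fixed iteration count, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def affinity_propagation(S, preference, damping=0.8, max_iter=300):
    """Return exemplar indices from a similarity matrix S (illustrative sketch)."""
    S = np.array(S, dtype=float)
    n = S.shape[0]
    np.fill_diagonal(S, preference)      # uniform preference on the diagonal
    R = np.zeros((n, n))                 # responsibilities r(i, k)
    A = np.zeros((n, n))                 # availabilities a(i, k)
    rows = np.arange(n)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new
    # a point is an exemplar when its self-responsibility plus self-availability is positive
    return np.flatnonzero((A + R).diagonal() > 0)

# Toy usage (assumed data): two well-separated groups of 1-D points,
# similarity = negative squared distance, preference = median off-diagonal similarity.
x = np.array([0.0, 0.1, 0.3, 10.0, 10.1, 10.3])
S = -(x[:, None] - x[None, :]) ** 2
pref = np.median(S[~np.eye(len(x), dtype=bool)])
exemplars = affinity_propagation(S, pref)
```

Because every point receives the same preference here, no point is favored as an exemplar before the messages are exchanged.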

In this paper, we propose a Class Aware Exemplar Discovery (CAED) algorithm, which assigns a preference value to each data point based on its ability to differentiate samples of one class from those of other classes. To this end, CAED ranks data points class-wise and assigns each data point a preference value based on its class-wise rank. During message exchange, data points with better representative ability are favored as exemplars over other data points.
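One way to picture class-wise ranking feeding into preference values is sketched below. The abstract does not specify the ranking criterion or the mapping from rank to preference, so everything concrete here is an assumption: the one-vs-rest separability score, the use of each point's best rank across classes, the linear rank-to-preference mapping, the bounds `p_min` and `p_max`, and the function name itself are hypothetical.

```python
import numpy as np

def class_aware_preferences(X, y, p_min=-10.0, p_max=-1.0, eps=1e-12):
    """Map class-wise ranks to preference values (hypothetical scheme).

    X: array of shape (n_points, n_samples); each row is a data point (e.g. a gene).
    y: class label of each sample (column of X).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]
    best_rank = np.full(n, n - 1)
    for c in np.unique(y):
        inside, outside = X[:, y == c], X[:, y != c]
        # Assumed score: gap between class means relative to the within-group spread
        score = np.abs(inside.mean(axis=1) - outside.mean(axis=1)) / (
            inside.std(axis=1) + outside.std(axis=1) + eps
        )
        order = np.argsort(-score, kind="stable")   # rank 0 = most discriminative
        rank = np.empty(n, dtype=int)
        rank[order] = np.arange(n)
        best_rank = np.minimum(best_rank, rank)     # keep each point's best class-wise rank
    quality = 1.0 - best_rank / max(n - 1, 1)       # 1 for the top rank, 0 for the bottom
    return p_min + (p_max - p_min) * quality

# Toy usage (assumed data): 3 data points measured on 6 samples from 2 classes.
X = np.array([
    [0.0, 0.1, 0.0, 5.0, 5.1, 5.0],   # separates the classes strongly
    [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],   # identical class means
    [0.0, 1.0, 0.0, 1.0, 2.0, 1.0],   # separates the classes weakly
])
y = np.array([0, 0, 0, 1, 1, 1])
prefs = class_aware_preferences(X, y)
```

In an affinity propagation setup, such a vector would replace the single uniform value on the diagonal of the similarity matrix, so that higher-ranked points start with a larger (less negative) preference.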

The proposed method is evaluated on 18 gene expression datasets to check its efficacy in selecting relevant exemplars from large datasets. Experimental evaluation shows improved classification accuracy over affinity propagation and other state-of-the-art feature selection techniques. Class Aware Exemplar Discovery also converges in fewer iterations than affinity propagation, thereby reducing the execution time significantly.



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology-Roorkee, Roorkee, India
Shivani Sharma, Abhinna Agrawal & Dhaval Patel


Corresponding author

Correspondence to Dhaval Patel.

Editor information

Editors and Affiliations

University of Delhi, Delhi, India
Naveen Kumar
University of Delhi, Delhi, India
Vasudha Bhatnagar


Cite this paper

Sharma, S., Agrawal, A., Patel, D. (2015). Class Aware Exemplar Discovery from Microarray Gene Expression Data. In: Kumar, N., Bhatnagar, V. (eds) Big Data Analytics. BDA 2015. Lecture Notes in Computer Science, vol 9498. Springer, Cham. https://doi.org/10.1007/978-3-319-27057-9_17


DOI: https://doi.org/10.1007/978-3-319-27057-9_17
Published: 25 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27056-2
Online ISBN: 978-3-319-27057-9
eBook Packages: Computer Science, Computer Science (R0)
