Skip to main content

Class Aware Exemplar Discovery from Microarray Gene Expression Data

  • Conference paper
  • First Online:
Big Data Analytics (BDA 2015)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9498))

Included in the following conference series:

  • 1746 Accesses

Abstract

Given a dataset, exemplars are subset of data points that can represent a set of data points without significance loss of information. Affinity propagation is an exemplar discovery technique that, unlike k–centres clustering, gives uniform preference to all data points. The data points iteratively exchange real–valued messages, until clusters with their representative exemplar become apparent.

In this paper, we propose a Class Aware Exemplar Discovery (CAED) algorithm, which assigns preference value to data points based on their ability to differentiate samples of one class from others. To aid this, CAED performs class wise ranking of data points, assigning preference value to each data point based on its class wise rank. While exchanging messages, data points with better representative ability are more favored for being chosen as exemplar over other data points.

The proposed method is evaluated over 18 gene expression datasets to check its efficacy for selection of relevant exemplars from large datasets. Experimental evaluation exhibits improvement in classification accuracy over affinity propagation and other state-of-art feature selection techniques. Class Aware Exemplar Discovery converges in lesser iterations as compared to affinity propagation thereby dropping the execution time significantly.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Inza, I., Larrañaga, P., Blanco, R., Cerrolaza, A.J.: Filter versus wrapper gene selection approaches in DNA microarray domains. Artif. Intell. Med. 31(2), 91–103 (2004)

    Article  Google Scholar 

  2. De Abreu, F.B., Wells, W.A., Tsongalis, G.J.: The emerging role of the molecular diagnostics laboratory in breast cancer personalized medicine. Am. J. Pathol. 183(4), 1075–1083 (2013)

    Article  Google Scholar 

  3. Kononenko, I., Šimec, E., Robnik-Šikonja, M.: Overcoming the myopia of inductive learning algorithms with RELIEFF. Appl. Intell. 7(1), 39–55 (1997)

    Article  Google Scholar 

  4. Hall, M.A.: Correlation-based feature selection for machine learning. Doctoral dissertation, The University of Waikato (1999)

    Google Scholar 

  5. Kashef, R., Kamel, M.S.: Efficient bisecting k-medoids and its application in gene expression analysis. In: Campilho, A., Kamel, M. (eds.) ICIAR 2008. LNCS, vol. 5112, pp. 423–434. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  6. Frey, B.J., Dueck, D.: Clustering by passing messages between data points. Science 315(5814), 972–976 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  7. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl. 11(1), 10–18 (2009)

    Article  Google Scholar 

  8. De Souto, M.C., Costa, I.G., de Araujo, D.S., Ludermir, T.B., Schliep, A.: Clustering cancer gene expression data: a comparative study. BMC Bioinf. 9(1), 497 (2008)

    Article  Google Scholar 

  9. Foithong, S., Pinngern, O., Attachoo, B.: Feature subset selection wrapper based on mutual information and rough sets. Expert Syst. Appl. 39(1), 574–584 (2012)

    Article  Google Scholar 

  10. Mramor, M., Leban, G., Demšar, J., Zupan, B.: Visualization-based cancer microarray data classification analysis. Bioinformatics 23(16), 2147–2154 (2007)

    Article  Google Scholar 

  11. Blum, A., Langley, P.: Selection of relevant features and examples in machine learning. Artif. Intell. 97(1/2), 245–271 (1997)

    Article  MathSciNet  MATH  Google Scholar 

  12. Chandrashekar, G., Sahin, F.: A survey on feature selection methods. Comput. Electr. Eng. 40, 16–28 (2014)

    Article  Google Scholar 

  13. Soufan, O., Kleftogiannis, D., Kalnis, P., Kalnis, B.: Bajic DWFS: a wrapper feature selection tool based on a parallel genetic algorithm. PLoS ONE 10, e0117988 (2015). doi:10.1371/journal.pone.0117988

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Dhaval Patel .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sharma, S., Agrawal, A., Patel, D. (2015). Class Aware Exemplar Discovery from Microarray Gene Expression Data. In: Kumar, N., Bhatnagar, V. (eds) Big Data Analytics. BDA 2015. Lecture Notes in Computer Science(), vol 9498. Springer, Cham. https://doi.org/10.1007/978-3-319-27057-9_17

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-27057-9_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27056-2

  • Online ISBN: 978-3-319-27057-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics