Local Pattern Discovery in Array-CGH Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3539)

Abstract

In this paper we report on our experience applying frequent pattern discovery algorithms to biological data on genomic alterations in cancer. A number of frequent item set methods have already been applied successfully to biological data obtained from large-scale analyses (see for instance [4] for SAGE data and [20,22,26] for gene expression data). All of them must cope with a peculiarity that distinguishes such data from standard basket analysis data: the number of observations is low with respect to the number of attributes.

References

  1. Albertson, D.G., Collins, C., McCormick, F., Gray, J.W.: Chromosome aberrations in solid tumors. Nat. Genet. 34, 369–376 (2003)

  2. Aliferis, C.F., Hardin, D., Massion, P.P.: Machine learning models for lung cancer classification using array comparative genomic hybridization. In: Proc. AMIA Symp. 2002, pp. 7–11 (2002)

  3. Bayardo, R.J., Agrawal, R., Gunopulos, D.: Constraint-based rule mining in large, dense databases. Data Mining and Knowledge Discovery 4, 217–240 (2000)

  4. Becquet, C., Blachon, S., Jeudy, B., Boulicaut, J.F., Gandrillon, O.: Strong association rules mining for large gene expression data analysis: a case study on human SAGE data. Genome Biology 12 (2002)

  5. Besson, J., Robardet, C., Boulicaut, J.F., Rome, S.: Constraint-based concept mining and its application to microarray data analysis. Intelligent Data Analysis 9 (to appear)

  6. Billerey, C., Chopin, D., Aubriot-Lorton, M.H., Ricol, D., Gil Diez de Medina, S., Van Rhijn, B., Bralet, M.P., Lefrère-Belda, M.A., Lahaye, J.B., Abbou, C.C., Bonaventure, J., Zafrani, E.S., Van der Kwast, T., Thiery, J.P., Radvanyi, F.: Frequent FGFR3 mutations in papillary non-invasive bladder (pTa) tumors. Am. J. Pathol. 158, 1955–1959 (2001)

  7. Bonchi, F., Giannotti, F., Mazzanti, A., Pedreschi, D.: Examiner: Optimized level-wise frequent pattern mining with monotone constraints. In: Proc. of the Third IEEE Int. Conf. on Data Mining, ICDM 2003 (2003)

  8. Bonchi, F., Lucchese, C.: On closed constrained frequent pattern mining. In: Proc. of the Fourth International Conference on Data Mining (ICDM 2004). Morgan Kaufmann, San Francisco (2004)

  9. Boros, E., Gurvich, V., Khachiyan, L., Makino, K.: On the complexity of generating maximal frequent and minimal infrequent sets. In: Alt, H., Ferreira, A. (eds.) STACS 2002. LNCS, vol. 2285, p. 133. Springer, Heidelberg (2002)

  10. Bucila, C., Gehrke, J., Kifer, D., White, W.: Dualminer: A dual-pruning algorithm for itemsets with constraints. Data Mining and Knowledge Discovery 7, 241–272 (2003)

  11. Cappellen, D., Gil Diez de Medina, S., Chopin, D., Thiery, J.P., Radvanyi, F.: Frequent loss of heterozygosity on chromosome 10q in muscle-invasive transitional cell carcinomas of the bladder. Oncogene 14, 3059–3066 (1997)

  12. Chin, K., Ortiz de Solorzano, C., Knowles, D., Jones, A., Chou, W., Garcia Rodriguez, E., Kuo, W.L., Ljung, B.M., Chew, K., Myambo, K., Miranda, M., Krig, S., Garbe, J., Stampfer, M., Yaswen, P., Gray, J.W., Lockett, S.J.: In situ analyses of genome instability in breast cancer. Nat. Genet. 36, 984–988 (2004)

  13. Clare, A., King, R.D.: Predicting gene function in Saccharomyces cerevisiae. Bioinformatics 19(suppl. 2), ii42–ii49 (2003)

  14. Dong, G., Li, J.: Efficient mining of emerging patterns: Discovering trends and differences. In: Knowledge Discovery and Data Mining, pp. 43–52 (1999)

  15. Hahn, W.C., Weinberg, R.A.: Modelling the molecular circuitry of cancer. Nat. Rev. Cancer 2, 331–341 (2002)

  16. Hupé, P., Stransky, N., Thiery, J.P., Radvanyi, F., Barillot, E.: Analysis of array CGH data: from signal ratio to gain and loss of DNA regions. Bioinformatics 12, 3413–3422 (2004)

  17. Ishkanian, A.S., Malloff, C.A., Watson, S.K., DeLeeuw, R.J., Chi, B., Coe, B.P., Snijders, A., Albertson, D.G., Pinkel, D., Marra, M.A., Ling, V., MacAulay, C., Lam, W.L.: A tiling resolution DNA microarray with complete coverage of the human genome. Nat. Genet. 36, 299–303 (2004)

  18. Kramer, S., De Raedt, L.: Feature construction with version spaces for biochemical application. In: Proc. of the 18th International Conference on Machine Learning (ICML 2001). Morgan Kaufmann, San Francisco (2001)

  19. Li, J., Dong, G., Ramamohanarao, K., Wong, L.: DeEPs: a new instance-based discovery and classification system. Machine Learning 54, 99–124 (2004)

  20. Li, J., Wong, L.: Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns. Bioinformatics 18, 725–734 (2003)

  21. Mercier, G., Berthault, N., Mary, J., Peyre, J., Antoniadis, A., Comet, J.P., Cornuejols, A., Froidevaux, C., Dutreix, M.: Biological detection of low radiation doses by combining results of two microarray analysis methods. Nucleic Acids Res. 32 (2004)

  22. Pan, F., Cong, G., Tung, A.K.H., Yang, J., Zaki, M.: Carpenter: Finding closed patterns in long biological datasets. In: Proc. of SIGKDD 2003 (2003)

  23. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed itemsets for association rules. In: Beeri, C., Buneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 398–416. Springer, Heidelberg (1998)

  24. Pei, J., Han, J., Mao, R.: CLOSET: An efficient algorithm for mining frequent closed itemsets. In: Proc. of the ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, DMKD (2000)

  25. Quinlan, J.R.: C4.5: Programs for machine learning. Morgan Kaufmann, San Francisco (1993)

  26. Rioult, F., Boulicaut, J.F., Crémilleux, B., Besson, J.: Using transposition for pattern discovery from microarray data. In: Proc. of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2003), pp. 73–79 (2003)

  27. Rouveirol, C., et al.: Computation of minimal recurrent gain and loss regions from array-CGH data. Extended version of JOBIM 2004 (2004) (in preparation)

  28. Saitta, L., Zucker, J.D.: Semantic abstraction for concept representation and learning. In: Symposium on Abstraction, Reformulation and Approximation (SARA 1998), pp. 103–120 (1998)

  29. Scheffer, T., Wrobel, S.: Finding the most interesting patterns in a database quickly by using sequential sampling. Journal of Machine Learning Research 3, 833–862 (2002)

  30. Snijders, A.M., Nowak, N., Segraves, R., Blackwood, S., Brown, N., Conroy, J., Hamilton, G., Hindle, A.K., Huey, B., Kimura, K., Law, S., Myambo, K., Palmer, J., Ylstra, B., Yue, Y.P., Gray, J.W., Jain, A.N., Pinkel, D., Albertson, D.G.: Assembly of microarrays for genome-wide measurement of DNA copy number. Nat. Genet. 29, 263–264 (2001)

  31. Soulet, A., Crémilleux, B., Rioult, F.: Condensed representation of emerging patterns. In: Dai, H., Srikant, R., Zhang, C. (eds.) PAKDD 2004. LNCS (LNAI), vol. 3056, pp. 127–132. Springer, Heidelberg (2004)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rouveirol, C., Radvanyi, F. (2005). Local Pattern Discovery in Array-CGH Data. In: Morik, K., Boulicaut, JF., Siebes, A. (eds) Local Pattern Detection. Lecture Notes in Computer Science, vol 3539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11504245_9

  • DOI: https://doi.org/10.1007/11504245_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26543-6

  • Online ISBN: 978-3-540-31894-1

  • eBook Packages: Computer Science, Computer Science (R0)
