DOI: 10.1145/1066677.1066710
Article

Incremental interactive mining of constrained association rules from biological annotation data with nominal features

Published: 13 March 2005

ABSTRACT

Data from genomic and proteomic experiments is accumulating rapidly, producing huge amounts of raw data. Consequently, the need to analyze such biological data, the understanding of which still lags far behind, has become prominent in the current post-genomic era. In this paper we analyze annotated genome data by applying a central data-mining technique, association rule mining, with the aim of discovering rules that yield deeper insight into this type of data. We propose a new technique that uses domain knowledge in the form of queries to efficiently mine only the subset of associations that are of interest to the researcher, in an incremental and interactive mode.
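As a rough illustration of what mining constrained association rules over nominal annotation data can look like, the following minimal sketch encodes each nominal feature as an `attribute=value` item and keeps only rules that involve an item the researcher asks about. It is not the technique proposed in the paper: the function name `mine_constrained_rules`, the toy gene annotations, and the thresholds are all assumptions made for illustration, and the constraint is applied as a filter after a plain level-wise frequent-itemset search rather than being used to prune the search itself.

```python
from itertools import combinations


def mine_constrained_rules(transactions, required_item, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples for rules
    whose combined itemset contains `required_item` (hypothetical helper)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of `itemset`.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level-wise (Apriori-style) search for frequent itemsets.
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        # Join frequent k-itemsets that differ in one item into (k+1)-candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c) >= min_support]

    # Generate rules only from itemsets that satisfy the item constraint.
    rules = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2 or required_item not in itemset:
            continue
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                antecedent = frozenset(antecedent)
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, supp, confidence))
    return rules


# Hypothetical annotation records: one set of attribute=value items per gene.
genes = [
    {"class=kinase", "location=nucleus", "phenotype=lethal"},
    {"class=kinase", "location=nucleus", "phenotype=lethal"},
    {"class=kinase", "location=cytoplasm", "phenotype=viable"},
    {"class=transporter", "location=membrane", "phenotype=viable"},
]

rules = mine_constrained_rules(genes, "phenotype=lethal", 0.5, 0.9)
for antecedent, consequent, supp, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), supp, conf)
```

Filtering after the search keeps the sketch short, but it does all the work of unconstrained mining first. The abstract's goal of efficiently mining only the associations of interest would instead require using the query to restrict the search.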


    • Published in

      SAC '05: Proceedings of the 2005 ACM symposium on Applied computing
      March 2005
      1814 pages
      ISBN:1581139640
      DOI:10.1145/1066677

      Copyright © 2005 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
