
Mining Contrast Inequalities in Numeric Dataset

  • Conference paper
Web-Age Information Management (WAIM 2010)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6184))


Abstract

Finding relational expressions that hold frequently in one class of data but rarely in another class is an interesting mining task. In this paper, a relational expression of this kind is defined as a contrast inequality. Gene Expression Programming (GEP) is a powerful tool for discovering relations in data and expressing them in mathematical form, so it is well suited to this task. The main contributions of this paper are: (1) introducing the concept of contrast inequality mining, (2) designing a two-genome chromosome structure that guarantees each individual in GEP is a valid inequality, (3) proposing a new genetic mutation that improves the efficiency of evolving contrast inequalities, (4) presenting a GEP-based method for discovering contrast inequalities, and (5) reporting an extensive performance study on real-world datasets. The experimental results show that the proposed methods are effective: contrast inequalities with high discriminative power are discovered from the real-world datasets. Possible directions for future work on contrast inequality mining are also discussed.
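As a rough illustration of the idea only, the sketch below treats an inequality as a pair of expressions (a left-hand side and a right-hand side) and measures how often it holds in each of two classes. The function names, the strict `<` comparison, the difference-of-fractions score, and the toy data are all assumptions made for this example; the preview does not give the paper's formal definition, its chromosome encoding, or its fitness function.

```python
from typing import Callable, Sequence

Row = Sequence[float]
Expr = Callable[[Row], float]


def holds_fraction(lhs: Expr, rhs: Expr, rows: Sequence[Row]) -> float:
    """Fraction of rows for which lhs(row) < rhs(row) holds."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if lhs(row) < rhs(row)) / len(rows)


def contrast_score(lhs: Expr, rhs: Expr,
                   target: Sequence[Row], other: Sequence[Row]) -> float:
    """Hypothetical score: frequency in the target class minus frequency in the other."""
    return holds_fraction(lhs, rhs, target) - holds_fraction(lhs, rhs, other)


# Toy two-attribute rows (x0, x1); invented for illustration, not from the paper.
class_a = [(1.0, 5.0), (2.0, 6.0), (3.0, 7.0), (4.0, 3.0)]
class_b = [(5.0, 1.0), (6.0, 2.0), (2.0, 3.0), (8.0, 4.0)]


def lhs(row: Row) -> float:
    # Left-hand side of the candidate inequality: x0 + 1
    return row[0] + 1.0


def rhs(row: Row) -> float:
    # Right-hand side of the candidate inequality: x1
    return row[1]


frac_a = holds_fraction(lhs, rhs, class_a)
frac_b = holds_fraction(lhs, rhs, class_b)
score = contrast_score(lhs, rhs, class_a, class_b)
print(frac_a, frac_b, score)
```

In this toy setting the candidate inequality `x0 + 1 < x1` holds for most rows of the first class and for none of the second, which is the kind of contrast the abstract describes. An evolutionary search such as GEP would generate and refine the two expressions rather than have them written by hand.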

This work was supported by the National Natural Science Foundation of China under grant No. 60773169, the 11th Five Years Key Programs for Sci. & Tech. Development of China under grant No. 2006BAI05A01, and the Young Faculty Foundation of Sichuan University under grant No. 2009SCU11030.




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Duan, L., Zuo, J., Zhang, T., Peng, J., Gong, J. (2010). Mining Contrast Inequalities in Numeric Dataset. In: Chen, L., Tang, C., Yang, J., Gao, Y. (eds) Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14246-8_21


  • DOI: https://doi.org/10.1007/978-3-642-14246-8_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14245-1

  • Online ISBN: 978-3-642-14246-8

  • eBook Packages: Computer Science, Computer Science (R0)
