Towards Characterization of the Data Generation Process

Part of the book series: Studies in Computational Intelligence (SCI, volume 169)

Abstract

Data mining has found interesting applications in commercial and scientific domains. The last two decades have seen rapid strides in the development of elegant algorithms that induce useful predictive and descriptive models from widely available large data repositories.

In the last decade, serious effort has also been made towards mining evolving data sets, and several one-pass algorithms with restricted memory footprints are now available for use in data stream environments. The study of the temporal evolution of patterns has been recognized as an important next-generation data mining problem by both the research and user communities. Comparative analyses of the changes detected in the discovered trends over the temporal dimension are likely to provide insight into the dynamics of the data generation process (DGP). Different levels of abstraction, from the end user's viewpoint, form the second dimension of such analyses.


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bhatnagar, V., Kochhar, S. (2009). Towards Characterization of the Data Generation Process. In: Nedjah, N., de Macedo Mourelle, L., Kacprzyk, J. (eds) Innovative Applications in Data Mining. Studies in Computational Intelligence, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88045-5_5

  • DOI: https://doi.org/10.1007/978-3-540-88045-5_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88044-8

  • Online ISBN: 978-3-540-88045-5

  • eBook Packages: Engineering, Engineering (R0)
