Pattern Detection and Discovery

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2447)

Abstract

Data mining comprises two subdisciplines. One of these is based on statistical modelling, though the large data sets associated with data mining lead to new problems for traditional modelling methodology. The other, which we term pattern detection, is a new science. Pattern detection is concerned with defining and detecting local anomalies within large data sets, and tools and methods have been developed in parallel by several applications communities, typically with no awareness of developments elsewhere. Most of the work to date has focussed on the development of practical methodology, with little attention being paid to the development of an underlying theoretical base to parallel the theoretical base developed over the last century to underpin modelling approaches. We suggest that the time is now right for the development of a theoretical base, so that important common aspects of the work can be identified, so that key directions for future research can be characterised, and so that the various different application domains can benefit from the work in other areas. We attempt to describe a unified approach to the subject, and also attempt to provide a theoretical base on which future developments can stand.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hand, D.J. (2002). Pattern Detection and Discovery. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds) Pattern Detection and Discovery. Lecture Notes in Computer Science, vol 2447. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45728-3_1

  • DOI: https://doi.org/10.1007/3-540-45728-3_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44148-9

  • Online ISBN: 978-3-540-45728-2

  • eBook Packages: Springer Book Archive
