Frequent Variable Sets Based Clustering for Artificial Neural Networks Particle Classification

  • Conference paper
Advances in Data and Web Management (APWeb 2007, WAIM 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4505)

Abstract

Particle classification is one of the major analyses in high-energy particle physics experiments. We design a framework that combines classification and clustering for particle physics experiment data. The system classifies with a set of Artificial Neural Networks (ANNs), each trained on a distinct subset of samples selected from the general set. We use frequent variable sets based clustering to partition the training samples into several natural subsets, and then train a standard back-propagation ANN on each subset. The final decision for each test case is a two-step process: first, the nearest cluster is found for the case; then the decision is made by the ANN classifier trained on that cluster. Comparisons with other classification and clustering methods show that our method is promising.
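
The two-step decision described in the abstract (route a test case to its nearest cluster, then ask the classifier trained on that cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the frequent variable sets based clustering, so plain k-means with farthest-point seeding stands in for it, and a single sigmoid unit trained by gradient descent stands in for a back-propagation ANN. All names here (`kmeans`, `train_unit`, `ClusterRoutedClassifier`) are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain k-means with farthest-point seeding.

    ASSUMPTION: stand-in for the paper's frequent variable sets based clustering.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return centroids


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))


def train_unit(X, y, epochs=200, lr=0.5):
    """Fit one sigmoid unit by stochastic gradient descent.

    ASSUMPTION: stand-in for a standard back-propagation ANN.
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


class ClusterRoutedClassifier:
    """One local binary classifier per cluster; prediction routes by nearest cluster."""

    def fit(self, X, y, k):
        cents = kmeans(X, k)
        labels = [nearest(x, cents) for x in X]
        used = sorted(set(labels))  # drop clusters that received no samples
        self.centroids = [cents[i] for i in used]
        self.models = []
        for i in used:
            # Each local model sees coordinates relative to its cluster centre.
            rows = [([xi - ci for xi, ci in zip(x, cents[i])], t)
                    for x, t, lab in zip(X, y, labels) if lab == i]
            self.models.append(train_unit([r[0] for r in rows], [r[1] for r in rows]))
        return self

    def predict(self, x):
        i = nearest(x, self.centroids)           # step 1: nearest cluster
        w, b = self.models[i]                    # step 2: that cluster's classifier
        rel = [xi - ci for xi, ci in zip(x, self.centroids[i])]
        return 1 if sigmoid(sum(wi * ri for wi, ri in zip(w, rel)) + b) >= 0.5 else 0
```

Calling `ClusterRoutedClassifier().fit(X, y, k)` on a list of feature tuples `X` with 0/1 labels `y` trains one local model per non-empty cluster, and `predict(x)` then applies the two steps in order. The design point the sketch tries to convey is that each local model only has to fit the decision rule inside its own cluster, so clusters with conflicting rules do not have to share one global boundary.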

Editor information

Guozhu Dong Xuemin Lin Wei Wang Yun Yang Jeffrey Xu Yu

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Jin, X., Bie, R. (2007). Frequent Variable Sets Based Clustering for Artificial Neural Networks Particle Classification. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds) Advances in Data and Web Management. APWeb 2007, WAIM 2007. Lecture Notes in Computer Science, vol 4505. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72524-4_88

  • DOI: https://doi.org/10.1007/978-3-540-72524-4_88

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72483-4

  • Online ISBN: 978-3-540-72524-4

  • eBook Packages: Computer Science, Computer Science (R0)
