Machine learning for event selection in high energy physics

https://doi.org/10.1016/j.engappai.2009.05.004

Abstract

The field of high energy physics aims to discover the underlying structure of matter by searching for and studying exotic particles, such as the top quark and Higgs boson, produced in collisions at modern accelerators. Since such accelerators are extraordinarily expensive, extracting maximal information from the resulting data is essential. However, most accelerator events do not produce particles of interest, so making effective measurements requires event selection, in which events producing particles of interest (signal) are separated from events producing other particles (background). This article studies the use of machine learning to aid event selection. First, we apply supervised learning methods, which have succeeded previously in similar tasks. However, they are suboptimal in this case because they assume that the selector with the highest classification accuracy will yield the best final analysis; this is not true in practice, as such analyses are more sensitive to some backgrounds than others. Second, we present a new approach that uses stochastic optimization techniques to directly search for selectors that maximize either the precision of top quark mass measurements or the sensitivity to the presence of the Higgs boson. Empirical results confirm that stochastically optimized selectors result in substantially better analyses. We also describe a case study in which the best selector is applied to real data from the Fermilab Tevatron accelerator, resulting in the most precise top quark mass measurement of this type to date. Hence, this new approach to event selection has already contributed to our knowledge of the top quark's mass and our understanding of the larger questions upon which it sheds light.

Introduction

The field of high energy physics is devoted to the study of the elementary constituents of matter. By investigating the structure of matter and the laws that govern its interactions, this field strives to discover the fundamental properties of the physical universe. In experimental high energy physics, the goal is to test predictions made by current theories such as the Standard Model (Weinberg, 1967, Glashow, 1961, Yao et al., 2006), which describes the behavior of three of the four fundamental forces.

The primary tools of experimental high energy physicists are modern accelerators, which collide protons and/or anti-protons to create exotic particles that occur only at extremely high energy densities. Such particles have not existed naturally since the first moments after the Big Bang, when the energy density of the universe was much higher. Observing these particles and measuring their properties may yield critical insights about the very nature of mass.

Two particles of particular interest are the top quark and the Higgs boson. The top quark, first observed in 1995 (Abe et al., 1995, Abott et al., 1995), is nearly as massive as a gold nucleus, making it by far the most massive subatomic particle ever observed. The top quark is important because precise measurements of its mass can stringently test theories about the origins of particle mass (Hashimoto et al., 2001, Heinemeyer, 2003, The LEP Collaboration, 2004, Miransky et al., 1989). Only the world's most powerful accelerator, the Fermilab Tevatron in Batavia, Illinois, has sufficient energy to produce top quarks. By contrast, the Higgs boson (Higgs, 1966) has never been observed. In fact, it is the only remaining particle predicted by the Standard Model whose existence has not been experimentally verified (Yao et al., 2006). Since the Higgs boson is theorized to give mass to other particles through its interactions (Kado and Tully, 2002), it is central to current theories about particle mass. Hence, observing the Higgs boson is a paramount goal in high energy physics.

Producing and observing such particles require extraordinary resources. The Tevatron accelerator and its particle detectors cost billions of dollars to construct and approximately a million dollars per day to operate. As a result, extracting maximal information from the resulting data is essential. In this article, we study the use of machine learning methods to aid this process. In particular, we investigate their efficacy for event selection for top quark mass measurement and Higgs boson search.

In an accelerator event, protons and/or anti-protons are accelerated and annihilated. The resulting energy causes new particles to form, which can be observed via detectors that surround the point of collision. However, the vast majority of events do not produce particles of interest, such as the top quark or Higgs boson. For example, though the Tevatron produces approximately 10^10 events per hour, only approximately one of them results in a top quark, on average. Therefore, good data analysis depends on effective event selection, in which events producing particles of interest (signal) are separated from those producing other particles (background). Event selection is difficult because several types of background can mimic the signal's characteristic signature. Hence, event selection in high energy physics is an exciting challenge for machine learning. In this article, we compare two different approaches to this problem.

The first approach is based on supervised learning methods, which are used to train classifiers that distinguish signal from background. Such methods have already proven successful in similar event selection problems by training neural networks (Abazov et al., 2001, Acosta et al., 2005) or support vector machines (Whiteson and Naumann, 2003) to classify events as signal or background. This supervised approach is most effective in the narrow class of problems in which the classification accuracy of the event selector is closely correlated with the quality of the resulting data analysis and systematic uncertainties are minimal. However, top quark mass measurement and Higgs boson search exemplify a broader class of problems where higher classification accuracy does not necessarily result in superior analysis performance. Instead, the top quark mass measurement is more sensitive to the presence of some background events than others, in ways that are difficult to predict a priori. The Higgs boson search is limited by systematic uncertainties on the background events. Therefore, selectors that maximize classification accuracy may perform worse than those that (1) increase the quantity of signal by tolerating harmless background, (2) reduce the quantity of signal to eliminate disruptive background, or (3) minimize the impact of systematic uncertainties.
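The gap between classification accuracy and analysis quality can be illustrated with a deliberately simple toy model. The sketch below is not taken from the article: the Gaussian score distributions, the event counts, the 20% background systematic, and the figure of merit s / sqrt(s + b + (f·b)^2) are all assumptions chosen only to show that the cut maximizing accuracy need not be the cut maximizing a systematics-aware sensitivity.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

N_SIG = 1000.0   # assumed number of signal events
N_BKG = 1000.0   # assumed number of background events
BKG_SYST = 0.2   # assumed fractional systematic uncertainty on background

def yields(cut):
    """Expected signal and background counts with score > cut.
    Toy model: signal scores ~ N(+1, 1), background scores ~ N(-1, 1)."""
    s = N_SIG * (1.0 - normal_cdf(cut - 1.0))
    b = N_BKG * (1.0 - normal_cdf(cut + 1.0))
    return s, b

def accuracy(cut):
    """Fraction of events classified correctly by the cut."""
    s, b = yields(cut)
    return (s + (N_BKG - b)) / (N_SIG + N_BKG)

def figure_of_merit(cut):
    """Illustrative sensitivity that penalizes background systematics."""
    s, b = yields(cut)
    return s / math.sqrt(s + b + (BKG_SYST * b) ** 2)

# Scan candidate cuts from -3.0 to +3.0 in steps of 0.1.
cuts = [i / 10.0 for i in range(-30, 31)]
best_accuracy_cut = max(cuts, key=accuracy)
best_fom_cut = max(cuts, key=figure_of_merit)

print(f"accuracy-optimal cut:    {best_accuracy_cut:+.1f}")
print(f"sensitivity-optimal cut: {best_fom_cut:+.1f}")
```

In this toy setting the accuracy-optimal cut sits at zero by symmetry, while the sensitivity-optimal cut is tighter: it gives up some signal to remove background whose systematic uncertainty would otherwise dominate, which corresponds to case (2) above.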

To find such selectors, we introduce a second, novel approach that uses stochastic optimization techniques. Rather than maximizing classification accuracy, this approach directly optimizes selectors for their true purpose: maximizing either the precision of top quark mass measurements or the sensitivity to the presence of the Higgs boson. Using NeuroEvolution of Augmenting Topologies (NEAT) (Stanley and Miikkulainen, 2002), an evolutionary method for training neural networks, we optimize event selectors that operate either in conjunction with supervised classifiers or in lieu of them.
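The idea of optimizing a selector directly against the analysis objective can be sketched with a much simpler stand-in for NEAT. The code below is not NEAT and not the authors' implementation: it uses a plain (1+λ) evolution strategy on the three weights of a linear selector, and the toy event populations, the `figure_of_merit` function, and its factor-of-ten penalty on "harmful" background are hypothetical choices made purely for illustration.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumed toy populations: mean position of each event kind in a 2-D feature space.
CENTERS = {"signal": (1.0, 1.0), "mild": (0.0, 1.0), "harmful": (1.0, -1.0)}

def make_events(n):
    """Draw n toy events as (x, y, kind) tuples."""
    kinds = list(CENTERS)
    events = []
    for _ in range(n):
        kind = rng.choice(kinds)
        cx, cy = CENTERS[kind]
        events.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0), kind))
    return events

def figure_of_merit(weights, events):
    """Hypothetical analysis quality: harmful background counts 10x."""
    w0, w1, bias = weights
    counts = dict.fromkeys(CENTERS, 0)
    for x, y, kind in events:
        if w0 * x + w1 * y + bias > 0.0:  # linear selector accepts the event
            counts[kind] += 1
    penalized = counts["signal"] + counts["mild"] + 10.0 * counts["harmful"]
    return counts["signal"] / math.sqrt(penalized) if penalized > 0 else 0.0

def evolve(events, generations=50, offspring=8, sigma=0.3):
    """(1+lambda) evolution strategy: replace the parent only by a better child."""
    parent = [0.0, 0.0, 0.0]  # accepts nothing, so it scores 0.0
    parent_score = figure_of_merit(parent, events)
    for _ in range(generations):
        children = [[w + rng.gauss(0.0, sigma) for w in parent]
                    for _ in range(offspring)]
        scored = [(figure_of_merit(c, events), c) for c in children]
        best_score, best_child = max(scored, key=lambda pair: pair[0])
        if best_score > parent_score:
            parent, parent_score = best_child, best_score
    return parent, parent_score

events = make_events(300)
start_score = figure_of_merit([0.0, 0.0, 0.0], events)
weights, final_score = evolve(events)
print(f"figure of merit: {start_score:.2f} -> {final_score:.2f}")
```

The point of the sketch is that the search never consults event labels through a classification loss. It only asks how good the downstream figure of merit is for each candidate selector, so the objective does not need to be differentiable.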

This article presents experiments that compare the performance of manually designed heuristic selectors to neural network selectors trained with backpropagation (Rumelhart et al., 1986) or NEAT. In both top quark mass measurement and Higgs boson search, the learning methods perform significantly better than the heuristic approach, confirming that machine learning can greatly benefit event selection in high energy physics. Furthermore, the NEAT selectors yield by far the best analyses, demonstrating the advantage of the stochastic optimization approach in an application area previously assumed the province of supervised methods.

Finally, this article describes a detailed case study in which the best performing selector is applied to real data gathered by the CDF II detector at the Tevatron. The result is a substantial reduction in uncertainty in the top quark mass measurement, yielding by far the most precise measurement of this type to date. Obtaining a similar reduction in uncertainty would otherwise require producing many more collisions at great expense. Hence, this new approach to event selection has already contributed substantially to our knowledge of the top quark's mass and our understanding of the larger questions upon which it sheds light.

The approaches we propose also offer potential benefits beyond the analysis of data gathered at the Tevatron. The future of high energy physics lies with the Large Hadron Collider (LHC), a new accelerator currently under construction. Once in operation, the LHC will produce collisions at much higher frequency and energy than the Tevatron. The resulting torrent of data will require highly effective event selection, which can potentially be aided by the methods presented in this article.

The remainder of this paper is organized as follows. Section 2 overviews the process of producing, detecting, selecting, and analyzing events in modern high energy accelerators. Sections 3 and 4 describe methods for performing event selection with the aid of supervised learning and stochastic optimization, respectively. Sections 5 and 6 compare the performance of these methods on the problems of top quark mass measurement and Higgs boson search, respectively. Section 7 describes the use of our optimized selector to produce a new top quark mass measurement with data from the Tevatron. Section 8 discusses the implications of these results, Section 9 outlines our plans for future work, and Section 10 concludes.

Section snippets

Events in high energy physics

This section provides an overview of the process of producing, detecting, selecting, and analyzing events at modern high energy accelerators, with particular focus on measurement of the top quark's mass and the search for the Higgs boson.

Supervised learning for event selection

This section describes how final event selection can be performed with the aid of supervised learning methods that maximize classification accuracy. In a narrow class of problems, these methods are equivalent to a likelihood ratio test, which the Neyman–Pearson Lemma (Neyman and Pearson, 1933) suggests is optimal. This approach is standard in the physics community and serves as a point of comparison for the optimization approach that will be described in Section 4.
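As a brief reminder of the statement being invoked, the likelihood ratio test can be written as follows. The symbols $p(x \mid S)$ and $p(x \mid B)$ (signal and background densities of the event features $x$), $\Lambda$, $k$, and $\alpha$ are notation introduced here for illustration rather than taken from the article.

```latex
\[
\Lambda(x) = \frac{p(x \mid S)}{p(x \mid B)}, \qquad
\text{select event } x \text{ as signal if } \Lambda(x) > k,
\]
\[
\text{where } k \text{ is chosen so that } P\bigl(\Lambda(x) > k \mid B\bigr) = \alpha .
\]
```

Among all selectors that accept a fraction $\alpha$ of background events, this one accepts the largest fraction of signal events. A classifier whose output is a monotonic function of $\Lambda(x)$ yields the same selection for a suitably chosen threshold.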

Supervised methods can be used

Optimization methods for event selection

The supervised approach described above is most effective in the narrow class of problems in which the classification accuracy of the event selector is closely correlated with the quality of the resulting data analysis. In this case, the Neyman–Pearson Lemma argues that such techniques are optimal.

However, top quark mass measurement and Higgs boson search exemplify a broader class of problems where higher classification accuracy does not necessarily result in better analysis. Instead, the

Comparative results for top quark mass measurement

To assess the efficacy of the methods presented in Sections 3 and 4, we evaluated each one on the top quark mass measurement problem, averaging performance over 10 independent runs, using 10,000 simulated events. These runs were conducted using 10-fold cross validation: in each run, 75% of the events were selected at random for training and the remaining 25% were reserved for testing.
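The evaluation protocol can be sketched as a small harness. This is a minimal sketch under stated assumptions, not the authors' code: it follows the split as described in the text (a fresh random 75%/25% partition in each of 10 runs) rather than a partition into ten disjoint folds, and the names `repeated_holdout`, `evaluate`, and `held_out_fraction` are hypothetical.

```python
import random
import statistics

def repeated_holdout(events, evaluate, runs=10, train_fraction=0.75, seed=0):
    """Average evaluate(train, test) over independent random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        shuffled = list(events)
        rng.shuffle(shuffled)
        n_train = int(len(shuffled) * train_fraction)
        scores.append(evaluate(shuffled[:n_train], shuffled[n_train:]))
    return statistics.mean(scores), scores

# Placeholder data and metric, used only to show the call shape.
events = list(range(10_000))

def held_out_fraction(train, test):
    return len(test) / (len(train) + len(test))

mean_score, run_scores = repeated_holdout(events, held_out_fraction)
print(mean_score, len(run_scores))
```

In practice `evaluate` would train a selector on the first argument and return the analysis metric (for example, the mass measurement precision) obtained on the second.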

Comparative results for Higgs boson search

We also evaluated the relative performance of the supervised learning and optimization strategies on the problem of Higgs boson search. As before, each run was conducted using 10-fold cross validation with a 75%/25% split between training and testing.

Application to Tevatron events

The top quark was first observed in 1995 at the Fermilab Tevatron, which remains the sole accelerator with enough energy to produce it for direct study. The top quark's mass is extraordinary on the scale of fundamental particles, nearly 40 times larger than the second most massive quark and comparable to the mass of a gold atom. Its enormous mass remains a puzzle from a theoretical standpoint, so physicists have focused great effort on probing the precious few top quarks collected in the five

Discussion

The results presented in Sections 5 and 6 confirm the conclusion of earlier work (Abazov et al., 2001, Acosta et al., 2005, Whiteson and Naumann, 2003) that machine learning methods can substantially outperform heuristic event selectors. However, previous results demonstrated only that learned selectors had higher classification accuracy, while these results directly verify that they can improve the quality of the resulting analysis. More importantly, these results confirm the advantage of

Future work

We believe that the intersection of machine learning and high energy physics is a highly promising but under-explored research area. Thus, we hope that the results presented in this article will mark only the beginning of a long and fruitful effort to bridge the gap between these two disciplines. Such research can be a boon to both fields, by providing machine learning researchers with a challenging, realistic proving ground for their methods and arming high energy physicists with the tools to

Conclusion

This article describes the use of machine learning methods to aid the process of selecting events at high energy accelerators for the purpose of studying the fundamental nature of matter and its interactions. First, we apply supervised learning methods, which have succeeded previously in similar tasks. Second, we present a new approach that uses stochastic optimization techniques to directly search for selectors that maximize either the precision of top quark mass measurements or the

Acknowledgments

The authors thank Peter Stone, Risto Miikkulainen, Ken Stanley, Razvan Bunescu, Misha Bilenko, and Gwenn Englebienne for their comments and suggestions on this research.

References (52)

  • Abott, B., 1995. Observation of the top quark. Physical Review Letters.
  • Abulencia, A., 2004. Measurement of the tt̄ production cross section in pp̄ collisions at √s = 1.96 TeV using dilepton events. Physical Review Letters.
  • Abulencia, A., 2005. Measurement of the J/ψ meson and b hadron production cross sections in pp̄ collisions at √s = 1960 GeV. Physical Review D.
  • Abulencia, A., 2005. Top quark mass measurement from dilepton events at CDF II. Physical Review Letters.
  • Acosta, D., 2005. Measurement of the cross section for tt̄ production in pp̄ collisions using the kinematics of lepton+jets events. Physical Review D.
  • Brubaker, E., 2004. Ph.D. Thesis, University of California,...
  • Cho, A., 2006. Aging atom smasher runs all out in race for most coveted particle. Science.
  • Cortes, C., et al., 1995. Support-vector networks. Machine Learning.
  • Domingos, P., et al., 1997. On the optimality of the simple Bayesian classifier under zero–one loss. Machine Learning.
  • Domingos, P., 1999. Metacost: a general method for making classifiers cost-sensitive. In: Proceedings of the Fifth ACM...
  • Elkan, C., 2001. The foundations of cost-sensitive learning. In: Proceedings of the Seventeenth International Joint...
  • Estrada, J., 2001. Ph.D. Thesis, University of...
  • Fawcett, T., 1993. Feature discovery for problem solving systems. Ph.D. Thesis, University of Massachusetts,...
  • Getoor, L., Taskar, B. (Eds.), 2007. Introduction to Relational Statistical Learning. MIT Press, Cambridge,...
  • Gomez, F., Schmidhuber, J., Miikkulainen, R., 2006. Efficient non-linear control through neuroevolution. In:...
  • Hashimoto, M., et al., 2001. Top mode standard model with extra dimensions. Physical Review D.