Engineering Applications of Artificial Intelligence
Machine learning for event selection in high energy physics
Introduction
The field of high energy physics is devoted to the study of the elementary constituents of matter. By investigating the structure of matter and the laws that govern its interactions, this field strives to discover the fundamental properties of the physical universe. In experimental high energy physics, the goal is to test predictions made by current theories such as the Standard Model (Weinberg, 1967, Glashow, 1961, Yao et al., 2006), which describes the behavior of three of the four fundamental forces.
The primary tools of experimental high energy physicists are modern accelerators, which collide protons and/or anti-protons to create exotic particles that occur only at extremely high energy densities. Such particles have not existed naturally since the first moments after the Big Bang, when the energy density of the universe was much higher. Observing these particles and measuring their properties may yield critical insights about the very nature of mass.
Two particles of particular interest are the top quark and the Higgs boson. The top quark, first observed in 1995 (Abe et al., 1995, Abott et al., 1995), is nearly as massive as a gold nucleus, making it by far the most massive subatomic particle ever observed. The top quark is important because precise measurements of its mass can stringently test theories about the origins of particle mass (Hashimoto et al., 2001, Heinemeyer, 2003, The LEP Collaboration, 2004, Miransky et al., 1989). Only the world's most powerful accelerator, the Fermilab Tevatron in Batavia, Illinois, has sufficient energy to produce top quarks. By contrast, the Higgs boson (Higgs, 1966) has never been observed. In fact, it is the only remaining particle predicted by the Standard Model whose existence has not been experimentally verified (Yao et al., 2006). Since the Higgs boson is theorized to give mass to other particles through its interactions (Kado and Tully, 2002), it is central to current theories about particle mass. Hence, observing the Higgs boson is a paramount goal in high energy physics.
Producing and observing such particles require extraordinary resources. The Tevatron accelerator and its particle detectors cost billions of dollars to construct and approximately a million dollars per day to operate. As a result, extracting maximal information from the resulting data is essential. In this article, we study the use of machine learning methods to aid this process. In particular, we investigate their efficacy for event selection for top quark mass measurement and Higgs boson search.
In an accelerator event, protons and/or anti-protons are accelerated and annihilated. The resulting energy causes new particles to form, which can be observed via detectors that surround the point of collision. However, the vast majority of events do not produce particles of interest, such as the top quark or Higgs boson. For example, though the Tevatron produces a very large number of events per hour, only approximately one of them results in a top quark, on average. Therefore, good data analysis depends on effective event selection, in which events producing particles of interest (signal) are separated from those producing other particles (background). Event selection is difficult because several types of background can mimic the signal's characteristic signature. Hence, event selection in high energy physics is an exciting challenge for machine learning. In this article, we compare two different approaches to this problem.
The first approach is based on supervised learning methods, which are used to train classifiers that distinguish signal from background. Such methods have already proven successful in similar event selection problems by training neural networks (Abazov et al., 2001, Acosta et al., 2005) or support vector machines (Whiteson and Naumann, 2003) to classify events as signal or background. This supervised approach is most effective in the narrow class of problems in which the classification accuracy of the event selector is closely correlated with the quality of the resulting data analysis and systematic uncertainties are minimal. However, top quark mass measurement and Higgs boson search exemplify a broader class of problems where higher classification accuracy does not necessarily result in superior analysis performance. Instead, the top quark mass measurement is more sensitive to the presence of some background events than others, in ways that are difficult to predict a priori. The Higgs boson search is limited by systematic uncertainties on the background events. Therefore, selectors that maximize classification accuracy may perform worse than those that (1) increase the quantity of signal by tolerating harmless background, (2) reduce the quantity of signal to eliminate disruptive background, or (3) minimize the impact of systematic uncertainties.
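The gap between classification accuracy and analysis quality can be illustrated with a small numerical sketch. The event counts below are hypothetical, and the figure of merit s / sqrt(b + (f b)^2), with f a fractional systematic uncertainty on the background, is a commonly used approximation that we assume here for illustration; it is not taken from this article.

```python
import math

# Hypothetical event counts per classifier-score bin (low score -> high score).
signal = [5, 10, 20, 30, 35]
background = [500, 200, 60, 25, 5]

def accuracy(cut):
    """Fraction of events classified correctly when bins >= cut are kept."""
    correct = sum(signal[cut:]) + sum(background[:cut])
    return correct / (sum(signal) + sum(background))

def significance(cut, syst=0.2):
    """Assumed figure of merit: s / sqrt(b + (syst * b)^2)."""
    s = sum(signal[cut:])
    b = sum(background[cut:])
    return s / math.sqrt(b + (syst * b) ** 2)

cuts = range(len(signal))  # every cut keeps at least one bin, so b > 0
best_accuracy_cut = max(cuts, key=accuracy)
best_significance_cut = max(cuts, key=significance)
print(best_accuracy_cut, best_significance_cut)
```

With these toy numbers, the cut that maximizes accuracy keeps the two highest-score bins, whereas the cut that maximizes the assumed figure of merit keeps only the highest one, sacrificing signal to remove background.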
To find such selectors, we introduce a second, novel approach that uses stochastic optimization techniques. Rather than maximizing classification accuracy, this approach directly optimizes selectors for their true purpose: maximizing either the precision of top quark mass measurements or the sensitivity to the presence of the Higgs boson. Using NeuroEvolution of Augmenting Topologies (NEAT) (Stanley and Miikkulainen, 2002), an evolutionary method for training neural networks, we optimize event selectors that operate either in conjunction with supervised classifiers or in lieu of them.
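The idea of optimizing a selector directly for an analysis-level objective can be sketched with a much simpler method than NEAT. The example below is an elitist (1+1) evolutionary search over the weights of a fixed linear selector; unlike NEAT, it does not evolve network topology. The synthetic events, the stand-in objective s / sqrt(s + b), and all function names are assumptions made for illustration only.

```python
import math
import random

def make_events(rng, n=400):
    """Toy events: (feature_1, feature_2, is_signal). Purely synthetic."""
    events = []
    for _ in range(n):
        is_signal = rng.random() < 0.2
        center = 1.0 if is_signal else 0.0
        events.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0), is_signal))
    return events

def figure_of_merit(weights, events):
    """Stand-in analysis objective s / sqrt(s + b) for the selected events."""
    w1, w2, bias = weights
    s = b = 0
    for x1, x2, is_signal in events:
        if w1 * x1 + w2 * x2 + bias > 0:  # event passes the selector
            if is_signal:
                s += 1
            else:
                b += 1
    return s / math.sqrt(s + b) if s + b > 0 else 0.0

def evolve(events, rng, steps=500, sigma=0.3):
    """Elitist (1+1) search: a mutant replaces the parent only if it scores higher."""
    parent = [rng.gauss(0.0, 1.0) for _ in range(3)]
    parent_fit = figure_of_merit(parent, events)
    history = [parent_fit]
    for _ in range(steps):
        child = [w + rng.gauss(0.0, sigma) for w in parent]
        child_fit = figure_of_merit(child, events)
        if child_fit > parent_fit:
            parent, parent_fit = child, child_fit
        history.append(parent_fit)
    return parent, history

rng = random.Random(0)
events = make_events(rng)
best_weights, history = evolve(events, rng)
print(f"figure of merit: {history[0]:.3f} -> {history[-1]:.3f}")
```

Because the objective depends only on counts of selected events, it is piecewise constant in the weights and offers no useful gradient. A search of this kind needs only the ability to evaluate a candidate selector, which is what makes direct optimization of the analysis objective possible.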
This article presents experiments that compare the performance of manually designed heuristic selectors to neural network selectors trained with backpropagation (Rumelhart et al., 1986) or NEAT. In both top quark mass measurement and Higgs boson search, the learning methods perform significantly better than the heuristic approach, confirming that machine learning can greatly benefit event selection in high energy physics. Furthermore, the NEAT selectors yield by far the best analyses, demonstrating the advantage of the stochastic optimization approach in an application area previously assumed the province of supervised methods.
Finally, this article describes a detailed case study in which the best performing selector is applied to real data gathered by the CDF II detector at the Tevatron. The result is a substantial reduction in uncertainty in the top quark mass measurement, yielding by far the most precise measurement of this type to date. Obtaining a similar reduction in uncertainty would otherwise require producing many more collisions at great expense. Hence, this new approach to event selection has already contributed substantially to our knowledge of the top quark's mass and our understanding of the larger questions upon which it sheds light.
The approaches we propose also offer potential benefits beyond the analysis of data gathered at the Tevatron. The future of high energy physics lies with the Large Hadron Collider (LHC), a new accelerator currently under construction. Once in operation, the LHC will produce collisions at much higher frequency and energy than the Tevatron. The resulting torrent of data will require highly effective event selection, which can potentially be aided by the methods presented in this article.
The remainder of this paper is organized as follows. Section 2 overviews the process of producing, detecting, selecting, and analyzing events in modern high energy accelerators. Sections 3 and 4 describe methods for performing event selection with the aid of supervised learning and stochastic optimization, respectively. Sections 5 and 6 compare the performance of these methods on the problems of top quark mass measurement and Higgs boson search, respectively. Section 7 describes the use of our optimized selector to produce a new top quark mass measurement with data from the Tevatron. Section 8 discusses the implications of these results, Section 9 outlines our plans for future work, and Section 10 concludes.
Section snippets
Events in high energy physics
This section provides an overview of the process of producing, detecting, selecting, and analyzing events at modern high energy accelerators, with particular focus on measurement of the top quark's mass and the search for the Higgs boson.
Supervised learning for event selection
This section describes how final event selection can be performed with the aid of supervised learning methods that maximize classification accuracy. In a narrow class of problems, these methods are equivalent to a likelihood ratio test, which the Neyman–Pearson Lemma (Neyman and Pearson, 1933) suggests is optimal. This approach is standard in the physics community and serves as a point of comparison for the optimization approach that will be described in Section 4.
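For reference, the likelihood ratio test can be written as follows. The notation is ours rather than the article's: x denotes the measured features of an event, p(x | S) and p(x | B) are the signal and background densities, and π_S and π_B are the prior fractions of signal and background.

```latex
\Lambda(x) = \frac{p(x \mid S)}{p(x \mid B)}, \qquad
\text{select the event if } \Lambda(x) > k_\alpha, \qquad
P\bigl(\Lambda(x) > k_\alpha \mid B\bigr) = \alpha .

P(S \mid x) = \frac{\pi_S \, \Lambda(x)}{\pi_S \, \Lambda(x) + \pi_B}
```

The lemma states that, among all selectors that accept a fraction of at most α of the background, the one that thresholds Λ(x) accepts the largest fraction of signal. The second line shows that the posterior P(S | x) is a monotonically increasing function of Λ(x), so thresholding the output of a classifier that estimates this posterior well is equivalent to thresholding the likelihood ratio.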
Supervised methods can be used
Optimization methods for event selection
The supervised approach described above is most effective in the narrow class of problems in which the classification accuracy of the event selector is closely correlated with the quality of the resulting data analysis. In this case, the Neyman–Pearson Lemma suggests that such techniques are optimal.
However, top quark mass measurement and Higgs boson search exemplify a broader class of problems where higher classification accuracy does not necessarily result in better analysis. Instead, the
Comparative results for top quark mass measurement
To assess the efficacy of the methods presented in Sections 3 and 4, we evaluated each one on the top quark mass measurement problem, averaging performance over 10 independent runs, using 10,000 simulated events. These runs were conducted using 10-fold cross validation: in each run, 75% of the events are selected at random for training and the remaining 25% reserved for testing.
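The evaluation protocol described above (10 runs, each with a random 75%/25% split) can be sketched as follows. This is a minimal illustration, not the authors' code: `repeated_holdout` and the placeholder `evaluate` callback are hypothetical names. The sketch implements the random-split procedure as described; a strict k-fold scheme would instead partition the events into disjoint folds.

```python
import random
import statistics

def repeated_holdout(events, evaluate, runs=10, train_fraction=0.75, seed=0):
    """Average an evaluation score over independent random train/test splits."""
    rng = random.Random(seed)
    n_train = int(len(events) * train_fraction)
    scores = []
    for _ in range(runs):
        shuffled = events[:]          # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        scores.append(evaluate(train, test))
    return statistics.mean(scores), statistics.stdev(scores)

# Placeholder events and a placeholder score (the test fraction), used only
# to show the call pattern; a real evaluate() would train a selector on
# `train` and measure analysis quality on `test`.
events = list(range(10000))
mean_score, spread = repeated_holdout(
    events, lambda train, test: len(test) / (len(train) + len(test))
)
print(mean_score, spread)
```

Reporting the spread across runs alongside the mean gives a rough indication of whether differences between selectors exceed the run-to-run variation.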
Comparative results for Higgs boson search
We also evaluated the relative performance of the supervised learning and optimization strategies on the problem of Higgs boson search. As before, each run was conducted using 10-fold cross validation with a 75%/25% split between training and testing.
Application to Tevatron events
The top quark was first observed in 1995 at the Fermilab Tevatron, which remains the sole accelerator with enough energy to produce it for direct study. The top quark's mass is extraordinary on the scale of fundamental particles, nearly 40 times larger than the second most massive quark and comparable to the mass of a gold atom. Its enormous mass remains a puzzle from a theoretical standpoint, so physicists have focused great effort on probing the precious few top quarks collected in the five
Discussion
The results presented in Sections 5 and 6 confirm the conclusion of earlier work (Abazov et al., 2001, Acosta et al., 2005, Whiteson and Naumann, 2003) that machine learning methods can substantially outperform heuristic event selectors. However, previous results demonstrated only that learned selectors had higher classification accuracy, while these results directly verify that they can improve the quality of the resulting analysis. More importantly, these results confirm the advantage of
Future work
We believe that the intersection of machine learning and high energy physics is a highly promising but under-explored research area. Thus, we hope that the results presented in this article will mark only the beginning of a long and fruitful effort to bridge the gap between these two disciplines. Such research can be a boon to both fields, by providing machine learning researchers with a challenging, realistic proving ground for their methods and arming high energy physicists with the tools to
Conclusion
This article describes the use of machine learning methods to aid the process of selecting events at high energy accelerators for the purpose of studying the fundamental nature of matter and its interactions. First, we apply supervised learning methods, which have succeeded previously in similar tasks. Second, we present a new approach that uses stochastic optimization techniques to directly search for selectors that maximize either the precision of top quark mass measurements or the
Acknowledgments
The authors thank Peter Stone, Risto Miikkulainen, Ken Stanley, Razvan Bunescu, Misha Bilenko, and Gwenn Englebienne for their comments and suggestions on this research.
References (52)
Search for single top production at DZero using neural networks. Physics Letters B (2001)
GEANT4. Nuclear Instruments and Methods in Physics Research A (2003)
et al. Selection of relevant features and examples in machine learning. Artificial Intelligence (1997)
et al. PhysicsGP: a genetic programming approach to event selection. Computer Physics Communications (2005)
Partial symmetries of weak interactions. Nuclear Physics (1961)
High-energy physics event generation with PYTHIA 6.1. Computer Physics Communications (2001)
et al. Support vector regression as a signal discriminator in high energy physics. Neurocomputing (2003)
Cross section constrained top quark mass measurement from dilepton events at the Tevatron. Physical Review Letters (2008)
Measurement of the top quark mass with dilepton events selected using neuroevolution at CDF. Physical Review Letters (2009)
Observation of top quark production in collisions with the Collider Detector at Fermilab. Physical Review Letters (1995)