Tesla: An application for real-time data analysis in High Energy Physics
Introduction
The LHCb experiment, one of the four main detectors situated on the Large Hadron Collider at CERN, Geneva, specialises in precision measurements of beauty and charm hadron decays. At the nominal LHCb luminosity during 2012 data taking at 8 TeV, around 30 k beauty and 600 k charm hadron pairs pass through the LHCb detector each second. Each recorded collision is defined as an event that can possibly contain a decay of interest. The efficient selection of beauty and charm decays from the proton–proton collisions occurring each second is a significant Big Data challenge.
An innovative feature of the LHCb experiment is its approach to Big Data in the form of the High Level Trigger (HLT), which is split into two components with a buffer stage between them [1]. The HLT is a software application, executed on an Event Filter Farm (EFF), that is designed to reduce the event rate from its input value of 1 M events per second. The EFF is a computing cluster consisting of 1800 server nodes, with a combined storage space of 5.2 PB, which can accommodate up to two weeks of LHCb data taking [2] in nominal conditions. The HLT application reconstructs the particle trajectories of the event in real time, where real time is defined as the interval between the collision in the detector and the moment the data are sent to permanent storage. The event reconstruction in the EFF is denoted as the online reconstruction.
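The EFF figures quoted above can be turned into two rough derived quantities: the storage available per node and the mean rate at which the buffer would fill if it reached capacity after exactly two weeks. The sketch below is a back-of-envelope illustration only. The decimal unit convention, the even spread of storage across nodes and the constant fill rate are assumptions, and the function names are hypothetical.

```python
# Back-of-envelope numbers for the Event Filter Farm (EFF) buffer.
# Inputs from the text: 1800 server nodes, 5.2 PB of storage, two weeks.
# Assumptions (not from the text): decimal units (1 PB = 1e15 bytes),
# storage spread evenly over the nodes, buffer filled at a constant rate.

N_NODES = 1800
TOTAL_STORAGE_BYTES = 5.2e15
BUFFER_SECONDS = 14 * 24 * 3600  # two weeks expressed in seconds


def storage_per_node_tb(total_bytes=TOTAL_STORAGE_BYTES, n_nodes=N_NODES):
    """Mean storage per node in terabytes (1 TB = 1e12 bytes)."""
    return total_bytes / n_nodes / 1e12


def mean_fill_rate_gb_per_s(total_bytes=TOTAL_STORAGE_BYTES,
                            seconds=BUFFER_SECONDS):
    """Mean write rate in GB/s that fills the buffer in the given time."""
    return total_bytes / seconds / 1e9


if __name__ == "__main__":
    print(f"storage per node: {storage_per_node_tb():.2f} TB")
    print(f"mean fill rate:   {mean_fill_rate_gb_per_s():.2f} GB/s")
```

Under these assumptions each node holds just under 3 TB, and the farm as a whole absorbs a few GB per second on average. The actual write rate depends on the LHC duty cycle and on the trigger configuration.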
In Run-I (2010–2012) of the LHC, the LHCb experiment used a processing model in which all events accepted by the HLT were sent to permanent offline storage containing all raw information from the detector. An additional event reconstruction performed on the LHC Computing Grid [3], denoted as the offline reconstruction, recreates particles in the event from the raw data using an improved detector calibration.
The upgrade of the computing infrastructure during the first long shutdown of the LHC (2013–2014), combined with efficient use of the EFF storage, provides resources for an online reconstruction in LHC Run-II (2015–2018) with a similar quality to that of the offline reconstruction. This is achieved through real-time automated calculation of the final calibrations of the sub-detectors.
With fully calibrated detector information available at the HLT level, it is possible to perform physics analyses with the information calculated by the HLT event reconstruction. In the Turbo stream, a compact event record is written directly from the trigger and is prepared for physics analysis by the Tesla application, which is named following the LHCb convention of naming applications after renowned physicists. This bypasses the offline reconstruction. Performing an event reconstruction of final analysis quality in real time, as the data arrive, has the power to transform the experimental approach to processing large quantities of data.
The data acquisition framework is described in Section 2. An overview of the upgrades to the trigger and calibration framework in Run-II is provided in Section 3. The implementation of the Turbo stream including that of the Tesla application is described in Section 4, followed by the future prospects of the data model in Section 5.
The LHCb detector, data acquisition, and trigger strategy
The LHCb detector is a forward arm spectrometer designed to measure the properties of the decays of b-hadrons with high precision [4]. Such decays are predominantly produced at small angles with respect to the proton beam axis [5]. This precision is obtained with an advanced tracking system consisting of a silicon vertex detector surrounding the interaction region (VELO), a silicon strip detector located upstream of the dipole magnet (TT), and three tracking stations downstream of the magnet,
Run-II data taking
During Run-I data taking, a buffer was created between the hardware trigger and the first software trigger level, deferring 20% of the events passing the hardware trigger and thereby utilising the EFF when the LHC was not providing proton collisions. The replacement of this buffer with the one between the two software levels introduces a complete split between an initial stage processing events directly from the L0 (HLT1) and an asynchronous stage (HLT2), ensuring the EFF is used optimally.
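The split between a synchronous first stage and an asynchronous second stage can be illustrated with a toy model in which HLT1 writes accepted events to a buffer during data taking and HLT2 drains that buffer whenever processing time is available. This is a minimal sketch and not the LHCb implementation. The event fields, the selection criteria and all function names are hypothetical, and an in-memory queue stands in for the disk buffer on the EFF.

```python
from collections import deque

# Toy model of a two-stage software trigger with a buffer in between.
# All selections and event fields below are illustrative assumptions.


def hlt1_accept(event):
    """Hypothetical fast first-stage selection."""
    return event["pt"] > 2.0


def hlt2_accept(event):
    """Hypothetical second-stage selection run on buffered events."""
    return event["displaced"]


def run_hlt1(events, buffer):
    """Synchronous stage: filter incoming events and buffer the accepted ones."""
    for event in events:
        if hlt1_accept(event):
            buffer.append(event)


def run_hlt2(buffer, max_events=None):
    """Asynchronous stage: process up to max_events buffered events."""
    selected = []
    processed = 0
    while buffer and (max_events is None or processed < max_events):
        event = buffer.popleft()
        processed += 1
        if hlt2_accept(event):
            selected.append(event)
    return selected


if __name__ == "__main__":
    events = [{"id": i, "pt": 0.5 * i, "displaced": i % 2 == 0}
              for i in range(10)]
    buffer = deque()
    run_hlt1(events, buffer)          # while collisions are delivered
    first = run_hlt2(buffer, 2)       # partial drain with limited CPU time
    rest = run_hlt2(buffer)           # full drain between fills
    print([e["id"] for e in first], [e["id"] for e in rest])
```

Because the second stage reads only from the buffer, it can continue after collisions stop, which is the property that allows the farm to be kept busy between fills.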
From
Implementation of the Turbo stream
The concept of the Turbo stream is to provide a framework by which physics analyses can be performed using the online reconstruction directly. The schematic data flow of the Turbo stream compared to the traditional data flow (represented by the Full stream) is depicted in Fig. 2.
In the traditional data flow, raw event data undergoes a complete reconstruction taking 24 h for 3 GB of input data on a typical batch node. This additional reconstruction was designed for a data processing model in
Outlook and future prospects
The use of the Turbo stream in 2015 proved to be successful. The first two published physics measurements from the LHCb experiment using data collected in the 2015 run were based on the Turbo stream [9], [10]. Around half of the HLT2 trigger lines currently persist the trigger reconstruction using the Turbo stream.
Summary
The Tesla toolkit allows for analyses to be based on the event reconstruction that is performed by the LHCb HLT. By design, the Tesla output files are compatible with existing analysis framework software with minimal changes required from analysts.
The event reconstruction performed by the HLT is of sufficient quality for use in physics analyses because the detector is aligned and calibrated in real time during data taking. This is in turn made possible through the upgraded computing
Acknowledgements
We thank the technical and administrative staff at the LHCb institutes. We acknowledge support from CERN and from the national agencies: CAPES, CNPq, FAPERJ and FINEP (Brazil); CNRS/IN2P3 (France); BMBF, DFG and MPG (Germany); INFN (Italy); FOM and NWO (The Netherlands); SNSF and SER (Switzerland); STFC (United Kingdom). We acknowledge the computing resources that are provided by CERN, IN2P3 (France), KIT and DESY (Germany), INFN (Italy), SURF (The Netherlands), PIC (Spain), GridPP (United
References (11)
Comput. Phys. Comm. (2001)
JINST (2013)
Int. J. Mod. Phys. A (2015)
et al., J. Phys. Conf. Ser. (2012)
JINST (2008)