Automatizing the online filter test management for a general-purpose particle detector

https://doi.org/10.1016/j.cpc.2010.10.003

Abstract

This paper presents a software environment to automatically configure and run online triggering and dataflow farms for the ATLAS experiment at the Large Hadron Collider (LHC). It supports a broad set of users with distinct levels of knowledge about the online triggering system, ranging from casual testers to final system deployers. This level of automation improves the overall ATLAS TDAQ workflow for software and hardware tests and speeds up system modifications and deployment.

Introduction

The Large Hadron Collider (LHC) [1], at CERN, is the world's newest and most powerful tool for particle physics research. It is designed to collide proton beams at a centre-of-mass energy of 14 TeV and an unprecedented luminosity of 10³⁴ cm⁻² s⁻¹. It can also collide heavy (Pb) ions with an energy of 2.8 TeV per nucleon at a peak luminosity of 10²⁷ cm⁻² s⁻¹. During nominal operation, the LHC will provide collisions at a 40 MHz rate. To analyze the sub-products resulting from these collisions, state-of-the-art detectors are placed around the LHC collision points.

Among the LHC detectors, ATLAS (A Toroidal LHC Apparatus) [2] is the largest. ATLAS comprises multiple, fine-granularity sub-detectors for tracking, calorimetry and muon detection (see Fig. 1). Due to its high-resolution sub-detectors, 1.5 MB of information is expected per recorded event. At the 40 MHz machine clock, this corresponds to a data rate of 60 TB/s (1.5 MB/event × 40 MHz), mostly consisting of known physics processes. To achieve a sizeable reduction in the recorded event rate, an online triggering system, responsible for retaining only the most interesting physics channels, is put in place. This system is built on a three-level architecture [3]. The first-level trigger performs coarse-granularity analysis using custom hardware. The next two levels (High Level Triggers – HLT) operate on full detector granularity and use complex data analysis techniques to further reduce the event rate [4].

The HLT software is developed using high-level programming languages and is executed on off-the-shelf workstations running Linux and interconnected through Gigabit Ethernet. Data are provided to the HLT by specialized Data Acquisition (DAQ) modules known as Readout Systems (ROS). Together, the HLT and DAQ are referred to as the Trigger and Data Acquisition (TDAQ) system. The TDAQ comprises thousands of devices and has been designed to be highly configurable. To provide the run configuration of all TDAQ devices, a dedicated database system was developed [5].
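To give a concrete, if simplified, idea of the information such a database must hold for each application, the sketch below models one Readout System entry in Python; the class and field names are illustrative assumptions and do not reproduce the actual OKS schema.

    from dataclasses import dataclass

    # Illustrative model of a per-application configuration record; the field
    # names are hypothetical and do not reproduce the actual OKS schema.
    @dataclass
    class ApplicationConfig:
        name: str        # unique application identifier
        host: str        # execution node on which the application runs
        queue_size: int  # depth of the internal request/event queue
        protocol: str    # communication protocol used towards its peers

    # A full TDAQ run requires thousands of such records, one per application.
    ros_example = ApplicationConfig(name="ROS-001", host="pc-ros-001",
                                    queue_size=100, protocol="tcp")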

While there is a single production version of the TDAQ running at the ATLAS experiment, the TDAQ software itself can be run on many different testbeds for testing and evolution [6], [7]. Setting device and application parameters for a given TDAQ run is a complex task that requires very specialized knowledge. Misconfigurations may result in TDAQ operation errors, unnecessarily wasting project resources. Additionally, the TDAQ project involves more than 400 collaborators with different testing interests: some users simply want to execute TDAQ with standard parameters, while developers may want to test special cases requiring more specific adjustments.

To help all TDAQ users and final deployers, a scriptable environment was designed. This environment was developed following a multi-layer approach. Users with limited knowledge of TDAQ will feel more comfortable working at the highest layer, as it demands only a few high-level parameters. On the other hand, experts may profit from the lower layers, since they allow the configuration of very specific parameters for customized TDAQ operation.
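A minimal sketch of how such a layered interface could look is given below; the function and parameter names are hypothetical rather than the actual PartitionMaker API, but they illustrate how a single high-level call relies on lower-level building blocks that experts may call or override directly.

    # Hypothetical sketch of a layered configuration interface; the names do
    # not correspond to the real PartitionMaker API.

    def default_l2_application(index):
        """Lower layer: experts can call or override this directly to tune
        application-level parameters such as host placement or queue sizes."""
        return {"name": "L2PU-%03d" % index, "host": "localhost", "queue_size": 100}

    def default_ef_application(index):
        return {"name": "EFPU-%03d" % index, "host": "localhost", "queue_size": 100}

    def make_partition(name, num_l2=1, num_ef=1):
        """Highest layer: a casual user supplies only a few parameters."""
        return {"name": name,
                "L2": [default_l2_application(i) for i in range(num_l2)],
                "EF": [default_ef_application(i) for i in range(num_ef)]}

    # Casual user: one call with default values.
    partition = make_partition("test_partition")

    # Expert user: drop to the lower layer and customize a specific application.
    partition["L2"].append({"name": "L2PU-custom", "host": "pc-l2-042",
                            "queue_size": 512})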

Although the proposed environment automates the TDAQ configuration task, its execution still needs to be started manually, and running parameters (accept rate, number of processing cores running, etc.) must be observed from the screen. This can be a tedious task, as tests of new software releases or hardware upgrades must be performed on a regular basis for different running setups. Therefore, an automated TDAQ execution environment was also developed. This environment is capable of automatically setting up TDAQ and of collecting monitoring information about its execution for a given amount of time. At the end, post-processing analysis can take place and results (graphics and histograms) can be generated for further analysis.
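The automated cycle described above (set up, monitor for a fixed amount of time, post-process) can be summarized by the simplified sketch below; the helper functions are self-contained placeholders for the real setup and monitoring tools, not their actual interfaces.

    import random
    import time

    # Placeholders standing in for the real setup and monitoring tools.
    def start_tdaq(partition):
        print("starting", partition)

    def stop_tdaq(partition):
        print("stopping", partition)

    def read_monitoring(partition):
        # The real tool would read, e.g., the accept rate and the number of
        # processing cores currently running.
        return {"accept_rate_hz": random.uniform(90.0, 110.0)}

    def post_process(samples):
        rates = [s["accept_rate_hz"] for s in samples]
        return {"mean_accept_rate_hz": sum(rates) / len(rates)}

    def run_automated_test(partition, duration_s=60, sample_period_s=5):
        start_tdaq(partition)
        samples, t0 = [], time.time()
        while time.time() - t0 < duration_s:   # collect monitoring data
            samples.append(read_monitoring(partition))
            time.sleep(sample_period_s)
        stop_tdaq(partition)
        return post_process(samples)           # e.g. produce plots/histograms

    print(run_automated_test("test_partition", duration_s=10, sample_period_s=2))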

The remainder of this paper is organized as follows. Section 2 details the trigger and data acquisition modules. Section 3 describes the configuration database service used to provide configuration parameters to the TDAQ infrastructure. Section 4 presents the environment developed for configuring the TDAQ infrastructure. Section 5 then describes the environment created to automatically operate the TDAQ system. Conclusions are drawn in Section 6.

Online triggering system

As already mentioned, ATLAS requires an online triggering system to cope with the huge stream of data generated by LHC collisions. Fig. 2 shows this triggering system in detail.

The first level (L1) has a latency time of only 2.5 μs per event. To achieve the required processing speed, this level operates only on calorimeter and fast muon detection information. The L1 system will reduce the event rate to 75 kHz. Also, this level is responsible for selecting the detector regions effectively

Configuration database service

The TDAQ description given in Section 2 makes its complexity clear. Moreover, TDAQ was designed to be highly configurable, aiming at supporting a broad range of running purposes. As a result, in order to run a given TDAQ setup, one must provide the running configuration parameters for each TDAQ module. These configuration values (queue sizes, execution nodes, communication protocol, etc.) must be supplied, during TDAQ setup, for thousands of different applications. Therefore, there

Automatic database configuration

When a TDAQ execution is envisaged, several considerations must be analyzed prior to the generation of the configuration file, for instance (a sketch of such consistency checks is given after the list):

  • Which TDAQ sections (L2, EB, EF) to enable? If EF is desired, for instance, EB must be present as well.

  • Which modules per section should be used? If data are coming from the detectors, then the RoIB must be enabled. If Monte Carlo simulation is being used, the RoIB may be disabled.

  • How many applications of each module should be used? It is an inconsistency to
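A minimal sketch of how consistency rules of this kind could be checked before the configuration file is generated is shown below; the function and its arguments are hypothetical, and only the rules themselves encode the considerations listed above.

    # Illustrative consistency checks for a requested TDAQ configuration.
    def check_request(sections, modules, data_from_detector):
        errors = []
        # The Event Filter relies on event building: EF without EB is inconsistent.
        if "EF" in sections and "EB" not in sections:
            errors.append("EF requested without EB")
        # The RoIB must be present when real detector data are read out; with
        # Monte Carlo input it may be left out.
        if data_from_detector and "RoIB" not in modules:
            errors.append("detector data requested but RoIB disabled")
        return errors

    print(check_request({"L2", "EF"}, set(), data_from_detector=True))
    # -> ['EF requested without EB', 'detector data requested but RoIB disabled']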

Automatic testing of the ATLAS trigger system

As in most detector systems, ATLAS makes use of a Finite State Machine (FSM) to drive its components into the running state [17]. Control is handled by specialized applications that form a structured control tree. Commands are passed via a graphical user interface to the run control root and broadcast down to the several nodes of the system in an orderly manner (a sketch of this hierarchical broadcast is given after the list below). These are the relevant state transitions through which the system must be driven before run data can be collected:

  • 1. Boot: at this
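The hierarchical broadcast of such commands can be illustrated with the short sketch below; the class, node and state names are hypothetical and do not correspond to the actual run control implementation.

    # Illustrative run control tree: a command sent to the root controller is
    # applied locally and then broadcast down to the children, in order.
    class Controller:
        def __init__(self, name, children=()):
            self.name = name
            self.children = list(children)
            self.state = "NONE"

        def transition(self, command, new_state):
            self.state = new_state
            print("%s: %s -> %s" % (self.name, command, new_state))
            for child in self.children:
                child.transition(command, new_state)

    # A small control tree: root -> segment controllers -> leaf applications.
    root = Controller("RootController", [
        Controller("L2Segment", [Controller("L2PU-001"), Controller("L2PU-002")]),
        Controller("EFSegment", [Controller("EFPU-001")]),
    ])

    # Driving the first transition mentioned above; further transitions follow
    # the same broadcast pattern.
    root.transition("boot", "BOOTED")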

Conclusions

Being highly configurable, the TDAQ infrastructure may require specialized knowledge for its setup and deployment. A flexible configuration database service (OKS) was put in place to cope with its stringent conditions.

The PartitionMaker is an OKS-based configuration environment with Python bindings. Thanks to its multi-layer architecture, users with different levels of expertise can quickly achieve complex configuration schemes, supported by the built-in expert knowledge. Functionalities

Acknowledgements

The authors would like to thank CAPES, CNPq, FAPERJ, FINEP (Brazil), CERN and the European Union for their financial support. We would also like to thank the ATLAS TDAQ collaboration for discussions concerning this work.

References (18)

  • K. Kordas, The ATLAS data acquisition and trigger: concept, design and status, Nucl. Phys. B Proc. Suppl. (2007)
  • R. Brun et al., ROOT – an object oriented data analysis framework, Nucl. Instrum. Methods Phys. Res. (1997)
  • L. Evans et al., LHC machine, Journal of Instrumentation (2008)
  • The ATLAS experiment at the CERN Large Hadron Collider, Journal of Instrumentation (2008)
  • I. Riu, Integration of the trigger and data acquisition systems in ATLAS, IEEE Trans. Nucl. Sci. (2008)
  • R. Jones et al., The OKS persistent in-memory object manager, IEEE Trans. Nucl. Sci. (1998)
  • D. Burckhart-Chromek, et al., Testing on a large scale: Running the ATLAS data acquisition and high level trigger...
  • G. Unel, et al., Studies with the ATLAS trigger and data acquisition pre-series setup, in: Proceedings of the Computing...
  • G. Aielli, et al., Status of the ATLAS level-1 central trigger and muon barrel trigger and first results from...
