FCCAnalyses
A Framework for FCC Physics Performance Studies
Juraj Smieško (CERN)
PyHEP.dev 2024
Aachen, Germany
29 August 2024
Future Circular Collider
Energy and luminosity upgrade in an integrated program
FCC-ee (Z, WW, H, ttbar):
Highest luminosities at Z, W, ZH among
proposed Higgs and EW factories with
indirect discovery potential up to ~ 70 TeV
FCC-hh (~100 TeV):
Direct exploration of next energy frontier (~ x10 LHC)
and unparalleled measurements
Feasibility Study Report in 2025
More than 150 institutes from 30 countries already involved
Set of common software packages, tools, and standards for
different Detector/Collider Concepts
Common for FCC, CLIC/ILC, CEPC, EIC, …
Individual participants can adjust their stack
Main cornerstones:
Data processing framework:
Gaudi
Event data model:
EDM4hep
Detector description:
DD4hep
Software distribution:
Spack
The edges of different parts of the analysis stack are becoming
better and better defined.
Key4hep is an effort of several institutions to develop a common
software stack.
EDM4hep I.
Describes event data with the set of standard objects.
Specification in a single YAML file
Generated with the help of
Podio
EDM4hep II.
#------------- CalorimeterHit
edm4hep::CalorimeterHit:
  Description: "Calorimeter hit"
  Author: "EDM4hep authors"
  Members:
    - uint64_t cellID // detector specific (geometrical) cell id
    - float energy [GeV] // energy of the hit
    - float energyError [GeV] // error of the hit energy
    - float time [ns] // time of the hit
    - edm4hep::Vector3f position [mm] // position of the hit in world coordinates
    - int32_t type // type of hit
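To make the member list above concrete, here is a minimal plain-Python sketch of a record with the same fields. It is only an illustration: the class name CalorimeterHitSketch and the snake_case field names are hypothetical, and the real classes are generated by Podio from the YAML definition rather than written by hand.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CalorimeterHitSketch:
    """Hypothetical stand-in mirroring the members of edm4hep::CalorimeterHit."""

    cell_id: int = 0            # uint64_t cellID: detector-specific cell id
    energy: float = 0.0         # energy of the hit [GeV]
    energy_error: float = 0.0   # error of the hit energy [GeV]
    time: float = 0.0           # time of the hit [ns]
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)  # world coordinates [mm]
    hit_type: int = 0           # int32_t type: type of hit


# Fields that are not set keep their zero defaults.
hit = CalorimeterHitSketch(cell_id=42, energy=1.5, time=3.2, position=(10.0, 0.0, 250.0))
```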
Current version: v0.99.0
Objects can be extended or new ones created
Bi-weekly discussion:
Indico
Podio generates the event data model and serves
as the I/O layer
Generates EDM from YAML files
Employs plain-old-data (POD) data structures
I/O machinery consists of three layers
POD Layer - actual data structures
Object Layer - helps resolve the relations
User Layer - full fledged EDM objects
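The three layers can be pictured with a loose plain-Python analogy. This is a sketch under stated assumptions, not Podio's generated code: all class names (HitData, ClusterData, ClusterCollection, Cluster) are hypothetical, and relations are modelled simply as index ranges into a second buffer.

```python
from dataclasses import dataclass
from typing import List


# POD layer (analogy): flat records, relations stored only as index ranges.
@dataclass
class HitData:
    energy: float


@dataclass
class ClusterData:
    energy: float
    hits_begin: int  # first index into the hit buffer
    hits_end: int    # one past the last index into the hit buffer


# Object layer (analogy): owns the flat buffers and resolves the relations.
class ClusterCollection:
    def __init__(self, clusters: List[ClusterData], hits: List[HitData]):
        self._clusters = clusters
        self._hits = hits

    def __len__(self):
        return len(self._clusters)

    def __getitem__(self, index):
        return Cluster(self, index)

    def hits_of(self, index):
        record = self._clusters[index]
        return self._hits[record.hits_begin:record.hits_end]


# User layer (analogy): a light handle that looks like a complete object.
class Cluster:
    def __init__(self, collection, index):
        self._collection = collection
        self._index = index

    @property
    def energy(self):
        return self._collection._clusters[self._index].energy

    @property
    def hits(self):
        return self._collection.hits_of(self._index)


hits = [HitData(1.0), HitData(2.0), HitData(0.5)]
clusters = [ClusterData(3.0, 0, 2), ClusterData(0.5, 2, 3)]
collection = ClusterCollection(clusters, hits)
first = collection[0]
first_hit_energy_sum = sum(h.energy for h in first.hits)
```

The point of the split is that the flat records can be written and read in bulk, while the user only ever touches the handle.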
Supports multiple backends:
Current version: 1.0.1
Podio Reader
Constructs the EDM4hep objects for the user
Example usage of the Podio Reader in Python:
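The original code from this slide is not preserved here, so the following is a minimal sketch of how such a reader loop might look. It assumes the Python bindings shipped with Podio (podio.root_io.Reader), the usual "events" category of EDM4hep files, and that the collection's objects expose a getEnergy() accessor. The file path and collection name are supplied by the caller, and both function names are hypothetical.

```python
def sum_hit_energies(frames, collection_name):
    """Sum getEnergy() over all objects of one collection in all frames."""
    total = 0.0
    for frame in frames:
        for hit in frame.get(collection_name):
            total += hit.getEnergy()
    return total


def read_with_podio(path, collection_name):
    """Open a ROOT file with the Podio reader and sum the hit energies.

    Needs a Key4hep environment; the import is kept local so that the
    helper above stays usable without one.
    """
    from podio.root_io import Reader

    reader = Reader(path)
    # "events" is assumed to be the frame category holding the event data.
    return sum_hit_energies(reader.get("events"), collection_name)
```

Inside a Key4hep environment, read_with_podio would be called with the path of an EDM4hep ROOT file and the name of a calorimeter hit collection from that file.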
FCCAnalyses Overview
Analysis framework built on top of ROOT RDataFrame with
input from EDM4hep
Dependent on Key4hep Stack
Manages input samples
Has standard library of functions/closures
Runs the dataframe
Helps with histograms/plots
Registry for the analyses
The following slides explain each of these points in more detail.
FCCAnalyses script
Typical analysis divided into several stages
Results between stages stored in ROOT files
Run the script with: fccanalysis run ana_script.py
Input samples
FCCAnalyses manages input ROOT files for the user
Analysis operates on named samples (by process name)
Pre-generated samples identified with production tag
Registry of available samples hosted on the FCC Physics
Events website
Local samples require input directory path
Process dictionary allows further parameters: fraction,
chunks, ...
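As an illustration of the points above, a process dictionary could look like the sketch below. The sample names, the production tag value and the variable names (process_list, prod_tag) are assumptions for illustration and should be checked against the installed FCCAnalyses version and the sample registry.

```python
# Samples are addressed by process name; each entry may carry extra parameters.
process_list = {
    # Process only a small fraction of this sample.
    "p8_ee_ZZ_ecm240": {"fraction": 0.005},
    # Process half of this sample, split into two chunks.
    "p8_ee_WW_ecm240": {"fraction": 0.5, "chunks": 2},
    # No parameters: the defaults of the framework apply.
    "p8_ee_ZH_ecm240": {},
}

# Production tag selecting a pre-generated campaign (illustrative value).
prod_tag = "FCCee/winter2023/IDEA/"

# Every fraction that is given must lie in the interval (0, 1].
fractions_ok = all(
    0.0 < params.get("fraction", 1.0) <= 1.0 for params in process_list.values()
)
```

For local samples, an input directory path would be given in place of the production tag.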
Analyzers
Collection of standard functions/closures
Users define their dataframe in a class method
Output variables registered in a list
Additional analyzers are JIT-compiled
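A minimal sketch of the two user-facing methods follows. The method names (analyzers, output), the column names and the analyzer FCCAnalyses::ReconstructedParticle::get_pt are written from memory of the FCCAnalyses examples and should be treated as assumptions. The dataframe argument only needs to provide the RDataFrame methods Define and Filter.

```python
class Analysis:
    """Sketch of the dataframe definition of one analysis stage."""

    def analyzers(self, dframe):
        """Attach new columns and a selection to the incoming dataframe."""
        dframe2 = (
            dframe
            # Transverse momenta of all reconstructed particles (assumed analyzer).
            .Define(
                "reco_pt",
                "FCCAnalyses::ReconstructedParticle::get_pt(ReconstructedParticles)",
            )
            # Number of reconstructed particles in the event.
            .Define("n_reco", "ReconstructedParticles.size()")
            # Keep only events with at least one reconstructed particle.
            .Filter("n_reco > 0")
        )
        return dframe2

    def output(self):
        """List of the columns to be written to the output file."""
        return ["reco_pt", "n_reco"]
```

Since each call only appends to the computation graph, nothing is evaluated inside analyzers; the framework triggers the event loop later.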
Running of the RDF
Execution of the dataframe hidden from the user
User can affect how the dataframe runs with global
attributes
Analysis can run locally or on HTCondor
Histograms/Plots
Last two stages of the analysis
User specifies output histograms
Histograms are combined into plots
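A histogram specification for the later stages might be sketched as a dictionary like the one below. The variable name histoList and the keys (name, title, bin, xmin, xmax) follow the convention recalled from FCCAnalyses example scripts and are assumptions here. The column reco_pt and the helper bin_width are hypothetical.

```python
# One entry per output histogram, keyed by a label chosen by the user.
histoList = {
    "reco_pt": {
        "name": "reco_pt",        # column to histogram
        "title": "p_{T} [GeV]",   # axis title
        "bin": 100,               # number of bins
        "xmin": 0.0,              # lower edge of the axis
        "xmax": 200.0,            # upper edge of the axis
    },
}


def bin_width(spec):
    """Width of one bin for a uniform binning specification."""
    return (spec["xmax"] - spec["xmin"]) / spec["bin"]


reco_pt_width = bin_width(histoList["reco_pt"])
```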
Key4hep integration
FCCAnalyses is tied to the Key4hep stack
Distributed as a Spack package
Key4hep environment needed for running
EDM4hep objects read directly from ROOT files
Building from source expects Key4hep
People pin their analysis to a particular stack version
Integration with Existing Tools
Boundary between reconstruction and analysis blurred
Especially for full-sim
Plan: Develop algorithm on analysis side, then
move to reconstruction
Many C++ tools/libraries created over the years
Most are integrated into the Key4hep stack
At the moment we have:
ROOT — together with RDataFrame
ACTS — track reconstruction tools
ONNX — neural network exchange format
FastJet — jet finding package
DD4hep — detector description
Delphes — fast simulations
Analysis registry
Central registry for the FCC-ee analyses
FCC-ee analyses are listed in the
FCCeePhysicsPerformance
repository
Experimental: one can create an analysis
package for analysis-specific code
Conclusions & Outlook
The combination of EDM4hep and RDataFrame works well for the FSR
Physics Studies based on Delphes Fastsim
Performant
Possibility to integrate a range of existing (C++) libraries
Started focusing on the Geant4 Fullsim detector studies
Writing of an analysis without compilation preferred
Access to the detector description through the framework
Better integration into Python tooling
ML integration needs more thought
More complex collection relationships are complicated to handle
Bi-weekly FCC meeting focused on analysis framework development,
but more importantly on the analysis tools