Analysis Tools and Productions
Juraj Smieško (CERN)
22 April 2026
One unit of local work producing one output ROOT file. Encapsulates the full RDataFrame lifecycle.
run_fccanalysis.py
job = Job(input_file_list, analysis_chain, use_data_source=False)
job.setup_output(output_filepath, output_variables)
job.enable_progress_bar() # optional
job.restrict_events(n_events_max=1000, stride=2)
job.run() # triggers the RDataFrame event loop
job.finalize() # writes metadata to the output ROOT file
n_events, elapsed = job.get_benchmark_info()
raw-orig / sow-orig
raw-ttree
raw-init / sow-init
raw-restricted / sow-restricted
raw-final / sow-final
All counts written as TParameter objects into a fccana/ directory in the output ROOT file.
================================ SUMMARY ================================
Elapsed time (HH:MM:SS): 00:00:03
Number of events processed: 10,000
Events processed per second: 3,012
Sum of weights processed: 9,823
Number of result events: 4,217
Local number of events reduction factor: 0.4217
Total number of events available: 500,000
Total reduction factor: 0.008434
=========================================================================
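The two reduction factors in the summary can be reproduced from the printed counts. This is a minimal sketch, assuming the local factor is result events divided by events processed and the total factor is result events divided by events available. These definitions are inferred from the numbers shown rather than taken from the implementation, and the variable names are illustrative.

```python
# Counts copied from the example summary above.
n_processed = 10_000
n_result = 4_217
n_available = 500_000

# Assumed definitions; they reproduce the values printed in the summary.
local_reduction = n_result / n_processed
total_reduction = n_result / n_available

print(f"Local number of events reduction factor: {local_reduction:.4f}")
print(f"Total reduction factor: {total_reduction:.6f}")
```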
process → sample
prod_tag → campaign
<accelerator>/<season-and-year>/<detector>
<generator>_<process>_<energy>
validate_sample_list() function
Normalises and validates the per-sample dictionary from analysis scripts.
Deprecations (warns, but still works)
input_dir → input-dir
output → output-stem
Validated keys
input-dir
output-stem
fraction
chunks
stride (new)
n-events-max (new)
Used by both run_fccanalysis.py and batch.py.
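The normalisation step could look roughly like the sketch below. It is not the actual validate_sample_list() code. The helper names normalise_sample and normalise_sample_list are hypothetical, as is the choice to raise KeyError on unknown keys; only the key names and the warning text come from the slides.

```python
import warnings

# Deprecated key -> replacement, as listed on the slide.
DEPRECATED_KEYS = {"input_dir": "input-dir", "output": "output-stem"}
# Keys the slide lists as validated.
VALID_KEYS = {"input-dir", "output-stem", "fraction", "chunks",
              "stride", "n-events-max"}


def normalise_sample(name, settings):
    """Hypothetical helper: return a copy of one sample's settings
    with deprecated keys renamed and unknown keys rejected."""
    result = {}
    for key, value in settings.items():
        if key in DEPRECATED_KEYS:
            new_key = DEPRECATED_KEYS[key]
            # Warn, but keep working with the renamed key.
            warnings.warn(
                f'[DEPRECATED] Please use "{new_key}" instead of "{key}"!',
                DeprecationWarning,
                stacklevel=2,
            )
            key = new_key
        if key not in VALID_KEYS:
            # Assumption: unknown keys are treated as errors.
            raise KeyError(f'Unknown key "{key}" for sample "{name}"')
        result[key] = value
    return result


def normalise_sample_list(samples):
    """Hypothetical helper: normalise every sample in the dictionary."""
    return {name: normalise_sample(name, cfg) for name, cfg in samples.items()}


# Illustrative input using one deprecated key and a made-up path.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = normalise_sample_list(
        {"p8_ee_ZH_ecm240": {"input_dir": "/data", "chunks": 4}}
    )
print(cleaned)
```

Returning a new dictionary instead of mutating the input keeps the analysis script's own samples attribute unchanged, which makes the helper safe to call from more than one entry point.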
# analysis_stage1.py
class Analysis:
    samples = {
        "p8_ee_ZH_ecm240": {
            "fraction": 0.5,
            "chunks": 4,
            "stride": 2,  # process every 2nd event
            "n-events-max": 50000,  # cap at 50k events
        }
    }
Both stride and n-events-max are also available as CLI arguments when running over a test file or an independent sample.
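How the two settings interact can be pictured with a small sketch. It assumes the stride is applied first and the cap then limits the number of events actually processed. The slides do not state the order, so this is an assumption, and select_event_indices is a hypothetical helper, not part of the framework.

```python
def select_event_indices(n_total, stride=1, n_events_max=None):
    """Hypothetical helper: indices of the events that would be processed.

    Assumption: every `stride`-th event is taken first, then at most
    `n_events_max` of those are kept.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    if n_events_max is not None and n_events_max < 0:
        raise ValueError("n_events_max must be >= 0")
    indices = range(0, n_total, stride)
    if n_events_max is not None:
        # Slicing a range yields another range, so nothing is materialised.
        indices = indices[:n_events_max]
    return indices


selected = select_event_indices(n_total=20, stride=2, n_events_max=5)
print(list(selected))  # [0, 2, 4, 6, 8]
```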
Input
class Analysis:
    samples = {"p8_ee_ZH": {}}
    campaign = "winter2023"
samples = {"p8_ee_ZH": {"input-dir": "/eos/..."}}
fccanalysis run ana.py -i file1.root file2.root
fccanalysis run ana.py -f files.txt
Output
class Analysis:
    output_dir = "./output/"
    analysis_name = "my_analysis"
fccanalysis run ana.py --output-dir ./out/ \
-a my_analysis
samples = {"p8_ee_ZH": {"output-stem": "ZH"}}
fccanalysis run ana.py -i file.root \
-o result.root
fccanalysis run
General
--output-dir
-a / --analysis-name
--apply-filepath-rewrites
--no-filepath-rewrites
--n-events (alias for --nevents)
Independent sample
-s / --sample-name
--n-chunks
--stride
--test-file
Job class: one unit of local work
process renamed to sample
process_list → samples
prod_tag renamed to campaign
stride and n-events-max
fccanalysis run
validate_sample_list()
[DEPRECATED] messages now follow a consistent style:
[DEPRECATED] Please use "X" instead of "Y"!
ctest outputs now go to ${CMAKE_BINARY_DIR} instead of the source tree.
fccanalysis-run(1) documents all new CLI arguments.
fccanalysis-script(7) documents the stride and n-events-max per-sample keys.
ROOT.Experimental → ROOT.ROOT namespace fix.