# Model Card for mlpf-clic-clusters-v2.1.0

This model reconstructs particles in a detector, based on the tracks and calorimeter clusters recorded by the detector.

## Model Details

Performance is measured with respect to truth-level jets and missing transverse energy (MET), i.e. jets and MET computed from the generator-level Pythia particles.
**Jet performance** (figure panels: ttbar jet resolution, qq jet resolution, ttbar jet resolution)

**MET performance** (figure panels: ttbar MET resolution, qq MET resolution, ttbar MET resolution)
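To make the notion of resolution with respect to truth-level jets concrete, the sketch below computes the response (reconstructed pT divided by generator-level pT) for already-matched jet pairs and summarizes it with the median and the interquartile range divided by the median. The IQR/median estimator, the function name `response_summary`, and the toy numbers are assumptions chosen for illustration; this model card does not state which estimator produced the figures. The same summary could be applied to MET by passing reconstructed and truth-level MET values instead.

```python
import statistics


def response_summary(reco_pt, gen_pt):
    """Return (median, IQR / median) of the response reco_pt / gen_pt.

    Both inputs are equal-length sequences of transverse momenta for
    jet pairs that have already been matched (reco jet i <-> gen jet i).
    The IQR/median estimator is an assumption made for this sketch.
    """
    if len(reco_pt) != len(gen_pt):
        raise ValueError("reco_pt and gen_pt must have the same length")

    # Per-jet response; pairs with a non-positive truth pT are skipped.
    response = [r / g for r, g in zip(reco_pt, gen_pt) if g > 0]
    if len(response) < 2:
        raise ValueError("need at least two matched jets")

    # Quartiles of the response distribution (three cut points for n=4).
    q1, median, q3 = statistics.quantiles(response, n=4, method="inclusive")
    return median, (q3 - q1) / median


if __name__ == "__main__":
    # Toy values only, not results from this model.
    gen = [100.0, 100.0, 100.0, 100.0, 100.0]
    reco = [90.0, 95.0, 100.0, 105.0, 110.0]
    median, resolution = response_summary(reco, gen)
    print(f"median response = {median:.3f}, IQR/median = {resolution:.3f}")
```

A median close to 1 indicates an unbiased energy scale, while a smaller IQR/median indicates a narrower response distribution.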
### Model Description

- **Developed by:** Joosep Pata, Eric Wulff, Farouk Mokhtar, Mengke Zhang, David Southwick, Maria Girone, Javier Duarte, Michael Kagan
- **Model type:** transformer
- **License:** Apache License

### Model Sources

- **Repository:** https://github.com/jpata/particleflow/releases/tag/v2.1.0

## Uses

### Direct Use

This model may be used to study the physics and computational performance of ML-based reconstruction in simulation.

### Out-of-Scope Use

This model is not intended for physics measurements on real data.

## Bias, Risks, and Limitations

The model has been trained only on simulated data and has not been validated against real data.
The model has not been peer reviewed or published in a peer-reviewed journal.

## How to Get Started with the Model

Use the code below to get started with the model.

```
#get the code
git clone https://github.com/jpata/particleflow
cd particleflow
git checkout v2.1.0

#get the models
git clone https://huggingface.co/jpata/particleflow models
```

## Training Details

Trained on 8x MI250X for 26 epochs over ~5 days. The training was resumed from a checkpoint several times because of the job runtime limit.
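Because training was continued from checkpoints several times, a small helper for locating the most recent checkpoint in an experiment directory may be useful. The sketch below assumes checkpoint files are named `checkpoint-<epoch>-<loss>.pth`, a pattern inferred from the file name used in the Evaluation section; the function `latest_checkpoint` is hypothetical and is not part of the particleflow repository.

```python
import re
from pathlib import Path

# Assumed naming scheme: checkpoint-<epoch>-<loss>.pth
CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)-(\d+(?:\.\d+)?)\.pth$")


def latest_checkpoint(checkpoint_dir):
    """Return the Path of the checkpoint with the highest epoch, or None.

    Files that do not match the assumed naming scheme are ignored.
    """
    best_path = None
    best_epoch = -1
    for path in Path(checkpoint_dir).iterdir():
        match = CHECKPOINT_RE.match(path.name)
        if match is None:
            continue
        # Compare epochs numerically so that epoch 10 ranks above epoch 9.
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_path, best_epoch = path, epoch
    return best_path
```

The returned path could then be passed to the pipeline's `--load` option when resubmitting a job; whether `--load` combined with `--train` resumes the optimizer state as well is not stated in this card and should be checked against the repository.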
### Training Data

The following datasets were used:

```
47G    /local/joosep/mlpf/tensorflow_datasets/clic/clic_edm_qq_pf/2.5.0
93G    /local/joosep/mlpf/tensorflow_datasets/clic/clic_edm_ttbar_pf/2.5.0
74G    /local/joosep/mlpf/tensorflow_datasets/clic/clic_edm_ww_fullhad_pf/2.5.0
```

The datasets were generated using Key4HEP with the following scripts:

- https://github.com/HEP-KBFI/key4hep-sim/releases/tag/v1.1.0
- https://github.com/HEP-KBFI/key4hep-sim/blob/v1.1.0/clic/run_sim.sh

## Training Procedure

```bash
#!/bin/bash
#SBATCH --job-name=mlpf-train
#SBATCH --account=project_465000301
#SBATCH --time=3-00:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=200G
#SBATCH --gpus-per-task=8
#SBATCH --partition=small-g
#SBATCH --no-requeue
#SBATCH -o logs/slurm-%x-%j-%N.out

cd /scratch/project_465000301/particleflow

module load LUMI/24.03 partition/G

export IMG=/scratch/project_465000301/pytorch-rocm6.2.simg
export PYTHONPATH=`pwd`
export TFDS_DATA_DIR=/scratch/project_465000301/tensorflow_datasets
#export MIOPEN_DISABLE_CACHE=true
export MIOPEN_USER_DB_PATH=/tmp/${USER}-${SLURM_JOB_ID}-miopen-cache
export MIOPEN_CUSTOM_CACHE_DIR=${MIOPEN_USER_DB_PATH}
export TF_CPP_MAX_VLOG_LEVEL=-1 #to suppress ROCm fusion is enabled messages
export ROCM_PATH=/opt/rocm
#export NCCL_DEBUG=INFO
#export MIOPEN_ENABLE_LOGGING=1
#export MIOPEN_ENABLE_LOGGING_CMD=1
#export MIOPEN_LOG_LEVEL=4
export KERAS_BACKEND=torch

env

#TF training
singularity exec \
    --rocm \
    -B /scratch/project_465000301 \
    -B /tmp \
    --env LD_LIBRARY_PATH=/opt/rocm/lib/ \
    --env CUDA_VISIBLE_DEVICES=$ROCR_VISIBLE_DEVICES \
    $IMG python3 mlpf/pipeline.py --gpus 8 \
    --data-dir $TFDS_DATA_DIR --config parameters/pytorch/pyg-clic.yaml \
    --train --gpu-batch-multiplier 128 --num-workers 8 --prefetch-factor 100 --checkpoint-freq 1 --conv-type attention --dtype bfloat16 --lr 0.0001 --num-epochs 50
```

## Evaluation

```bash
#!/bin/bash
#SBATCH --partition gpu
#SBATCH --gres gpu:mig:1
#SBATCH --mem-per-gpu 200G
#SBATCH -o logs/slurm-%x-%j-%N.out

IMG=/home/software/singularity/pytorch.simg:2024-08-18
cd ~/particleflow

WEIGHTS=experiments/pyg-clic_20241106_104416_929167/checkpoints/checkpoint-20-1.914489.pth
singularity exec -B /scratch/persistent --nv \
    --env PYTHONPATH=`pwd` \
    --env KERAS_BACKEND=torch \
    $IMG python3 mlpf/pipeline.py --gpus 1 \
    --data-dir /scratch/persistent/joosep/tensorflow_datasets --config parameters/pytorch/pyg-clic.yaml \
    --test --make-plots --gpu-batch-multiplier 100 --load $WEIGHTS --dtype bfloat16 --prefetch-factor 10 --num-workers 8 --ntest 50000
```

## Citation

## Glossary

- PF: particle flow reconstruction
- MLPF: machine learning for particle flow
- CLIC: Compact Linear Collider

## Model Card Contact

Joosep Pata, joosep.pata@cern.ch

## Full outputs

```
/local/joosep/mlpf/results/clic/pyg-clic_20241106_104416_929167
```