# Model Card for mlpf-cms-v2.1.0
This model reconstructs particles from the tracks and calorimeter clusters recorded by a detector.
## Model Details
Performance is measured with respect to truth-level jets and missing transverse energy (MET), i.e. generator-level quantities computed from Pythia particles.
Figure: Jet performance
Figure: MET performance
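As an illustration of what comparing against truth-level MET can look like, the following is a minimal sketch. It assumes MET is taken as the magnitude of the vector sum of the transverse momenta of the particles passed in, and that the response is the ratio of the reconstructed to the truth-level value. The function names and toy values are illustrative only and are not taken from the MLPF code base.

```python
import math


def met_from_particles(particles):
    """Magnitude of the vector sum of transverse momenta.

    particles: iterable of (pt, phi) tuples, pt in GeV and phi in radians.
    The negated sum has the same magnitude, so the sign is not tracked here.
    """
    px = sum(pt * math.cos(phi) for pt, phi in particles)
    py = sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(px, py)


def response(reco_value, truth_value):
    """Ratio of a reconstructed quantity to its truth-level counterpart."""
    return reco_value / truth_value


# Toy truth-level particles (hypothetical values, not from any dataset).
gen_particles = [(30.0, 0.0), (40.0, math.pi / 2)]
gen_met = met_from_particles(gen_particles)

# Toy reconstructed MET for the same event (hypothetical value).
reco_met = 45.0
print(f"gen MET = {gen_met:.1f} GeV, response = {response(reco_met, gen_met):.2f}")
```

The same `response` helper applies to jets by passing the transverse momentum of a reconstructed jet and that of its matched generator-level jet; the matching itself is not shown.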
### Model Description
- **Developed by:** CMS MLPF Team
- **Model type:** transformer
- **License:** Apache License
### Model Sources
- **Repository:** https://github.com/jpata/particleflow/releases/tag/v2.1.0
## Uses
### Direct Use
This model may be used to study the physics and computational performance of ML-based reconstruction in simulation within the CMS collaboration.
### Out-of-Scope Use
This model is not intended for physics measurements on real data or for use outside the CMS collaboration.
## Bias, Risks, and Limitations
The model has only been trained on simulation data and has not been validated against real data.
The model has not been peer reviewed or published in a peer-reviewed journal.
## How to Get Started with the Model
Use the code below to get started with the model.
```bash
#get the code
git clone https://github.com/jpata/particleflow
cd particleflow
git checkout v2.1.0
#get the models
git clone https://huggingface.co/jpata/particleflow models
```
## Training Details
The model was trained on 8x AMD MI250X for 18 epochs over ~26 days.
Training was resumed from a checkpoint multiple times because of the 24h time limit.
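Resuming after a time limit requires picking the most recent checkpoint for the next job. The sketch below shows one way to do this, assuming that training checkpoints follow the `checkpoint-<epoch>-<loss>.pth` naming seen in the evaluation weights file name, and that the `--load` option also works for continuing training (this card only shows it together with `--test`). The helper names `latest_checkpoint` and `resume_args` are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming pattern: checkpoint-<epoch>-<loss>.pth
CHECKPOINT_RE = re.compile(r"checkpoint-(\d+)-([0-9.]+)\.pth$")


def latest_checkpoint(checkpoint_dir):
    """Return the checkpoint path with the highest epoch number, or None."""
    best = None
    for path in Path(checkpoint_dir).glob("checkpoint-*.pth"):
        match = CHECKPOINT_RE.search(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        epoch = int(match.group(1))
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return None if best is None else best[1]


def resume_args(checkpoint_dir):
    """Extra command-line arguments for a resumed job (empty on first run)."""
    ckpt = latest_checkpoint(checkpoint_dir)
    return [] if ckpt is None else ["--load", str(ckpt)]


# Demonstration on a temporary directory with made-up checkpoint files.
with tempfile.TemporaryDirectory() as tmp:
    first_run_args = resume_args(tmp)  # no checkpoints yet
    for name in ("checkpoint-07-3.100000.pth",
                 "checkpoint-08-2.986092.pth",
                 "checkpoint-10-2.900000.pth"):
        (Path(tmp) / name).touch()
    picked = latest_checkpoint(tmp).name
    later_run_args = resume_args(tmp)
```

Selecting by the parsed epoch number rather than by file modification time keeps the choice stable if checkpoints are copied between file systems.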
### Training Data
The following datasets were used:
```
179G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_qcd/2.5.0
84G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_qcd_nopu/2.5.0
179G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ttbar/2.5.0
86G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ttbar_nopu/2.5.0
173G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ztt/2.5.0
57G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ztt_nopu/2.5.0
```
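To get a sense of the overall storage footprint, the listing above can be summed. This is a small sketch that parses the listing as given; the sizes are rounded to whole gigabytes, so the totals are approximate, and the grouping by the `_nopu` suffix is purely by name.

```python
LISTING = """\
179G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_qcd/2.5.0
84G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_qcd_nopu/2.5.0
179G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ttbar/2.5.0
86G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ttbar_nopu/2.5.0
173G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ztt/2.5.0
57G /local/joosep/mlpf/tensorflow_datasets/cms/cms_pf_ztt_nopu/2.5.0
"""


def parse_sizes(listing):
    """Map dataset name to size in gigabytes, as printed in the listing."""
    sizes = {}
    for line in listing.splitlines():
        size, path = line.split()
        name = path.split("/")[-2]  # the dataset name precedes the version
        sizes[name] = int(size.rstrip("G"))
    return sizes


sizes = parse_sizes(LISTING)
total_gb = sum(sizes.values())
nopu_gb = sum(v for k, v in sizes.items() if k.endswith("_nopu"))
other_gb = total_gb - nopu_gb
print(f"total: {total_gb}G, _nopu: {nopu_gb}G, other: {other_gb}G")
```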
### Training Procedure
```bash
#!/bin/bash
#SBATCH --job-name=mlpf-train
#SBATCH --account=project_465000301
#SBATCH --time=3-00:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=400G
#SBATCH --gpus-per-task=8
#SBATCH --partition=small-g
#SBATCH --no-requeue
#SBATCH -o logs/slurm-%x-%j-%N.out
cd /scratch/project_465000301/particleflow
module load LUMI/24.03 partition/G
export IMG=/scratch/project_465000301/pytorch-rocm6.2.simg
export PYTHONPATH=$(pwd)
export TFDS_DATA_DIR=/scratch/project_465000301/tensorflow_datasets
#export MIOPEN_DISABLE_CACHE=true
export MIOPEN_USER_DB_PATH=/tmp/${USER}-${SLURM_JOB_ID}-miopen-cache
export MIOPEN_CUSTOM_CACHE_DIR=${MIOPEN_USER_DB_PATH}
export TF_CPP_MAX_VLOG_LEVEL=-1 #to suppress ROCm fusion is enabled messages
export ROCM_PATH=/opt/rocm
#export NCCL_DEBUG=INFO
#export MIOPEN_ENABLE_LOGGING=1
#export MIOPEN_ENABLE_LOGGING_CMD=1
#export MIOPEN_LOG_LEVEL=4
export KERAS_BACKEND=torch
env
#PyTorch training
singularity exec \
--rocm \
-B /scratch/project_465000301 \
-B /tmp \
--env LD_LIBRARY_PATH=/opt/rocm/lib/ \
--env CUDA_VISIBLE_DEVICES=$ROCR_VISIBLE_DEVICES \
$IMG python3 mlpf/pipeline.py --gpus 8 \
--data-dir $TFDS_DATA_DIR --config parameters/pytorch/pyg-cms.yaml \
--train --gpu-batch-multiplier 5 --num-workers 8 --prefetch-factor 50 --checkpoint-freq 1 --conv-type attention --dtype bfloat16 --lr 0.0001
```
## Evaluation
```bash
#!/bin/bash
#SBATCH --partition gpu
#SBATCH --gres gpu:mig:1
#SBATCH --mem-per-gpu 100G
#SBATCH -o logs/slurm-%x-%j-%N.out
IMG=/home/software/singularity/pytorch.simg:2024-08-18
cd ~/particleflow
WEIGHTS=experiments/pyg-cms_20241101_090645_682892/checkpoints/checkpoint-08-2.986092.pth
DATASET=$1
env
singularity exec -B /scratch/persistent --nv \
--env PYTHONPATH=$(pwd) \
--env KERAS_BACKEND=torch \
$IMG python mlpf/pipeline.py --gpus 1 \
--data-dir /scratch/persistent/joosep/tensorflow_datasets --config parameters/pytorch/pyg-cms-nopu.yaml \
--test --make-plots --gpu-batch-multiplier 2 --load $WEIGHTS --ntest 50000 --dtype bfloat16 --num-workers 8 --prefetch-factor 10 --test-datasets $DATASET
```
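Because the evaluation script takes the dataset name as its first argument (`DATASET=$1`), one job can be submitted per dataset. The sketch below builds the corresponding `sbatch` commands and prints them as a dry run. The script file name `eval.sh` is hypothetical, and the dataset names are taken from the training data listing; which datasets were actually evaluated is not stated in this card.

```python
import shlex


def build_submit_commands(script, datasets):
    """One sbatch command per dataset; the dataset name becomes $1 in the script."""
    return [["sbatch", script, dataset] for dataset in datasets]


# Assumed dataset names, matching the pyg-cms-nopu.yaml config by name only.
datasets = ["cms_pf_qcd_nopu", "cms_pf_ttbar_nopu", "cms_pf_ztt_nopu"]
commands = build_submit_commands("eval.sh", datasets)

for cmd in commands:
    # Dry run: print the command instead of submitting it.
    print(shlex.join(cmd))
```

To submit for real, each command could be passed to `subprocess.run(cmd, check=True)` on a machine where Slurm is available.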
## Citation
## Glossary
- PF: particle flow reconstruction
- MLPF: machine learning for particle flow
- CMS: Compact Muon Solenoid
## Model Card Contact
Joosep Pata, joosep.pata@cern.ch