# Model Card for cms-2024-04-05

This model reconstructs particles from the tracks and calorimeter clusters recorded by the CMS detector.

## Model Details

### Model Description

- **Developed by:** Joosep Pata, Farouk Mokhtar, Eric Wulff, Shivam Raj, Javier Duarte
- **Model type:** full attention
- **License:** Apache License

### Model Sources

- **Repository:** https://github.com/jpata/particleflow/releases/tag/v1.7.0
- **Slides:** https://indico.cern.ch/event/1399688/#1-ml-for-pf

## Uses

### Direct Use

This model may be used to study the physics and computational performance of ML-based reconstruction in CMS simulation.
It should **only** be used **within** the CMS collaboration.

### Out-of-Scope Use

This model is not intended for physics measurements on real data. 
It should **not** be used **outside** of the CMS collaboration.

## Bias, Risks, and Limitations

The model has only been trained on simulation data and has not been validated against real data.

## How to Get Started with the Model

```
# Clone the repository at the v1.7.0 release tag
git clone --branch v1.7.0 https://github.com/jpata/particleflow.git
cd particleflow

# Download the Singularity container image used to run the model
wget https://hep.kbfi.ee/~joosep/pytorch.simg

mkdir -p experiments/pyg-cms_20240324_235743_208080/checkpoints/

wget https://huggingface.co/jpata/particleflow/resolve/main/cms/2024_04_05/pyg-cms_20240324_235743_208080/checkpoint-32-17.877384.pth
mv checkpoint-32-17.877384.pth experiments/pyg-cms_20240324_235743_208080/checkpoints/

wget https://huggingface.co/jpata/particleflow/raw/main/cms/2024_04_05/pyg-cms_20240324_235743_208080/train-config.yaml
mv train-config.yaml experiments/pyg-cms_20240324_235743_208080/

# Run inference on the held-out dataset
singularity exec --nv pytorch.simg python3 mlpf/pyg_pipeline.py \
  --config parameters/pytorch/pyg-cms.yaml \
  --gpus 1 \
  --experiments-dir experiments/ \
  --dataset cms \
  --conv-type attention \
  --gpu-batch-multiplier 10 \
  --dtype bfloat16 \
  --load experiments/pyg-cms_20240324_235743_208080/checkpoints/checkpoint-32-17.877384.pth \
  --test
```
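
Before launching inference, it can help to confirm that the downloaded files ended up where the command above expects them. The following is a minimal sketch, not part of the repository: the function name `check_experiment_layout` is hypothetical, and it only checks that the config and checkpoint files exist and are non-empty. It does not validate their contents.

```shell
# Hypothetical helper (not provided by particleflow): verify that an
# experiment directory contains a non-empty config and checkpoint.
#   $1: experiment directory
#   $2: checkpoint file name
check_experiment_layout() {
  rc=0
  for f in "$1/train-config.yaml" "$1/checkpoints/$2"; do
    if [ -s "$f" ]; then
      echo "ok: $f"
    else
      echo "missing or empty: $f"
      rc=1
    fi
  done
  return "$rc"
}

# Check the layout created by the download steps, run from the repository root.
check_experiment_layout \
  experiments/pyg-cms_20240324_235743_208080 \
  checkpoint-32-17.877384.pth \
  || echo "Re-run the download steps before starting inference."
```

A non-zero return value means at least one file is missing or empty, for example after an interrupted `wget`.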

## Training Details

The model was trained for 32 epochs on a single A100 80GB GPU, which took approximately 6 days.
Training used the bfloat16 data type.

### Training Data

The model was trained on 400k events from the `cms_pf_ttbar` dataset, version `1.7.1`.
The dataset is available at `/eos/user/j/jpata/mlpf/tensorflow_datasets/cms/cms_pf_ttbar/1.7.1`.

Dataset definition: https://github.com/jpata/particleflow/blob/v1.7.0/mlpf/heptfds/cms_pf/ttbar.py
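
Since the dataset lives on EOS, it may not be visible from every machine. The sketch below checks whether the dataset directory is readable. It assumes the directory follows the usual TensorFlow Datasets layout, in which each version directory contains a `dataset_info.json` file. The function name `dataset_ready` is hypothetical.

```shell
# Hypothetical helper: report whether a dataset version directory is visible.
# Assumption: the directory contains a dataset_info.json file, as is usual
# for datasets built with TensorFlow Datasets.
#   $1: path to the dataset version directory
dataset_ready() {
  if [ -r "$1/dataset_info.json" ]; then
    echo "found dataset metadata in $1"
  else
    echo "no readable dataset_info.json in $1"
    return 1
  fi
}

dataset_ready /eos/user/j/jpata/mlpf/tensorflow_datasets/cms/cms_pf_ttbar/1.7.1 \
  || echo "The dataset is not visible from this machine (EOS may not be mounted)."
```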

## Model Card Contact

Joosep Pata, joosep.pata@cern.ch