# Model Card for cms-2024-04-05

This model reconstructs particles in a detector from the tracks and calorimeter clusters recorded by the detector.

## Model Details

### Model Description

- **Developed by:** Joosep Pata, Farouk Mokhtar, Eric Wulff, Shivam Raj, Javier Duarte
- **Model type:** full attention
- **License:** Apache License

### Model Sources

- **Repository:** https://github.com/jpata/particleflow/releases/tag/v1.7.0
- **Slides:** https://indico.cern.ch/event/1399688/#1-ml-for-pf

## Uses

### Direct Use

This model may be used to study the physics and computational performance of ML-based reconstruction in CMS simulation. It should **only** be used **within** the CMS collaboration.

### Out-of-Scope Use

This model is not intended for physics measurements on real data. It should **not** be used **outside** of the CMS collaboration.

## Bias, Risks, and Limitations

The model has only been trained on simulation data and has not been validated against real data.

## How to Get Started with the Model

```
# Clone the repository at the v1.7.0 release tag
git clone --branch v1.7.0 https://github.com/jpata/particleflow.git
cd particleflow

# Download the container image
wget https://hep.kbfi.ee/~joosep/pytorch.simg

# Download the checkpoint and training config into the experiment directory
mkdir -p experiments/pyg-cms_20240324_235743_208080/checkpoints/
wget https://huggingface.co/jpata/particleflow/resolve/main/cms/2024_04_05/pyg-cms_20240324_235743_208080/checkpoint-32-17.877384.pth
mv checkpoint-32-17.877384.pth experiments/pyg-cms_20240324_235743_208080/checkpoints/
wget https://huggingface.co/jpata/particleflow/raw/main/cms/2024_04_05/pyg-cms_20240324_235743_208080/train-config.yaml
mv train-config.yaml experiments/pyg-cms_20240324_235743_208080/

# Run the inference on the held-out dataset
singularity exec --nv pytorch.simg python3 mlpf/pyg_pipeline.py --config parameters/pytorch/pyg-cms.yaml --gpus 1 --experiments-dir experiments/ --dataset cms --conv-type attention --gpu-batch-multiplier 10 --dtype bfloat16 --load experiments/pyg-cms_20240324_235743_208080/checkpoints/checkpoint-32-17.877384.pth --test
```

## Training Details

Trained for 32 epochs on 1x A100 80GB for approximately 6 days. Training used the bfloat16 data type.

### Training Data

Trained on 400k events from `cms_pf_ttbar`, version `v1.7.1`. The dataset is available at `/eos/user/j/jpata/mlpf/tensorflow_datasets/cms/cms_pf_ttbar/1.7.1`.

https://github.com/jpata/particleflow/blob/v1.7.0/mlpf/heptfds/cms_pf/ttbar.py

## Model Card Contact

Joosep Pata, joosep.pata@cern.ch