# Model Card for mlpf-cms-2024_05_16_attn_model21M

This model reconstructs particles in a detector, based on the tracks and calorimeter clusters recorded by the detector.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Joosep Pata, Eric Wulff, Farouk Mokhtar, Mengke Zhang, David Southwick, Maria Girone, Javier Duarte
- **Model type:** transformer with 2x6 layers, 32 heads, head dim 16
- **License:** Apache License
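
As a rough illustration of the architecture numbers above: in a standard multi-head attention layer, the embedding width equals the number of heads times the per-head dimension. The sketch below only does this arithmetic. Reading "2x6 layers" as two stacks of six layers is an assumption, and the variable names are illustrative rather than taken from the particleflow code.

```python
# Illustrative arithmetic only; names are hypothetical, not from the particleflow code.
num_heads = 32
head_dim = 16

# Assumption: "2x6 layers" means two stacks of six attention layers each.
num_stacks = 2
layers_per_stack = 6

# In standard multi-head attention, the embedding width is heads * head_dim.
embedding_width = num_heads * head_dim
total_attention_layers = num_stacks * layers_per_stack

print(f"embedding width: {embedding_width}")
print(f"attention layers in total: {total_attention_layers}")
```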

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/jpata/particleflow/releases/tag/v1.8.0

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

This model may be used within the CMS collaboration to study the physics and computational performance of ML-based reconstruction in simulation.

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

This model is not intended for physics measurements on real data or for use outside the CMS collaboration.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

The model has only been trained on simulation data and has not been validated against real data.
It's only meant for internal CMS use.

## Training Details

Training took approximately 2 weeks on a single A100 80GB GPU. The Comet experiment pages for the training runs are:
```
https://www.comet.com/jpata/particleflow-pt/9ceb52e8f9f54d7eb4ef06c9ff85bef2?compareXAxis=step&experiment-tab=panels&showOutliers=true&smoothing=0&xAxis=epoch
https://www.comet.com/jpata/particleflow-pt/35bb92e72a3846ff98fb563b0769be13?compareXAxis=step&experiment-tab=panels&showOutliers=true&smoothing=0&xAxis=epoch0 
```

### Training Data

CMS ttbar, QCD, and Ztautau samples with pileup (dataset version v1.7.1), 400k events each. Sizes on disk:
```
134G	/eos/user/j/jpata/mlpf/tensorflow_datasets/cms/cms_pf_qcd/1.7.1
135G	/eos/user/j/jpata/mlpf/tensorflow_datasets/cms/cms_pf_ttbar/1.7.1
130G	/eos/user/j/jpata/mlpf/tensorflow_datasets/cms/cms_pf_ztt/1.7.1
```
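
For a sense of scale, the listing above can be combined with the stated event counts to estimate the total dataset size and the average storage per event. This is a back-of-the-envelope sketch: it assumes the `G` suffix means GiB (as printed by `du -h`) and that each of the three samples contains exactly 400k events.

```python
# Sizes copied from the listing above; "G" is assumed to mean GiB.
dataset_sizes_gib = {
    "cms_pf_qcd": 134,
    "cms_pf_ttbar": 135,
    "cms_pf_ztt": 130,
}
# Assumption: exactly 400k events in each sample.
events_per_sample = 400_000

total_gib = sum(dataset_sizes_gib.values())
total_events = events_per_sample * len(dataset_sizes_gib)

# Convert GiB to KiB (1 GiB = 1024**2 KiB) and average over all events.
kib_per_event = total_gib * 1024**2 / total_events

print(f"total size: {total_gib} GiB for {total_events} events")
print(f"average size per event: {kib_per_event:.0f} KiB")
```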

### Training Procedure

```
#!/bin/bash
#SBATCH --partition gpu
#SBATCH --gres gpu:a100:1
#SBATCH --mem-per-gpu 80G

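# Singularity image providing the PyTorch environment
IMG=/home/software/singularity/pytorch.simg:2024-04-30
cd ~/particleflow
# Check out the exact code version used for this training
git checkout 8bd199fb064bb40558466d906d46498218848e5c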

singularity exec --nv \
    --env PYTHONPATH=hep_tfds \
    --env KERAS_BACKEND=torch \
    $IMG python3.10 mlpf/pyg_pipeline.py --dataset cms --gpus 1 \
    --data-dir /path/to/tensorflow_datasets --config parameters/pytorch/pyg-cms.yaml \
    --train --conv-type attention --num-epochs 100 --gpu-batch-multiplier 40 --num-workers 4 --prefetch-factor 50 --checkpoint-freq 1 --comet
```

## Evaluation

```
# Same Singularity image as in the training script
IMG=/home/software/singularity/pytorch.simg:2024-04-30
WEIGHTS=pyg-cms_20240430_094836_751206/checkpoints/checkpoint-25-17.631161.pth
singularity exec -B /scratch/persistent --nv \
     --env PYTHONPATH=hep_tfds \
     --env KERAS_BACKEND=torch \
     $IMG python3.10 mlpf/pyg_pipeline.py --dataset cms --gpus 1 \
     --data-dir /path/to/joosep/tensorflow_datasets --config parameters/pytorch/pyg-cms.yaml \
     --test --make-plots --conv-type attention --gpu-batch-multiplier 10 --num-workers 8 --prefetch-factor 10 --load $WEIGHTS --test-datasets cms_pf_ttbar --ntest 50000
```
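
The checkpoint file name used above appears to follow the pattern `checkpoint-<epoch>-<loss>.pth`. That reading is an assumption and is not confirmed by this card. Under that assumption, a small helper like the following could select the checkpoint with the lowest recorded loss from a list of paths. The function names are hypothetical and not part of the particleflow code.

```python
import re
from pathlib import Path

# Assumed naming pattern: checkpoint-<epoch>-<loss>.pth
CHECKPOINT_RE = re.compile(r"checkpoint-(\d+)-(\d+\.\d+)\.pth$")


def parse_checkpoint(path):
    """Return (epoch, loss) parsed from a checkpoint file name, or None if it does not match."""
    match = CHECKPOINT_RE.search(Path(path).name)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def best_checkpoint(paths):
    """Return the path with the lowest parsed loss, ignoring names that do not match."""
    candidates = []
    for path in paths:
        parsed = parse_checkpoint(path)
        if parsed is not None:
            candidates.append((parsed[1], path))
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[0])[1]


weights = "pyg-cms_20240430_094836_751206/checkpoints/checkpoint-25-17.631161.pth"
print(parse_checkpoint(weights))
```

The selected path could then be passed to the `--load` option shown in the evaluation command.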

## Model Card Contact

Joosep Pata, [email protected]