# Model Card for mlpf-clic-clusters-v1.6

This model reconstructs particles in a detector, based on the tracks and calorimeter clusters recorded by the detector.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Joosep Pata, Eric Wulff, Farouk Mokhtar, Mengke Zhang, David Southwick, Maria Girone, Javier Duarte
- **Model type:** graph neural network with learnable structure in locality-sensitive hashing bins
- **License:** Apache License
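
The model type above mentions locality-sensitive hashing (LSH) bins. As a rough illustration of the general idea only, and not the model's actual implementation, the sketch below hashes each input element with random projections and then cuts the hash-sorted sequence into fixed-size bins, so that elements pointing in similar directions tend to land in the same bin. The function name `lsh_bins`, its parameters, and the Gaussian projections are assumptions made for this example.

```python
import random


def lsh_bins(points, bin_size, num_projections=4, seed=0):
    """Group point indices into bins of at most `bin_size` elements.

    Illustrative sketch only: names and parameters are hypothetical.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Random projection directions (assumption: Gaussian entries).
    directions = [
        [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(num_projections)
    ]

    def hash_code(point):
        # Project onto each direction, then take the argmax over
        # [proj, -proj]; this gives one of 2 * num_projections buckets.
        proj = [sum(a * b for a, b in zip(point, d)) for d in directions]
        scores = proj + [-s for s in proj]
        return max(range(len(scores)), key=scores.__getitem__)

    # Sort indices by hash code so equal codes end up adjacent.
    order = sorted(range(len(points)), key=lambda i: hash_code(points[i]))
    # Cut the sorted sequence into consecutive bins; the last bin may be
    # shorter when len(points) is not a multiple of bin_size.
    return [order[i:i + bin_size] for i in range(0, len(order), bin_size)]


# Usage with made-up 2D feature vectors.
example_points = [(1.0, 0.5), (2.0, 1.0), (-1.0, -0.5), (-2.0, -1.0)]
example_bins = lsh_bins(example_points, bin_size=2)
```

Restricting the learnable graph structure to elements within a bin is what keeps the cost from growing quadratically with the number of inputs; the details of how the model does this are described in the paper linked below.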

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/jpata/particleflow/releases/tag/v1.6
- **Paper:** https://doi.org/10.48550/arXiv.2309.06782

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

This model may be used to study the physics and computational performance of ML-based reconstruction in simulation.

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

This model is not intended for physics measurements on real data. 

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

The model has only been trained on simulation data and has not been validated against real data.

## How to Get Started with the Model

Use the code below to get started with the model.

```
# Clone the repository at the v1.6 release tag
git clone --branch v1.6 https://github.com/jpata/particleflow.git
cd particleflow

# Download the software image
wget https://hep.kbfi.ee/~joosep/tf-2.14.0.simg

# Download the checkpoint (model weights and optimizer state)
wget https://huggingface.co/jpata/particleflow/resolve/clic_clusters_v1.6/weights-96-5.346523.hdf5
wget https://huggingface.co/jpata/particleflow/resolve/clic_clusters_v1.6/opt-96-5.346523.pkl

# Launch a shell in the software image
apptainer shell --nv tf-2.14.0.simg

# Continue the training from a checkpoint
python3 mlpf/pipeline.py train --config parameters/clic.yaml --weights weights-96-5.346523.hdf5 --batch-multiplier 0.5

# Run the evaluation for a given training directory, loading the best weight file in the directory
python3 mlpf/pipeline.py evaluate --train-dir experiments/clic-REPLACEME
```
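
The evaluation step above loads the best weight file from the training directory. How the pipeline chooses that file is not shown here. The sketch below is one plausible way to do it by hand, assuming checkpoint names follow the pattern `weights-<epoch>-<loss>.hdf5` (inferred from `weights-96-5.346523.hdf5`) and that a lower loss is better. The helper `best_weights` and the epoch-95 filename are hypothetical.

```python
import re

# Assumed pattern: weights-<epoch>-<loss>.hdf5 (inferred from the
# checkpoint name in this card, not taken from the pipeline code).
WEIGHTS_PATTERN = re.compile(r"weights-(\d+)-(\d+\.\d+)\.hdf5$")


def best_weights(filenames):
    """Return the weight file with the lowest loss, or None if none match."""
    candidates = []
    for name in filenames:
        match = WEIGHTS_PATTERN.search(str(name))
        if match:
            epoch = int(match.group(1))
            loss = float(match.group(2))
            # Sort key: lowest loss first; ties go to the later epoch.
            candidates.append((loss, -epoch, str(name)))
    return min(candidates)[2] if candidates else None


# Usage: the epoch-95 file is made up for illustration.
names = [
    "weights-95-5.351200.hdf5",
    "weights-96-5.346523.hdf5",
    "opt-96-5.346523.pkl",
]
print(best_weights(names))  # prints weights-96-5.346523.hdf5
```

The selected filename can then be passed to the `--weights` option shown above to resume training from that checkpoint.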

## Training Details

### Training Data

Trained on the following dataset:
Pata, J., Wulff, E., Duarte, J., Mokhtar, F., Zhang, M., Girone, M., & Southwick, D. (2023). Simulated datasets for detector and particle flow reconstruction: CLIC detector, machine learning format (v1.5.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8409592

### Training Procedure 

```
python3 mlpf/pipeline.py train --config parameters/clic.yaml
```

## Evaluation

```
python3 mlpf/pipeline.py evaluate --train-dir experiments/clic-REPLACEME
```

## Citation

**BibTeX:**

```
@misc{pata2023scalable,
      title={Scalable neural network models and terascale datasets for particle-flow reconstruction}, 
      author={Joosep Pata and Eric Wulff and Farouk Mokhtar and David Southwick and Mengke Zhang and Maria Girone and Javier Duarte},
      year={2023},
      eprint={2309.06782},
      archivePrefix={arXiv},
      primaryClass={physics.data-an}
}
```

## Glossary

PF - particle flow reconstruction

## Model Card Contact

Joosep Pata, [email protected]