soury committed on
Commit 2e85345 · verified · 1 Parent(s): 58f078c

push_data_pr (#2)

- push JSON file to the dataset using a PR (43a2b78f1895422e99a5687323d47b142387c92f)
- create JSON file with a proper name (8bd8fa8db0eb8e86ee7889f3f0e49e9951629202)
- create JSON file with a proper name (e794a4b709696e06fe07770145a211e4ff5ba05a)
- fix problem with locally generated files (beee3d386da0701718608a71ef31563c6f50aed8)
- better handling of dynamic sections to keep already-filled fields in memory (0662adb28de0eddcf7afcd7ece6722e21d58ea07)
- handle problems with dynamic sections and implement BoAmps format validator (e2adf94cd8d9918a6463e2187ebb7e34bcc7248d)
- code refactoring and cleanup (488a9f6127c0ee4a5ba79aaeb5264812fabffd80)
README.md CHANGED
@@ -11,20 +11,103 @@ license: apache-2.0
  short_description: Create a report in BoAmps format
  ---

+ # BoAmps Report Creation Tool 🌿
+
  This tool is part of the initiative [BoAmps](https://github.com/Boavizta/BoAmps).
  The purpose of the BoAmps project is to build a large, open, database of energy consumption of IT / AI tasks depending on data nature, algorithms, hardware, etc., in order to improve energy efficiency approaches based on empiric knowledge.

  This space was initiated by a group of students from Sud Telecom Paris, many thanks to [Hicham FILALI](https://huggingface.co/FILALIHicham) for his work.

- ### Development
-
- Install prerequisites :
- Python >= 3.12
- Pip & Pipenv
-
- Clone and open the project
-
- Create a virtual environment: >python -m venv .venv
- Activate it: >.\.venv\Scripts\activate
- Install dependencies: >pipenv install -d
- Launch the application: pipenv run python main.py
+ ## 🚀 Quick Start
+
+ ### Prerequisites
+ - **Python** >= 3.12
+
+ ### Installation Steps
+
+ 1. **Clone the repository**
+
+ 2. **Create and activate a virtual environment (not mandatory)**
+    ```bash
+    # Windows
+    python -m venv .venv
+    .\.venv\Scripts\activate
+
+    # Linux/MacOS
+    python -m venv .venv
+    source .venv/bin/activate
+    ```
+
+ 3. **Install dependencies**
+    ```bash
+    pip install pipenv
+    pipenv install --dev
+    ```
+
+ 4. **Launch the application**
+    ```bash
+    python ./app.py
+    ```
+
+ 5. **Access the application**
+    - Open your browser and go to `http://localhost:7860`
+    - The Gradio interface will be available for creating BoAmps reports
+
+ ## 🏗️ Architecture Overview
+
+ ### Core Components
+
+ 1. **`app.py`** - Main application file
+    - Initializes the Gradio interface
+    - Orchestrates all UI components
+    - Handles application routing and main logic
+
+ 2. **Services Layer (`src/services/`)**
+    - **`json_generator.py`**: Generates BoAmps-compliant JSON reports
+    - **`report_builder.py`**: Constructs structured report data
+    - **`form_parser.py`**: Processes and validates form inputs
+    - **`dataset_upload.py`**: Manages Hugging Face dataset integration
+    - **`util.py`**: Common utility functions
+
+ 3. **UI Layer (`src/ui/`)**
+    - **`form_components.py`**: Gradio interface components for the different report sections
+
+ 4. **Assets & Validation (`assets/`)**
+    - **`validation.py`**: BoAmps schema validation logic
+    - **`app.css`**: Application styling
+
+ ### Data Flow
+
+ ```
+ User Input (Gradio Form)
+   ↓
+ Form Parser & Validation
+   ↓
+ JSON Generator
+   ↓
+ Report Builder
+   ↓
+ BoAmps Schema Validation
+   ↓
+ JSON Report Output
+ ```
+
+ ## 🤝 Contributing
+
+ Contributions are welcome! Please:
+
+ 1. Fork the repository
+ 2. Create a feature branch
+ 3. Make your changes
+ 4. Submit a pull request
+
+ ## 📄 License
+
+ This project is licensed under the Apache 2.0 License - see the license information in the repository header.
+
+ ## 🙏 Acknowledgments
+
+ This space was initiated by a group of students from Sud Telecom Paris, many thanks to [Hicham FILALI](https://huggingface.co/FILALIHicham) for his work.
+
+ For more information about the BoAmps initiative, visit the [official repository](https://github.com/Boavizta/BoAmps).
app.py CHANGED
@@ -1,7 +1,8 @@
  import gradio as gr
  from os import path
- from src.services.huggingface import init_huggingface, update_dataset
+ from src.services.dataset_upload import init_huggingface, update_dataset
  from src.services.json_generator import generate_json
+ from src.services.form_parser import form_parser
  from src.ui.form_components import (
      create_header_tab,
      create_task_tab,
@@ -19,22 +20,148 @@ init_huggingface()


  def handle_submit(*inputs):
-     message, file_output, json_output = generate_json(*inputs)
+     """Handle form submission with optimized parsing."""
+     try:
+         # Parse inputs using the structured parser
+         parsed_data = form_parser.parse_inputs(inputs)
+
+         # Extract data for the generate_json function
+         header_params = list(parsed_data["header"].values())
+
+         # Task data
+         task_simple = parsed_data["task_simple"]
+         taskFamily, taskStage, nbRequest = task_simple["taskFamily"], task_simple["taskStage"], task_simple["nbRequest"]
+
+         # Dynamic sections - algorithm data
+         algorithms = parsed_data["algorithms"]
+         trainingType = algorithms["trainingType"]
+         algorithmType = algorithms["algorithmType"]
+         algorithmName = algorithms["algorithmName"]
+         algorithmUri = algorithms["algorithmUri"]
+         foundationModelName = algorithms["foundationModelName"]
+         foundationModelUri = algorithms["foundationModelUri"]
+         parametersNumber = algorithms["parametersNumber"]
+         framework = algorithms["framework"]
+         frameworkVersion = algorithms["frameworkVersion"]
+         classPath = algorithms["classPath"]
+         layersNumber = algorithms["layersNumber"]
+         epochsNumber = algorithms["epochsNumber"]
+         optimizer = algorithms["optimizer"]
+         quantization = algorithms["quantization"]
+
+         # Dynamic sections - dataset data
+         dataset = parsed_data["dataset"]
+         dataUsage = dataset["dataUsage"]
+         dataType = dataset["dataType"]
+         dataFormat = dataset["dataFormat"]
+         dataSize = dataset["dataSize"]
+         dataQuantity = dataset["dataQuantity"]
+         shape = dataset["shape"]
+         source = dataset["source"]
+         sourceUri = dataset["sourceUri"]
+         owner = dataset["owner"]
+
+         # Task final data
+         task_final = parsed_data["task_final"]
+         measuredAccuracy, estimatedAccuracy, taskDescription = task_final["measuredAccuracy"], task_final["estimatedAccuracy"], task_final["taskDescription"]
+
+         # Measures data
+         measures = parsed_data["measures"]
+         measurementMethod = measures["measurementMethod"]
+         manufacturer = measures["manufacturer"]
+         version = measures["version"]
+         cpuTrackingMode = measures["cpuTrackingMode"]
+         gpuTrackingMode = measures["gpuTrackingMode"]
+         averageUtilizationCpu = measures["averageUtilizationCpu"]
+         averageUtilizationGpu = measures["averageUtilizationGpu"]
+         powerCalibrationMeasurement = measures["powerCalibrationMeasurement"]
+         durationCalibrationMeasurement = measures["durationCalibrationMeasurement"]
+         powerConsumption = measures["powerConsumption"]
+         measurementDuration = measures["measurementDuration"]
+         measurementDateTime = measures["measurementDateTime"]
+
+         # System data
+         system = parsed_data["system"]
+         osystem, distribution, distributionVersion = system["osystem"], system["distribution"], system["distributionVersion"]
+
+         # Software data
+         software = parsed_data["software"]
+         language, version_software = software["language"], software["version_software"]
+
+         # Infrastructure data
+         infra_simple = parsed_data["infrastructure_simple"]
+         infraType, cloudProvider, cloudInstance, cloudService = infra_simple["infraType"], infra_simple["cloudProvider"], infra_simple["cloudInstance"], infra_simple["cloudService"]
+
+         # Infrastructure components
+         infra_components = parsed_data["infrastructure_components"]
+         componentName = infra_components["componentName"]
+         componentType = infra_components["componentType"]
+         nbComponent = infra_components["nbComponent"]
+         memorySize = infra_components["memorySize"]
+         manufacturer_infra = infra_components["manufacturer_infra"]
+         family = infra_components["family"]
+         series = infra_components["series"]
+         share = infra_components["share"]
+
+         # Environment data
+         environment = parsed_data["environment"]
+         country, latitude, longitude, location, powerSupplierType, powerSource, powerSourceCarbonIntensity = environment["country"], environment["latitude"], environment["longitude"], environment["location"], environment["powerSupplierType"], environment["powerSource"], environment["powerSourceCarbonIntensity"]
+
+         # Quality data
+         quality = parsed_data["quality"]["quality"]
+
+         # Call generate_json with structured parameters
+         message, file_path, json_output = generate_json(
+             *header_params,
+             taskFamily, taskStage, nbRequest,
+             trainingType, algorithmType, algorithmName, algorithmUri, foundationModelName, foundationModelUri, parametersNumber, framework, frameworkVersion, classPath, layersNumber, epochsNumber, optimizer, quantization,
+             dataUsage, dataType, dataFormat, dataSize, dataQuantity, shape, source, sourceUri, owner,
+             measuredAccuracy, estimatedAccuracy, taskDescription,
+             measurementMethod, manufacturer, version, cpuTrackingMode, gpuTrackingMode,
+             averageUtilizationCpu, averageUtilizationGpu, powerCalibrationMeasurement,
+             durationCalibrationMeasurement, powerConsumption,
+             measurementDuration, measurementDateTime,
+             osystem, distribution, distributionVersion,
+             language, version_software,
+             infraType, cloudProvider, cloudInstance, cloudService, componentName, componentType,
+             nbComponent, memorySize, manufacturer_infra, family,
+             series, share,
+             country, latitude, longitude, location,
+             powerSupplierType, powerSource, powerSourceCarbonIntensity,
+             quality
+         )
+
+     except Exception as e:
+         return f"Error: {e}", None, "", gr.Button("Share your data to the public repository", interactive=False, elem_classes="pubbutton")

      # Check if the message indicates validation failure
-     if message.startswith("The following fields are required"):
-         return message, file_output, json_output
+     if message.startswith("The json file does not correspond"):
+         publish_button = gr.Button(
+             "Share your data to the public repository", interactive=False, elem_classes="pubbutton")
+         return message, file_path, json_output, publish_button

      publish_button = gr.Button(
          "Share your data to the public repository", interactive=True, elem_classes="pubbutton")

-     return "Report sucessefully created", file_output, json_output, publish_button
+     return "Report successfully created", file_path, json_output, publish_button


- def handle_publi(json_output):
-     # If validation passed, proceed to update_dataset
-     update_output = update_dataset(json_output)
-     return update_output
+ def handle_publi(file_path, json_output):
+     """Handle publication to the Hugging Face dataset with improved error handling."""
+     try:
+         if not file_path or not json_output:
+             return "Error: No file or data to publish."
+
+         # If validation passed, proceed to update_dataset
+         update_output = update_dataset(file_path, json_output)
+         return update_output
+     except Exception as e:
+         return f"Error during publication: {str(e)}"


  # Create Gradio interface
@@ -57,30 +184,49 @@ with gr.Blocks(css_paths=css_path) as app:
      submit_button = gr.Button("Submit", elem_classes="subbutton")
      output = gr.Textbox(label="Output", lines=1)
      json_output = gr.Textbox(visible=False)
-     file_output = gr.File(label="Downloadable JSON")
+     json_file = gr.File(label="Downloadable JSON")
      publish_button = gr.Button(
          "Share your data to the public repository", interactive=False, elem_classes="pubbutton")

-     # Event Handlers
+     # Event Handlers - Optimized input flattening
+     def flatten_inputs(components):
+         """
+         Flatten nested lists of components with improved performance.
+         Uses an iterative, stack-based approach for better memory efficiency.
+         """
+         result = []
+         stack = list(reversed(components))  # Use a stack to avoid recursion
+
+         while stack:
+             item = stack.pop()
+             if isinstance(item, list):
+                 # Add items in reverse order to maintain the original sequence
+                 stack.extend(reversed(item))
+             else:
+                 result.append(item)
+
+         return result
+
+     all_inputs = flatten_inputs(header_components + task_components + measures_components +
+                                 system_components + software_components + infrastructure_components +
+                                 environment_components + quality_components)
+
+     # Validate that the input count matches the expected structure
+     expected_count = form_parser.get_total_input_count()
+     if len(all_inputs) != expected_count:
+         print(
+             f"Warning: Input count mismatch. Expected {expected_count}, got {len(all_inputs)}")
+
      submit_button.click(
          handle_submit,
-         inputs=[
-             *header_components,
-             *task_components,
-             *measures_components,
-             *system_components,
-             *software_components,
-             *infrastructure_components,
-             *environment_components,
-             *quality_components,
-         ],
-         outputs=[output, file_output, json_output, publish_button]
+         inputs=all_inputs,
+         outputs=[output, json_file, json_output, publish_button]
      )
      # Event Handlers
      publish_button.click(
          handle_publi,
          inputs=[
-             json_output
+             json_file, json_output
          ],
          outputs=[output]
      )
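
The `flatten_inputs` helper above normalizes the nested component lists returned by the `create_*_tab` builders into the flat list that Gradio's `click()` expects. A minimal standalone sketch of its behavior, with plain strings standing in for Gradio components:

```python
# Standalone copy of the flatten_inputs logic from the diff above,
# exercised with plain values standing in for Gradio components.
def flatten_inputs(components):
    result = []
    stack = list(reversed(components))  # stack-based traversal, no recursion
    while stack:
        item = stack.pop()
        if isinstance(item, list):
            stack.extend(reversed(item))  # re-reverse to preserve ordering
        else:
            result.append(item)
    return result


print(flatten_inputs(["a", ["b", ["c", "d"]], "e"]))
# -> ['a', 'b', 'c', 'd', 'e']
```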
assets/utils/validation.py CHANGED
@@ -1,33 +1,74 @@
- from src.services.util import OBLIGATORY_FIELDS
-
-
- def validate_obligatory_fields(data):
-     """Validate that all required fields are present in the data."""
-     def find_field(d, field):
-         if field in d:
-             return d[field]
-         for k, v in d.items():
-             if isinstance(v, dict):
-                 result = find_field(v, field)
-                 if result is not None:
-                     return result
-             elif isinstance(v, list):
-                 for item in v:
-                     if isinstance(item, dict):
-                         result = find_field(item, field)
-                         if result is not None:
-                             return result
-         return None
-
-     missing_fields = []
-
-     for field in OBLIGATORY_FIELDS:
-         # if the field is mandatory, check if it is inside a mandatory section
-         value = find_field(data, field)
-         if not value and value != 0:  # Allow 0 as a valid value
-             missing_fields.append(field)
-
-     if missing_fields:
-         return False, f"The following fields are required: {', '.join(missing_fields)}"
-     return True, "All required fields are filled."
+ import json
+ from referencing import Registry, Resource
+ from jsonschema import Draft202012Validator
+ import requests
+
+
+ def fetch_json_from_url(url: str):
+     """Fetch JSON content from a GitHub raw URL."""
+     try:
+         response = requests.get(url, timeout=10)
+         response.raise_for_status()
+         return response.json()
+     except (requests.exceptions.RequestException, json.JSONDecodeError) as e:
+         print(f"Error fetching/parsing {url}: {e}")
+         return None
+
+
+ # GitHub URLs for the schemas
+ SCHEMA_URLS = {
+     "algorithm": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/algorithm_schema.json",
+     "dataset": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/dataset_schema.json",
+     "measure": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/measure_schema.json",
+     "hardware": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/hardware_schema.json",
+     "report": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/report_schema.json"
+ }
+
+
+ def load_schemas():
+     """Load all schemas from GitHub URLs."""
+     schemas = {}
+     for name, url in SCHEMA_URLS.items():
+         schemas[name] = fetch_json_from_url(url)
+     return schemas
+
+
+ def create_registry(schemas):
+     """Create a registry with all sub-schemas."""
+     sub_schema_names = ["algorithm", "dataset", "measure", "hardware"]
+     resources = [
+         (SCHEMA_URLS[name], Resource.from_contents(schemas[name]))
+         for name in sub_schema_names
+     ]
+     return Registry().with_resources(resources)
+
+
+ # Load schemas once at module import
+ _schemas = load_schemas()
+ _registry = create_registry(_schemas)
+
+
+ def validate_boamps_schema(instance):
+     """Validate an instance against the BoAmps report schema."""
+     # Create a validator using the pre-loaded schemas and registry
+     validator = Draft202012Validator(_schemas["report"], registry=_registry)
+
+     # Validate
+     if validator.is_valid(instance):
+         return True, "All required fields are filled & your report has the right format!"
+
+     # Build the error message
+     errors = list(validator.iter_errors(instance))
+     error_lines = [
+         f"The json file does not correspond to the schema, there are {len(errors)} errors:\n",
+         "-" * 50
+     ]
+
+     for err in errors:
+         error_lines.extend([
+             f"Error on data: {err.json_path}",
+             f"  --> {err.message}",
+             "-" * 50
+         ])
+
+     return False, "\n".join(error_lines)
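A usage sketch for the new validator; the report dict below is hypothetical and deliberately incomplete, so validation should fail and return the per-error breakdown built above (this assumes the schemas could be fetched at import time):

```python
from assets.utils.validation import validate_boamps_schema

# Hypothetical, deliberately minimal report; a real one carries many more fields.
report = {
    "header": {"reportId": "demo-001", "reportStatus": "draft"},
    "task": {"taskFamily": "textGeneration", "taskStage": "inference"},
}

is_valid, message = validate_boamps_schema(report)
print(is_valid)   # False when required sections are missing
print(message)    # success string, or one "Error on data: ..." block per violation
```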
src/services/dataset_upload.py ADDED
@@ -0,0 +1,37 @@
+ from huggingface_hub import HfApi, login
+ from src.services.util import HF_TOKEN, DATASET_NAME
+ import os
+
+
+ def init_huggingface():
+     """Initialize Hugging Face authentication."""
+     if HF_TOKEN is None:
+         raise ValueError(
+             "Hugging Face token not found in environment variables.")
+     login(token=HF_TOKEN)
+
+
+ def update_dataset(file_path, json_data):
+     """Update the Hugging Face dataset with new data."""
+
+     if json_data is None or json_data.startswith("The following fields are required"):
+         return json_data or "No data to submit. Please fill in all required fields."
+     try:
+         # Initialize Hugging Face authentication
+         init_huggingface()
+         api = HfApi()
+
+         short_filename = os.path.basename(file_path)
+
+         api.upload_file(
+             path_or_fileobj=file_path,
+             repo_id=DATASET_NAME,
+             path_in_repo=f"data/{short_filename}",
+             repo_type="dataset",
+             commit_message=f"Add new BoAmps report data: {short_filename}",
+             create_pr=True,
+         )
+
+     except Exception as e:
+         return f"Error updating dataset: {str(e)}"
+     return "Data submitted successfully and dataset updated! Consult the data here: https://huggingface.co/datasets/boavizta/open_data_boamps"
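
A sketch of how this module is meant to be called; the file path and JSON string below are hypothetical, and HF_TOKEN and DATASET_NAME must be configured via src/services/util.py. Note that `create_pr=True` above means the upload opens a pull request on the dataset repo rather than committing directly to main:

```python
from src.services.dataset_upload import update_dataset

# Hypothetical outputs of an earlier generate_json call
file_path = "/tmp/report_inference_textGeneration_publicCloud_demo-001.json"
json_output = '{"header": {"reportId": "demo-001"}}'

status = update_dataset(file_path, json_output)
print(status)  # success message with the dataset URL, or an error string
```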
src/services/form_parser.py ADDED
@@ -0,0 +1,147 @@
+ """
+ Form parser configuration and utilities for handling Gradio form inputs.
+ This module provides a centralized way to manage form structure and parsing.
+ """
+
+ from dataclasses import dataclass
+ from typing import List, Any, Tuple
+
+
+ @dataclass
+ class FormSection:
+     """Represents a section of the form with its field count."""
+     name: str
+     field_count: int
+     fields: List[str] = None
+
+
+ @dataclass
+ class DynamicSection:
+     """Represents a dynamic section with multiple rows and fields."""
+     name: str
+     fields: List[str]
+     max_rows: int = 5
+
+     @property
+     def total_components(self) -> int:
+         return len(self.fields) * self.max_rows
+
+
+ # Form structure configuration
+ FORM_STRUCTURE = [
+     FormSection("header", 11, [
+         "licensing", "formatVersion", "formatVersionSpecificationUri", "reportId",
+         "reportDatetime", "reportStatus", "publisher_name", "publisher_division",
+         "publisher_projectName", "publisher_confidentialityLevel", "publisher_publicKey"
+     ]),
+
+     FormSection("task_simple", 3, [
+         "taskFamily", "taskStage", "nbRequest"
+     ]),
+
+     DynamicSection("algorithms", [
+         "trainingType", "algorithmType", "algorithmName", "algorithmUri",
+         "foundationModelName", "foundationModelUri", "parametersNumber", "framework",
+         "frameworkVersion", "classPath", "layersNumber", "epochsNumber", "optimizer", "quantization"
+     ]),
+
+     DynamicSection("dataset", [
+         "dataUsage", "dataType", "dataFormat", "dataSize", "dataQuantity",
+         "shape", "source", "sourceUri", "owner"
+     ]),
+
+     FormSection("task_final", 3, [
+         "measuredAccuracy", "estimatedAccuracy", "taskDescription"
+     ]),
+
+     DynamicSection("measures", [
+         "measurementMethod", "manufacturer", "version", "cpuTrackingMode", "gpuTrackingMode",
+         "averageUtilizationCpu", "averageUtilizationGpu", "powerCalibrationMeasurement",
+         "durationCalibrationMeasurement", "powerConsumption", "measurementDuration", "measurementDateTime"
+     ]),
+
+     FormSection("system", 3, [
+         "osystem", "distribution", "distributionVersion"
+     ]),
+
+     FormSection("software", 2, [
+         "language", "version_software"
+     ]),
+
+     FormSection("infrastructure_simple", 4, [
+         "infraType", "cloudProvider", "cloudInstance", "cloudService"
+     ]),
+
+     DynamicSection("infrastructure_components", [
+         "componentName", "componentType", "nbComponent", "memorySize",
+         "manufacturer_infra", "family", "series", "share"
+     ]),
+
+     FormSection("environment", 7, [
+         "country", "latitude", "longitude", "location",
+         "powerSupplierType", "powerSource", "powerSourceCarbonIntensity"
+     ]),
+
+     FormSection("quality", 1, ["quality"])
+ ]
+
+
+ class FormParser:
+     """Utility class for parsing form inputs based on the form structure."""
+
+     def __init__(self):
+         self.structure = FORM_STRUCTURE
+
+     def parse_inputs(self, inputs: Tuple[Any, ...]) -> dict:
+         """
+         Parse form inputs into a structured dictionary.
+
+         Args:
+             inputs: Tuple of all form input values
+
+         Returns:
+             dict: Parsed form data organized by sections
+         """
+         parsed_data = {}
+         idx = 0
+
+         for section in self.structure:
+             if isinstance(section, FormSection):
+                 # Simple section - extract values directly
+                 section_data = inputs[idx:idx + section.field_count]
+                 if section.fields:
+                     parsed_data[section.name] = dict(
+                         zip(section.fields, section_data))
+                 else:
+                     parsed_data[section.name] = section_data
+                 idx += section.field_count
+
+             elif isinstance(section, DynamicSection):
+                 # Dynamic section - extract and reshape data
+                 flat_data = inputs[idx:idx + section.total_components]
+                 idx += section.total_components
+
+                 # Reshape flat data into field-organized lists
+                 section_data = {}
+                 for field_idx, field_name in enumerate(section.fields):
+                     start_pos = field_idx * section.max_rows
+                     end_pos = start_pos + section.max_rows
+                     section_data[field_name] = flat_data[start_pos:end_pos]
+
+                 parsed_data[section.name] = section_data
+
+         return parsed_data
+
+     def get_total_input_count(self) -> int:
+         """Get the total number of expected inputs."""
+         total = 0
+         for section in self.structure:
+             if isinstance(section, FormSection):
+                 total += section.field_count
+             elif isinstance(section, DynamicSection):
+                 total += section.total_components
+         return total
+
+
+ # Global parser instance
+ form_parser = FormParser()
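
Given FORM_STRUCTURE above, the parser expects 11 + 3 + 14×5 + 9×5 + 3 + 12×5 + 3 + 2 + 4 + 8×5 + 7 + 1 = 249 flat inputs. A sketch with a dummy tuple of that size standing in for real form values:

```python
from src.services.form_parser import form_parser

expected = form_parser.get_total_input_count()
print(expected)  # 249, given the FORM_STRUCTURE defined above

# Dummy flat tuple standing in for the values Gradio passes to handle_submit
inputs = tuple(f"v{i}" for i in range(expected))
parsed = form_parser.parse_inputs(inputs)

print(parsed["header"]["licensing"])           # 'v0' (first header field)
print(len(parsed["algorithms"]["framework"]))  # 5 (one slot per dynamic row)
```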
src/services/huggingface.py DELETED
@@ -1,244 +0,0 @@
- from huggingface_hub import login
- from datasets import load_dataset, Dataset, concatenate_datasets
- import json
- from src.services.util import HF_TOKEN, DATASET_NAME
-
-
- def init_huggingface():
-     """Initialize Hugging Face authentication."""
-     if HF_TOKEN is None:
-         raise ValueError(
-             "Hugging Face token not found in environment variables.")
-     login(token=HF_TOKEN)
-
-
- def update_dataset(json_data):
-     """Update the Hugging Face dataset with new data."""
-     if json_data is None or json_data.startswith("The following fields are required"):
-         return json_data or "No data to submit. Please fill in all required fields."
-
-     try:
-         data = json.loads(json_data)
-     except json.JSONDecodeError:
-         return "Invalid JSON data. Please ensure all required fields are filled correctly."
-
-     try:
-         dataset = load_dataset(DATASET_NAME, split="train")
-         print(dataset)
-     except:
-         dataset = Dataset.from_dict({})
-
-     new_data = create_flattened_data(data)
-     new_dataset = Dataset.from_dict(new_data)
-
-     if len(dataset) > 0:
-         print("dataset intitial")
-         print(dataset)
-         print("data to add ")
-         print(new_dataset)
-         updated_dataset = concatenate_datasets([dataset, new_dataset])
-     else:
-         updated_dataset = new_dataset
-
-     updated_dataset.push_to_hub(DATASET_NAME)
-     return "Data submitted successfully and dataset updated! Consult the data [here](https://huggingface.co/datasets/boavizta/BoAmps_data)"
-
-
- def create_flattened_data(data):
-     """Create a flattened data structure for the algorithms."""
-     # Handle algorithms
-     algorithms = data.get("task", {}).get("algorithms", [])
-     fields = ["trainingType", "algorithmType", "algorithmName", "algorithmUri", "foundationModelName", "foundationModelUri",
-               "parametersNumber", "framework", "frameworkVersion", "classPath", "layersNumber", "epochsNumber", "optimizer", "quantization"]
-     algorithms_data = {field: "| ".join(str(algo.get(
-         field)) for algo in algorithms if algo.get(field)) or "" for field in fields}
-     trainingType_str = algorithms_data["trainingType"]
-     algorithmType_str = algorithms_data["algorithmType"]
-     algorithmName_str = algorithms_data["algorithmName"]
-     algorithmUri_str = algorithms_data["algorithmUri"]
-     foundationModelName_str = algorithms_data["foundationModelName"]
-     foundationModelUri_str = algorithms_data["foundationModelUri"]
-     parametersNumber_str = algorithms_data["parametersNumber"]
-     framework_str = algorithms_data["framework"]
-     frameworkVersion_str = algorithms_data["frameworkVersion"]
-     classPath_str = algorithms_data["classPath"]
-     layersNumber_str = algorithms_data["layersNumber"]
-     epochsNumber_str = algorithms_data["epochsNumber"]
-     optimizer_str = algorithms_data["optimizer"]
-     quantization_str = algorithms_data["quantization"]
-
-     # Handle dataset
-     dataset = data.get("task", {}).get("dataset", [])
-     fields = ["dataUsage", "dataType", "dataFormat", "dataSize",
-               "dataQuantity", "shape", "source", "sourceUri", "owner"]
-     dataset_data = {field: "| ".join(
-         str(d.get(field)) for d in dataset if d.get(field)) or "" for field in fields}
-     dataUsage_str = dataset_data["dataUsage"]
-     dataType_str = dataset_data["dataType"]
-     dataFormat_str = dataset_data["dataFormat"]
-     dataSize_str = dataset_data["dataSize"]
-     dataQuantity_str = dataset_data["dataQuantity"]
-     shape_str = dataset_data["shape"]
-     source_str = dataset_data["source"]
-     sourceUri_str = dataset_data["sourceUri"]
-     owner_str = dataset_data["owner"]
-
-     # Handle measures
-     measures = data.get("measures", [])
-     fields = ["measurementMethod", "manufacturer", "version", "cpuTrackingMode", "gpuTrackingMode", "averageUtilizationCpu", "averageUtilizationGpu",
-               "powerCalibrationMeasurement", "durationCalibrationMeasurement", "powerConsumption", "measurementDuration", "measurementDateTime"]
-     measures_data = {field: "| ".join(str(measure.get(
-         field)) for measure in measures if measure.get(field)) or "" for field in fields}
-     measurementMethod_str = measures_data["measurementMethod"]
-     manufacturer_str = measures_data["manufacturer"]
-     version_str = measures_data["version"]
-     cpuTrackingMode_str = measures_data["cpuTrackingMode"]
-     gpuTrackingMode_str = measures_data["gpuTrackingMode"]
-     averageUtilizationCpu_str = measures_data["averageUtilizationCpu"]
-     averageUtilizationGpu_str = measures_data["averageUtilizationGpu"]
-     powerCalibrationMeasurement_str = measures_data["powerCalibrationMeasurement"]
-     durationCalibrationMeasurement_str = measures_data["durationCalibrationMeasurement"]
-     powerConsumption_str = measures_data["powerConsumption"]
-     measurementDuration_str = measures_data["measurementDuration"]
-     measurementDateTime_str = measures_data["measurementDateTime"]
-
-     # Handle components
-     components = data.get("infrastructure", {}).get("components", [])
-     fields = ["componentName", "componentType", "nbComponent", "memorySize",
-               "manufacturer", "family", "series", "share"]
-
-     # Generate concatenated strings for each field
-     component_data = {field: "| ".join(str(comp.get(
-         field)) for comp in components if comp.get(field)) or "" for field in fields}
-
-     componentName_str = component_data["componentName"]
-     componentType_str = component_data["componentType"]
-     nbComponent_str = component_data["nbComponent"]
-     memorySize_str = component_data["memorySize"]
-     manufacturer_infra_str = component_data["manufacturer"]
-     family_str = component_data["family"]
-     series_str = component_data["series"]
-     share_str = component_data["share"]
-
-     return {
-         # Header
-         "licensing": [data.get("header", {}).get("licensing", "")],
-         "formatVersion": [data.get("header", {}).get("formatVersion", "")],
-         "formatVersionSpecificationUri": [data.get("header", {}).get("formatVersionSpecificationUri", "")],
-         "reportId": [data.get("header", {}).get("reportId", "")],
-         "reportDatetime": [data.get("header", {}).get("reportDatetime", "")],
-         "reportStatus": [data.get("header", {}).get("reportStatus", "")],
-         "publisher_name": [data.get("header", {}).get("publisher", {}).get("name", "")],
-         "publisher_division": [data.get("header", {}).get("publisher", {}).get("division", "")],
-         "publisher_projectName": [data.get("header", {}).get("publisher", {}).get("projectName", "")],
-         "publisher_confidentialityLevel": [data.get("header", {}).get("publisher", {}).get("confidentialityLevel", "")],
-         "publisher_publicKey": [data.get("header", {}).get("publisher", {}).get("publicKey", "")],
-
-         # Task
-         "taskStage": [data.get("task", {}).get("taskStage", "")],
-         "taskFamily": [data.get("task", {}).get("taskFamily", "")],
-         "nbRequest": [data.get("task", {}).get("nbRequest", "")],
-         # Algorithms
-         "trainingType": [trainingType_str],
-         "algorithmType": [algorithmType_str],
-         "algorithmName": [algorithmName_str],
-         "algorithmUri": [algorithmUri_str],
-         "foundationModelName": [foundationModelName_str],
-         "foundationModelUri": [foundationModelUri_str],
-         "parametersNumber": [parametersNumber_str],
-         "framework": [framework_str],
-         "frameworkVersion": [frameworkVersion_str],
-         "classPath": [classPath_str],
-         "layersNumber": [layersNumber_str],
-         "epochsNumber": [epochsNumber_str],
-         "optimizer": [optimizer_str],
-         "quantization": [quantization_str],
-         # Dataset
-         "dataUsage": [dataUsage_str],
-         "dataType": [dataType_str],
-         "dataFormat": [dataFormat_str],
-         "dataSize": [dataSize_str],
-         "dataQuantity": [dataQuantity_str],
-         "shape": [shape_str],
-         "source": [source_str],
-         "sourceUri": [sourceUri_str],
-         "owner": [owner_str],
-         "measuredAccuracy": [data.get("task", {}).get("measuredAccuracy", "")],
-         "estimatedAccuracy": [data.get("task", {}).get("estimatedAccuracy", "")],
-         "taskDescription": [data.get("task", {}).get("taskDescription", "")],
-
-         # Measures
-         "measurementMethod": [measurementMethod_str],
-         "manufacturer": [manufacturer_str],
-         "version": [version_str],
-         "cpuTrackingMode": [cpuTrackingMode_str],
-         "gpuTrackingMode": [gpuTrackingMode_str],
-         "averageUtilizationCpu": [averageUtilizationCpu_str],
-         "averageUtilizationGpu": [averageUtilizationGpu_str],
-         "powerCalibrationMeasurement": [powerCalibrationMeasurement_str],
-         "durationCalibrationMeasurement": [durationCalibrationMeasurement_str],
-         "powerConsumption": [powerConsumption_str],
-         "measurementDuration": [measurementDuration_str],
-         "measurementDateTime": [measurementDateTime_str],
-
-         # System
-         "os": [data.get("system", {}).get("os", "")],
-         "distribution": [data.get("system", {}).get("distribution", "")],
-         "distributionVersion": [data.get("system", {}).get("distributionVersion", "")],
-
-         # Software
-         "language": [data.get("software", {}).get("language", "")],
-         "version_software": [data.get("software", {}).get("version_software", "")],
-
-         # Infrastructure
-         "infraType": [data.get("infrastructure", {}).get("infra_type", "")],
-         "cloudProvider": [data.get("infrastructure", {}).get("cloudProvider", "")],
-         "cloudInstance": [data.get("infrastructure", {}).get("cloudInstance", "")],
-         "cloudService": [data.get("infrastructure", {}).get("cloudService", "")],
-         "componentName": [componentName_str],
-         "componentType": [componentType_str],
-         "nbComponent": [nbComponent_str],
-         "memorySize": [memorySize_str],
-         "manufacturer_infra": [manufacturer_infra_str],
-         "family": [family_str],
-         "series": [series_str],
-         "share": [share_str],
-
-         # Environment
-         "country": [data.get("environment", {}).get("country", "")],
-         "latitude": [data.get("environment", {}).get("latitude", "")],
-         "longitude": [data.get("environment", {}).get("longitude", "")],
-         "location": [data.get("environment", {}).get("location", "")],
-         "powerSupplierType": [data.get("environment", {}).get("powerSupplierType", "")],
-         "powerSource": [data.get("environment", {}).get("powerSource", "")],
-         "powerSourceCarbonIntensity": [data.get("environment", {}).get("powerSourceCarbonIntensity", "")],
-
-         # Quality
-         "quality": [data.get("quality", "")],
-     }
-
-
- """
- def create_flattened_data(data):
-     out = {}
-
-     def flatten(x, name=''):
-         if type(x) is dict:
-             for a in x:
-                 flatten(x[a], name + a + '_')
-         elif type(x) is list:
-             i = 0
-             for a in x:
-                 flatten(a, name + str(i) + '_')
-                 i += 1
-         else:
-             out[name[:-1]] = x
-
-     flatten(data)
-     return out
- """
@@ -1,7 +1,67 @@
1
  import json
2
  import tempfile
3
  from datetime import datetime
4
- from assets.utils.validation import validate_obligatory_fields
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
5
 
6
 
7
  def generate_json(
@@ -20,7 +80,7 @@ def generate_json(
20
  durationCalibrationMeasurement, powerConsumption,
21
  measurementDuration, measurementDateTime,
22
  # System
23
- os, distribution, distributionVersion,
24
  # Software
25
  language, version_software,
26
  # Infrastructure
@@ -33,192 +93,154 @@ def generate_json(
33
  # Quality
34
  quality
35
  ):
36
- """Generate JSON data from form inputs."""
37
- # Process algorithms
38
- algorithms_list = []
39
- algorithm_fields = {"trainingType": trainingType, "algorithmType": algorithmType, "algorithmName": algorithmName, "algorithmUri": algorithmUri, "foundationModelName": foundationModelName, "foundationModelUri": foundationModelUri,
40
- "parametersNumber": parametersNumber, "framework": framework, "frameworkVersion": frameworkVersion, "classPath": classPath, "layersNumber": layersNumber, "epochsNumber": epochsNumber, "optimizer": optimizer, "quantization": quantization}
41
- nb_algo = 0
42
- # ça ça marche pas
43
- for f in algorithm_fields:
44
- nb_algo = max(nb_algo, len(algorithm_fields[f]))
45
- for i in range(nb_algo):
46
- algortithm = {}
47
- for f in algorithm_fields:
48
- if i < len(algorithm_fields[f]) and algorithm_fields[f][i]:
49
- algortithm[f] = algorithm_fields[f][i]
50
- algorithms_list.append(algortithm)
51
-
52
- # Process dataset
53
- dataset_list = []
54
- dataset_fields = {"dataUsage": dataUsage, "dataType": dataType, "dataFormat": dataFormat, "dataSize": dataSize,
55
- "dataQuantity": dataQuantity, "shape": shape, "source": source, "sourceUri": sourceUri, "owner": owner}
56
- nb_data = 0
57
- for f in dataset_fields:
58
- nb_data = max(nb_data, len(dataset_fields[f]))
59
- for i in range(nb_data):
60
- data = {}
61
- for f in dataset_fields:
62
- if i < len(dataset_fields[f]) and dataset_fields[f][i]:
63
- data[f] = dataset_fields[f][i]
64
- dataset_list.append(data)
65
-
66
- # Process measures
67
- measures_list = []
68
- measure_fields = {"measurementMethod": measurementMethod, "manufacturer": manufacturer, "version": version, "cpuTrackingMode": cpuTrackingMode,
69
- "gpuTrackingMode": gpuTrackingMode, "averageUtilizationCpu": averageUtilizationCpu, "averageUtilizationGpu": averageUtilizationGpu,
70
- "powerCalibrationMeasurement": powerCalibrationMeasurement, "durationCalibrationMeasurement": durationCalibrationMeasurement,
71
- "powerConsumption": powerConsumption, "measurementDuration": measurementDuration, "measurementDateTime": measurementDateTime}
72
- nb_measures = 0
73
- for f in measure_fields:
74
- nb_measures = max(nb_measures, len(measure_fields[f]))
75
- for i in range(nb_measures):
76
- measure = {}
77
- for f in measure_fields:
78
- if i < len(measure_fields[f]) and measure_fields[f][i]:
79
- measure[f] = measure_fields[f][i]
80
- measures_list.append(measure)
81
-
82
- # Process components
83
- components_list = []
84
- component_fields = {"componentName": componentName, "componentType": componentType, "nbComponent": nbComponent,
85
- "memorySize": memorySize, "manufacturer_infra": manufacturer_infra, "family": family,
86
- "series": series, "share": share}
87
- nb_components = 0
88
- for f in component_fields:
89
- nb_components = max(nb_components, len(component_fields[f]))
90
- for i in range(nb_components):
91
- component = {}
92
- for f in component_fields:
93
- if i < len(component_fields[f]) and component_fields[f][i]:
94
- component[f] = component_fields[f][i]
95
- components_list.append(component)
96
-
97
- # process report
98
- report = {}
99
-
100
- # Process header
101
- header = {}
102
- if licensing:
103
- header["licensing"] = licensing
104
- if formatVersion:
105
- header["formatVersion"] = formatVersion
106
- if formatVersionSpecificationUri:
107
- header["formatVersionSpecificationUri"] = formatVersionSpecificationUri
108
- if reportId:
109
- header["reportId"] = reportId
110
- if reportDatetime:
111
- header["reportDatetime"] = reportDatetime or datetime.now().isoformat()
112
- if reportStatus:
113
- header["reportStatus"] = reportStatus
114
-
115
- publisher = {}
116
- if publisher_name:
117
- publisher["name"] = publisher_name
118
- if publisher_division:
119
- publisher["division"] = publisher_division
120
- if publisher_projectName:
121
- publisher["projectName"] = publisher_projectName
122
- if publisher_confidentialityLevel:
123
- publisher["confidentialityLevel"] = publisher_confidentialityLevel
124
- if publisher_publicKey:
125
- publisher["publicKey"] = publisher_publicKey
126
-
127
- if publisher:
128
- header["publisher"] = publisher
129
-
130
- if header:
131
- report["header"] = header
132
-
133
- # proceed task
134
- task = {}
135
- if taskStage:
136
- task["taskStage"] = taskStage
137
- if taskFamily:
138
- task["taskFamily"] = taskFamily
139
- if nbRequest:
140
- task["nbRequest"] = nbRequest
141
- if algorithms_list:
142
- task["algorithms"] = algorithms_list
143
- if dataset_list:
144
- task["dataset"] = dataset_list
145
- if measuredAccuracy:
146
- task["measuredAccuracy"] = measuredAccuracy
147
- if estimatedAccuracy:
148
- task["estimatedAccuracy"] = estimatedAccuracy
149
- if taskDescription:
150
- task["taskDescription"] = taskDescription
151
- report["task"] = task
152
-
153
- # proceed measures
154
- if measures_list:
155
- report["measures"] = measures_list
156
-
157
- # proceed system
158
- system = {}
159
- if os:
160
- system["os"] = os
161
- if distribution:
162
- system["distribution"] = distribution
163
- if distributionVersion:
164
- system["distributionVersion"] = distributionVersion
165
- if system:
166
- report["system"] = system
167
-
168
- # proceed software
169
- software = {}
170
- if language:
171
- software["language"] = language
172
- if version_software:
173
- software["version"] = version_software
174
- if software:
175
- report["software"] = software
176
-
177
- # proceed infrastructure
178
- infrastructure = {}
179
- if infraType:
180
- infrastructure["infraType"] = infraType
181
- if cloudProvider:
182
- infrastructure["cloudProvider"] = cloudProvider
183
- if cloudInstance:
184
- infrastructure["cloudInstance"] = cloudInstance
185
- if cloudService:
186
- infrastructure["cloudService"] = cloudService
187
- if components_list:
188
- infrastructure["components"] = components_list
189
- report["infrastructure"] = infrastructure
190
-
191
- # proceed environment
192
- environment = {}
193
- if country:
194
- environment["country"] = country
195
- if latitude:
196
- environment["latitude"] = latitude
197
- if longitude:
198
- environment["longitude"] = longitude
199
- if location:
200
- environment["location"] = location
201
- if powerSupplierType:
202
- environment["powerSupplierType"] = powerSupplierType
203
- if powerSource:
204
- environment["powerSource"] = powerSource
205
- if powerSourceCarbonIntensity:
206
- environment["powerSourceCarbonIntensity"] = powerSourceCarbonIntensity
207
- if environment:
208
- report["environment"] = environment
209
-
210
- # proceed quality
211
- if quality:
212
- report["quality"] = quality
213
-
214
- # Validate obligatory fields
215
- is_valid, message = validate_obligatory_fields(report)
216
- if not is_valid:
217
- return message, None, ""
218
  # Create the JSON string
219
- json_str = json.dumps(report)
220
- print(json_str)
221
- # Create and save the JSON file
222
- with tempfile.NamedTemporaryFile(mode='w', prefix="report", delete=False, suffix='.json') as file:
223
- json.dump(report, file, indent=4)
224
- return message, file.name, json_str
 
 
 
 
 
 
 
1
  import json
2
  import tempfile
3
  from datetime import datetime
4
+ from assets.utils.validation import validate_boamps_schema
5
+ from src.services.report_builder import ReportBuilder
6
+ import os
7
+
8
+
9
+ def process_component_list(fields_dict):
10
+ """
11
+ Fonction générique pour traiter une liste de composants à partir d'un dictionnaire de champs.
12
+
13
+ Args:
14
+ fields_dict (dict): Dictionnaire où les clés sont les noms des champs
15
+ et les valeurs sont des listes de composants Gradio ou des objets gr.State.
16
+
17
+ Returns:
18
+ list: Liste de dictionnaires représentant les composants.
19
+ """
20
+ component_list = []
21
+
22
+ # Extract values from different input types
23
+ processed_fields = {}
24
+ for field_name, field_values in fields_dict.items():
25
+ if hasattr(field_values, 'value'): # It's a gr.State object
26
+ processed_fields[field_name] = field_values.value if field_values.value else [
27
+ ]
28
+ elif isinstance(field_values, list) and len(field_values) > 0:
29
+ # It's a list of Gradio components or values
30
+ values = []
31
+ for item in field_values:
32
+ if hasattr(item, '__class__') and 'gradio' in str(item.__class__):
33
+ # It's a Gradio component, the value was passed as input to this function
34
+ # We need to handle this in the calling function by passing the values directly
35
+ values.append(item if item is not None else "")
36
+ else:
37
+ # It's already a value
38
+ values.append(item if item is not None else "")
39
+ processed_fields[field_name] = values
40
+ else:
41
+ processed_fields[field_name] = field_values if field_values else []
42
+
43
+ # Trouver le nombre maximum d'éléments parmi tous les champs
44
+ max_items = 0
45
+ for field_values in processed_fields.values():
46
+ if field_values:
47
+ max_items = max(max_items, len(field_values))
48
+
49
+ # Créer les composants
50
+ for i in range(max_items):
51
+ component = {}
52
+
53
+ for field_name, field_values in processed_fields.items():
54
+ if i < len(field_values):
55
+ value = field_values[i]
56
+ # Only add the field if it has a meaningful value (not empty, not just whitespace)
57
+ if value is not None and str(value).strip() != "":
58
+ component[field_name] = value
59
+
60
+ # Add component if it has any field (as requested by user)
61
+ if component:
62
+ component_list.append(component)
63
+
64
+ return component_list
65
 
66
 
67
  def generate_json(
 
80
  durationCalibrationMeasurement, powerConsumption,
81
  measurementDuration, measurementDateTime,
82
  # System
83
+ osystem, distribution, distributionVersion,
84
  # Software
85
  language, version_software,
86
  # Infrastructure
 
93
  # Quality
94
  quality
95
  ):
96
+ """Generate JSON data from form inputs using optimized ReportBuilder."""
97
+
98
+ try:
99
+ # Use ReportBuilder for cleaner, more maintainable code
100
+ builder = ReportBuilder()
101
+
102
+ # Build header section
103
+ header_data = {
104
+ "licensing": licensing,
105
+ "formatVersion": formatVersion,
106
+ "formatVersionSpecificationUri": formatVersionSpecificationUri,
107
+ "reportId": reportId,
108
+ "reportDatetime": reportDatetime or datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
109
+ "reportStatus": reportStatus,
110
+ "publisher_name": publisher_name,
111
+ "publisher_division": publisher_division,
112
+ "publisher_projectName": publisher_projectName,
113
+ "publisher_confidentialityLevel": publisher_confidentialityLevel,
114
+ "publisher_publicKey": publisher_publicKey
115
+ }
116
+ builder.add_header(header_data)
117
+
118
+ # Build task section
119
+ task_data = {
120
+ "taskStage": taskStage,
121
+ "taskFamily": taskFamily,
122
+ "nbRequest": nbRequest,
123
+ "measuredAccuracy": measuredAccuracy,
124
+ "estimatedAccuracy": estimatedAccuracy,
125
+ "taskDescription": taskDescription,
126
+ "algorithms": {
127
+ "trainingType": trainingType,
128
+ "algorithmType": algorithmType,
129
+ "algorithmName": algorithmName,
130
+ "algorithmUri": algorithmUri,
131
+ "foundationModelName": foundationModelName,
132
+ "foundationModelUri": foundationModelUri,
133
+ "parametersNumber": parametersNumber,
134
+ "framework": framework,
135
+ "frameworkVersion": frameworkVersion,
136
+ "classPath": classPath,
137
+ "layersNumber": layersNumber,
138
+ "epochsNumber": epochsNumber,
139
+ "optimizer": optimizer,
140
+ "quantization": quantization
141
+ },
142
+ "dataset": {
143
+ "dataUsage": dataUsage,
144
+ "dataType": dataType,
145
+ "dataFormat": dataFormat,
146
+ "dataSize": dataSize,
147
+ "dataQuantity": dataQuantity,
148
+ "shape": shape,
149
+ "source": source,
150
+ "sourceUri": sourceUri,
151
+ "owner": owner
152
+ }
153
+ }
154
+ builder.add_task(task_data)
155
+
156
+ # Build measures section
157
+ measures_data = {
158
+ "measurementMethod": measurementMethod,
159
+ "manufacturer": manufacturer,
160
+ "version": version,
161
+ "cpuTrackingMode": cpuTrackingMode,
162
+ "gpuTrackingMode": gpuTrackingMode,
163
+ "averageUtilizationCpu": averageUtilizationCpu,
164
+ "averageUtilizationGpu": averageUtilizationGpu,
165
+ "powerCalibrationMeasurement": powerCalibrationMeasurement,
166
+ "durationCalibrationMeasurement": durationCalibrationMeasurement,
167
+ "powerConsumption": powerConsumption,
168
+ "measurementDuration": measurementDuration,
169
+ "measurementDateTime": measurementDateTime
170
+ }
171
+ builder.add_measures(measures_data)
172
+
173
+ # Build system section
174
+ system_data = {
175
+ "osystem": osystem,
176
+ "distribution": distribution,
177
+ "distributionVersion": distributionVersion
178
+ }
179
+ builder.add_system(system_data)
180
+
181
+ # Build software section
182
+ software_data = {
183
+ "language": language,
184
+ "version_software": version_software
185
+ }
186
+ builder.add_software(software_data)
187
+
188
+ # Build infrastructure section
189
+ infrastructure_data = {
190
+ "infraType": infraType,
191
+ "cloudProvider": cloudProvider,
192
+ "cloudInstance": cloudInstance,
193
+ "cloudService": cloudService,
194
+ "components": {
195
+ "componentName": componentName,
196
+ "componentType": componentType,
197
+ "nbComponent": nbComponent,
198
+ "memorySize": memorySize,
199
+ "manufacturer": manufacturer_infra,
200
+ "family": family,
201
+ "series": series,
202
+ "share": share
203
+ }
204
+ }
205
+ builder.add_infrastructure(infrastructure_data)
206
+
207
+ # Build environment section
208
+ environment_data = {
209
+ "country": country,
210
+ "latitude": latitude,
211
+ "longitude": longitude,
212
+ "location": location,
213
+ "powerSupplierType": powerSupplierType,
214
+ "powerSource": powerSource,
215
+ "powerSourceCarbonIntensity": powerSourceCarbonIntensity
216
+ }
217
+ builder.add_environment(environment_data)
218
+
219
+ # Add quality
220
+ builder.add_quality(quality)
221
+
222
+ # Build the final report
223
+ report = builder.build()
224
+
225
+ # Validate that the schema follows the BoAmps format
226
+ is_valid, message = validate_boamps_schema(report)
227
+ if not is_valid:
228
+ return message, None, ""
229
+
230
+ # Create and save the JSON file
231
+ filename = f"report_{taskStage}_{taskFamily}_{infraType}_{reportId}.json"
232
+ filename = filename.replace(" ", "-")
233
+
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
234
  # Create the JSON string
235
+ json_str = json.dumps(report, indent=4, ensure_ascii=False)
236
+
237
+ # Write JSON to a temporary file with the desired filename
238
+ temp_dir = tempfile.gettempdir()
239
+ temp_path = os.path.join(temp_dir, filename)
240
+ with open(temp_path, "w", encoding="utf-8") as tmp:
241
+ tmp.write(json_str)
242
+
243
+ return message, temp_path, json_str
244
+
245
+ except Exception as e:
246
+ return f"Error generating JSON: {str(e)}", None, ""
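
To make the row-merging behavior of process_component_list concrete, a small sketch with hypothetical field rows (empty strings are dropped; each index across the lists becomes one component dict):

```python
from src.services.json_generator import process_component_list

# Hypothetical rows from the dynamic algorithm section
fields = {
    "algorithmName": ["bert-base", "resnet50", ""],
    "framework":     ["pytorch",   "",         ""],
    "epochsNumber":  ["3",         "10",       ""],
}

print(process_component_list(fields))
# [{'algorithmName': 'bert-base', 'framework': 'pytorch', 'epochsNumber': '3'},
#  {'algorithmName': 'resnet50', 'epochsNumber': '10'}]
```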
src/services/report_builder.py ADDED
@@ -0,0 +1,273 @@
+"""
+JSON processing utilities for BoAmps report generation.
+Provides optimized functions for data transformation and organization.
+"""
+
+from typing import Dict, List, Any, Optional
+
+
+def create_section_dict(data: Dict[str, Any], required_fields: List[str] = None) -> Dict[str, Any]:
+    """
+    Create a section dictionary, including only non-empty values.
+
+    Args:
+        data: Dictionary of field values
+        required_fields: List of fields that should always be included if provided
+
+    Returns:
+        Dictionary with non-empty values only, or empty dict if no meaningful values
+    """
+    section = {}
+    required_fields = required_fields or []
+
+    for key, value in data.items():
+        # Include only if it's a required field with meaningful value, or if it's meaningful
+        if key in required_fields and is_meaningful_value(value):
+            section[key] = value
+        elif key not in required_fields and is_meaningful_value(value):
+            section[key] = value
+
+    return section
+
+
+def is_meaningful_value(value: Any) -> bool:
+    """
+    Check if a value is meaningful (not empty, not just whitespace).
+
+    Args:
+        value: Value to check
+
+    Returns:
+        True if the value is meaningful, False otherwise
+    """
+    if value is None:
+        return False
+    if isinstance(value, str):
+        return value.strip() != ""
+    if isinstance(value, (int, float)):
+        return True
+    if isinstance(value, (list, dict)):
+        return len(value) > 0
+    return bool(value)
+
+
+def process_dynamic_component_list(field_data: Dict[str, List[Any]], max_rows: int = 5) -> List[Dict[str, Any]]:
+    """
+    Process dynamic component data into a list of component dictionaries.
+    Optimized version of the original process_component_list function.
+
+    Args:
+        field_data: Dictionary where keys are field names and values are lists of row values
+        max_rows: Maximum number of rows to process
+
+    Returns:
+        List of component dictionaries
+    """
+    components = []
+
+    # Find the actual number of rows with data
+    actual_rows = 0
+    for field_values in field_data.values():
+        if field_values:
+            # Count non-empty values from the end
+            for i in range(len(field_values) - 1, -1, -1):
+                if is_meaningful_value(field_values[i]):
+                    actual_rows = max(actual_rows, i + 1)
+                    break
+
+    # Create components for rows that have data
+    for row_idx in range(min(actual_rows, max_rows)):
+        component = {}
+
+        # Add fields that have meaningful values for this row
+        for field_name, field_values in field_data.items():
+            if row_idx < len(field_values) and is_meaningful_value(field_values[row_idx]):
+                component[field_name] = field_values[row_idx]
+
+        # Only add component if it has at least one field
+        if component:
+            components.append(component)
+
+    return components
+
+
+def create_publisher_section(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
+    """
+    Create publisher section with proper validation.
+
+    Args:
+        data: Dictionary containing all header data
+
+    Returns:
+        Publisher dictionary or None if no publisher data
+    """
+    publisher_fields = {
+        "name": data.get("publisher_name"),
+        "division": data.get("publisher_division"),
+        "projectName": data.get("publisher_projectName"),
+        "confidentialityLevel": data.get("publisher_confidentialityLevel"),
+        "publicKey": data.get("publisher_publicKey")
+    }
+
+    publisher = create_section_dict(
+        publisher_fields, required_fields=["confidentialityLevel"])
+    return publisher if publisher else None
+
+
+class ReportBuilder:
+    """
+    Builder class for creating BoAmps reports with optimized data processing.
+    """
+
+    def __init__(self):
+        self.report = {}
+
+    def add_header(self, header_data: Dict[str, Any]) -> 'ReportBuilder':
+        """Add header section to the report."""
+        header_fields = {
+            "licensing": header_data.get("licensing"),
+            "formatVersion": header_data.get("formatVersion"),
+            "formatVersionSpecificationUri": header_data.get("formatVersionSpecificationUri"),
+            "reportId": header_data.get("reportId"),
+            "reportDatetime": header_data.get("reportDatetime"),
+            "reportStatus": header_data.get("reportStatus")
+        }
+
+        header = create_section_dict(header_fields, required_fields=[
+            "reportId", "reportDatetime"])
+
+        # Add publisher if available
+        publisher = create_publisher_section(header_data)
+        if publisher:
+            header["publisher"] = publisher
+
+        if header:
+            self.report["header"] = header
+
+        return self
+
+    def add_task(self, task_data: Dict[str, Any]) -> 'ReportBuilder':
+        """Add task section to the report."""
+        task = {}
+
+        # Simple task fields
+        simple_fields = {
+            "taskStage": task_data.get("taskStage"),
+            "taskFamily": task_data.get("taskFamily"),
+            "nbRequest": task_data.get("nbRequest"),
+            "measuredAccuracy": task_data.get("measuredAccuracy"),
+            "estimatedAccuracy": task_data.get("estimatedAccuracy"),
+            "taskDescription": task_data.get("taskDescription")
+        }
+
+        task.update(create_section_dict(simple_fields,
+                    required_fields=["taskStage", "taskFamily"]))
+
+        # Process algorithms
+        if "algorithms" in task_data:
+            algorithms = process_dynamic_component_list(
+                task_data["algorithms"])
+            if algorithms:
+                task["algorithms"] = algorithms
+
+        # Process dataset
+        if "dataset" in task_data:
+            dataset = process_dynamic_component_list(task_data["dataset"])
+            if dataset:
+                task["dataset"] = dataset
+
+        self.report["task"] = task
+        return self
+
+    def add_measures(self, measures_data: Dict[str, List[Any]]) -> 'ReportBuilder':
+        """Add measures section to the report."""
+        measures = process_dynamic_component_list(measures_data)
+        if measures:
+            self.report["measures"] = measures
+        return self
+
+    def add_system(self, system_data: Dict[str, Any]) -> 'ReportBuilder':
+        """Add system section to the report."""
+        system_fields = {
+            "os": system_data.get("osystem"),
+            "distribution": system_data.get("distribution"),
+            "distributionVersion": system_data.get("distributionVersion")
+        }
+
+        system = create_section_dict(system_fields, required_fields=["os"])
+        # Only add system section if it has meaningful values
+        if system:
+            self.report["system"] = system
+        return self
+
+    def add_software(self, software_data: Dict[str, Any]) -> 'ReportBuilder':
+        """Add software section to the report."""
+        software_fields = {
+            "language": software_data.get("language"),
+            "version": software_data.get("version_software")
+        }
+
+        software = create_section_dict(
+            software_fields, required_fields=["language"])
+        # Only add software section if it has meaningful values
+        if software:
+            self.report["software"] = software
+        return self
+
+    def add_infrastructure(self, infra_data: Dict[str, Any]) -> 'ReportBuilder':
+        """Add infrastructure section to the report."""
+        infrastructure = {}
+
+        # Simple infrastructure fields
+        simple_fields = {
+            "infraType": infra_data.get("infraType"),
+            "cloudProvider": infra_data.get("cloudProvider"),
+            "cloudInstance": infra_data.get("cloudInstance"),
+            "cloudService": infra_data.get("cloudService")
+        }
+
+        # Add simple fields only if they have meaningful values
+        simple_infra = create_section_dict(
+            simple_fields, required_fields=["infraType"])
+        infrastructure.update(simple_infra)
+
+        # Process components
+        if "components" in infra_data:
+            components = process_dynamic_component_list(
+                infra_data["components"])
+            if components:
+                infrastructure["components"] = components
+
+        # Only add infrastructure section if it has meaningful content
+        if infrastructure:
+            self.report["infrastructure"] = infrastructure
+        return self
+
+    def add_environment(self, env_data: Dict[str, Any]) -> 'ReportBuilder':
+        """Add environment section to the report."""
+        env_fields = {
+            "country": env_data.get("country"),
+            "latitude": env_data.get("latitude"),
+            "longitude": env_data.get("longitude"),
+            "location": env_data.get("location"),
+            "powerSupplierType": env_data.get("powerSupplierType"),
+            "powerSource": env_data.get("powerSource"),
+            "powerSourceCarbonIntensity": env_data.get("powerSourceCarbonIntensity")
+        }
+
+        environment = create_section_dict(
+            env_fields, required_fields=["country"])
+        # Only add environment section if it has meaningful values
+        if environment:
+            self.report["environment"] = environment
+        return self
+
+    def add_quality(self, quality_value: Any) -> 'ReportBuilder':
+        """Add quality field to the report."""
+        if is_meaningful_value(quality_value):
+            self.report["quality"] = quality_value
+        return self
+
+    def build(self) -> Dict[str, Any]:
+        """Build and return the final report."""
+        return self.report
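
For readers skimming the new module: `ReportBuilder` is meant to be chained, and each `add_*` method silently drops sections with no meaningful values. A usage sketch with invented field values (only the keys come from the code above):

```python
from src.services.report_builder import ReportBuilder

report = (
    ReportBuilder()
    .add_header({
        "reportId": "example-report-id",          # normally a uuid4 string
        "reportDatetime": "2025-01-01 12:00:00",
        "publisher_confidentialityLevel": "public",
    })
    .add_task({"taskStage": "inference", "taskFamily": "textGeneration"})
    .add_system({"osystem": "linux", "distribution": "ubuntu"})
    .build()
)
# Sections that were never added (measures, software, ...) are simply
# absent from the resulting dict, as are empty fields.
```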
src/services/util.py CHANGED
@@ -2,16 +2,7 @@ import os
 
 # Hugging Face Configuration
 HF_TOKEN = os.environ.get("HF_TOKEN")
-DATASET_NAME = "boavizta/BoAmps_data"
-
-# Form Field Configurations
-# not used and verified for now
-MANDATORY_SECTIONS = ["task", "measures", "infrastructure"]
-OBLIGATORY_FIELDS = [
-    "taskStage", "taskFamily", "dataUsage", "dataType",
-    "measurementMethod", "powerConsumption", "infraType", "componentType",
-    "nbComponent"
-]
+DATASET_NAME = "boavizta/open_data_boamps"
 
 # Dropdown Options
 REPORT_STATUS_OPTIONS = ["draft", "final", "corrective", "other"]
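
The `DATASET_NAME` change above points the app at a different dataset repository; together with `HF_TOKEN` it is the whole configuration for the upload path. A hedged sketch of how such constants are typically consumed with `huggingface_hub` (the actual call site is not part of this hunk, and the file paths are illustrative):

```python
from huggingface_hub import HfApi

from src.services.util import DATASET_NAME, HF_TOKEN

api = HfApi(token=HF_TOKEN)
# create_pr=True opens a pull request on the dataset repo instead of
# committing straight to the main branch.
api.upload_file(
    path_or_fileobj="report.json",       # illustrative local file
    path_in_repo="reports/report.json",  # illustrative destination path
    repo_id=DATASET_NAME,
    repo_type="dataset",
    create_pr=True,
)
```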
src/ui/form_components.py CHANGED
@@ -1,4 +1,6 @@
+import uuid
 import gradio as gr
+import datetime
 from src.services.util import (
     REPORT_STATUS_OPTIONS, CONFIDENTIALITY_LEVELS, DATA_USAGE_OPTIONS, DATA_FORMAT,
     DATA_TYPES, DATA_SOURCE,
@@ -9,126 +11,150 @@ from src.services.util import (
 
 def create_dynamic_section(section_name, fields_config, initial_count=1, layout="row"):
     """
-    Creates a dynamic section in a Gradio interface where users can add or remove rows of input fields.
+    Creates a simplified dynamic section with a fixed number of pre-created rows.
+    This approach prioritizes data preservation over true dynamic functionality.
 
     Args:
         section_name (str): The name of the section (e.g., "Algorithms", "Components").
-        fields_config (list): A list of dictionaries defining the configuration for each field in the section.
-            Each dictionary should include:
-            - "type": The Gradio component type (e.g., gr.Textbox, gr.Number).
-            - "label": The label for the field.
-            - "info": Additional information or tooltip for the field.
-            - "value" (optional): The default value for the field.
-            - "kwargs" (optional): Additional keyword arguments for the component.
-            - "elem_classes" (optional): CSS classes for styling the field.
-        initial_count (int): The initial number of rows to render in the section.
+        fields_config (list): A list of dictionaries defining the configuration for each field.
+        initial_count (int): The initial number of rows to show (up to MAX_ROWS).
         layout (str): The layout of the fields in each row ("row" or "column").
 
     Returns:
-        tuple: A tuple containing:
-            - count_state: A Gradio state object tracking the number of rows.
-            - field_states: A list of Gradio state objects, one for each field, to store the values of the fields.
-            - add_btn: The "Add" button component for adding new rows.
+        tuple: A tuple containing states and the add button, compatible with existing code.
     """
-    # State management
-    # Tracks the number of rows in the section.
-    count_state = gr.State(value=initial_count+1)
-    # Stores the values for each field across all rows.
+    # Fixed number of rows - simple but reliable approach
+    MAX_ROWS = 5
+
+    # Create field states for compatibility with existing code
     field_states = [gr.State([]) for _ in fields_config]
-    # A list to store all dynamically generated components.
+
+    # Initialize field states with empty values for all possible rows
+    for field_state in field_states:
+        field_state.value = [""] * MAX_ROWS
+
+    # Create all rows upfront (some hidden initially)
    all_components = []
+    all_field_components = []  # Store all field components for event binding
 
-    def update_fields(*states_and_values):
-        """
-        Updates the state of the fields when a value changes.
-
-        Args:
-            *states_and_values: A combination of the current states and the new values for the fields.
-
-        Returns:
-            tuple: Updated states for all fields.
-        """
-        # Split states and current values
-        # Extract the current states for each field.
-        states = list(states_and_values[:len(fields_config)])
-        # Extract the new values for the fields.
-        current_values = states_and_values[len(fields_config):-1]
-        index = states_and_values[-1]  # The index of the row being updated.
-
-        # Update each field's state
-        for field_idx, (state, value) in enumerate(zip(states, current_values)):
-            # Ensure the state list is long enough to accommodate the current index.
-            while len(state) <= index:
-                state.append("")
-            # Update the value at the specified index.
-            state[index] = value if value is not None else ""
-
-        return tuple(states)
-
-    @gr.render(inputs=count_state)
-    def render_dynamic_section(count):
-        """
-        Renders the dynamic section with the current number of rows and their states.
-
-        Args:
-            count (int): The number of rows to render.
-
-        Returns:
-            list: A list of dynamically generated components for the section.
-        """
-        nonlocal all_components
-        all_components = []  # Reset the list of components for re-rendering.
-
-        for i in range(count):
-            # Create a row or column layout for the current row of fields.
+    for row_idx in range(MAX_ROWS):
+        # Use accordion instead of Group for better visibility control
+        # Show only initial_count rows at the beginning
+        is_visible = row_idx < initial_count
+
+        # Use accordion that's open for visible rows
+        with gr.Accordion(f"{section_name} {row_idx + 1}", open=is_visible, visible=is_visible) as group:
             with (gr.Row() if layout == "row" else gr.Column()):
-                row_components = []  # Components for the current row.
-                field_refs = []  # References to the current row's components.
+                row_components = []
 
                 for field_idx, config in enumerate(fields_config):
-                    # Create a component for the field using its configuration.
+                    # Create component
                     component = config["type"](
-                        label=f"{config['label']} ({section_name}{i + 1})",
+                        label=f"{config['label']} ({section_name} {row_idx + 1})",
                         info=config.get("info", ""),
                         value=config.get("value", ""),
                         **config.get("kwargs", {}),
                         elem_classes=config.get("elem_classes", "")
                     )
                     row_components.append(component)
-                    field_refs.append(component)
 
-                    # Create a change event to update the field states when the value changes.
-                    component.change(
-                        fn=update_fields,
-                        inputs=[*field_states, *field_refs, gr.State(i)],
-                        outputs=field_states
-                    )
+                    # Store component and indices for later event binding
+                    all_field_components.append(
+                        (component, field_idx, row_idx))
 
-                # Add a "Remove" button to delete the current row.
-                remove_btn = gr.Button("❌", variant="secondary")
-                remove_btn.click(
-                    lambda x, idx=i, fs=field_states: (
-                        max(0, x - 1),  # Decrease the count of rows.
-                        # Remove the row's values.
-                        *[fs[i].value[:idx] + fs[i].value[idx + 1:] for i in range(len(fs))]
-                    ),
-                    inputs=count_state,
-                    outputs=[count_state, *field_states]
-                )
+                # Add remove button for this row
+                remove_btn = gr.Button(
+                    "❌ Remove", variant="secondary", size="sm", visible=True)
                 row_components.append(remove_btn)
 
-            # Add the row's components to the list of all components.
-            all_components.extend(row_components)
-        return all_components
+        all_components.append((group, row_components))
 
-    # Initialize the section with the initial count of rows.
-    render_dynamic_section(count=initial_count)
+    # Visibility state
+    visible_count = gr.State(initial_count)
 
-    # Create an "Add" button to add new rows to the section.
+    # Add button
     add_btn = gr.Button(f"Add {section_name}")
-    add_btn.click(lambda x: x + 1, count_state, count_state)
 
-    return (count_state, *field_states, add_btn)
+    def handle_add(current_count):
+        """Show one more row if available"""
+        new_count = min(current_count + 1, MAX_ROWS)
+
+        # Update visibility for all groups
+        visibility_updates = []
+        for i in range(MAX_ROWS):
+            # For accordion, we need to update both visible and open states
+            visibility_updates.append(
+                gr.update(visible=(i < new_count), open=(i < new_count)))
+
+        return new_count, *visibility_updates
+
+    def handle_remove(current_count):
+        """Hide the last visible row"""
+        new_count = max(current_count - 1,
+                        1)  # Always keep at least 1 row visible
+
+        # Update visibility for all groups
+        visibility_updates = []
+        for i in range(MAX_ROWS):
+            # For accordion, we need to update both visible and open states
+            visibility_updates.append(
+                gr.update(visible=(i < new_count), open=(i < new_count)))
+
+        return new_count, *visibility_updates
+
+    # Connect add button
+    group_outputs = [group for group, _ in all_components]
+    add_btn.click(
+        fn=handle_add,
+        inputs=[visible_count],
+        outputs=[visible_count] + group_outputs
+    )
+
+    # Connect remove buttons for each row
+    for row_idx, (group, row_components) in enumerate(all_components):
+        remove_btn = row_components[-1]  # Remove button is the last component
+        remove_btn.click(
+            fn=handle_remove,
+            inputs=[visible_count],
+            outputs=[visible_count] + group_outputs
+        )
+
+    # Force initial visibility on interface load
+    def force_initial_visibility():
+        """Force initial visibility when the interface loads"""
+        visibility_updates = []
+        for i in range(MAX_ROWS):
+            visibility_updates.append(
+                gr.update(visible=(i < initial_count), open=(i < initial_count)))
+        return visibility_updates
+
+    # Create a simple info display
+    info_display = gr.Markdown(f"**{section_name}** (Max {MAX_ROWS} items)")
+
+    # Dummy count state for compatibility
+    count_state = gr.State(initial_count)
+
+    # Apply initial visibility immediately after component creation
+    if initial_count > 0:
+        # Use app load event to ensure visibility
+        for i, (group, _) in enumerate(all_components):
+            if i < initial_count:
+                group.visible = True
+                group.open = True
+
+    # Store the actual components to return instead of gr.State
+    components_to_return = []
+    for field_idx in range(len(fields_config)):
+        field_components = []
+        for row_idx in range(MAX_ROWS):
+            # Find the component for this field and row
+            for component, f_idx, r_idx in all_field_components:
+                if f_idx == field_idx and r_idx == row_idx:
+                    field_components.append(component)
+                    break
+        components_to_return.append(field_components)
+
+    return (count_state, *components_to_return, add_btn)
 
 
 def create_header_tab():
@@ -141,9 +167,13 @@ def create_header_tab():
         formatVersionSpecificationUri = gr.Textbox(
             label="Format Version Specification URI", info="(the URI of the present specification of this set of schemas)")
         reportId = gr.Textbox(
-            label="Report ID", info="(the unique identifier of this report, preferably as a uuid4 string)")
+            label="Report ID", info="(the unique identifier of this report, preferably as a uuid4 string)", value=str(uuid.uuid4()))
         reportDatetime = gr.Textbox(
-            label="Report Datetime", info="Required field<br>(the publishing date of this report in format YYYY-MM-DD HH:MM:SS)", elem_classes="mandatory_field")
+            label="Report Datetime",
+            info="Required field<br>(the publishing date of this report in format YYYY-MM-DD HH:MM:SS)",
+            elem_classes="mandatory_field",
+            value=datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+        )
         reportStatus = gr.Dropdown(value=None,
                                    label="Report Status",
                                    choices=REPORT_STATUS_OPTIONS,
@@ -259,7 +289,7 @@ def create_task_tab():
                 "info": "(the type of quantization used : fp32, fp16, b16, int8 ...)",
             }
         ],
-        initial_count=0,
+        initial_count=1,
         layout="column"
     )
 
@@ -323,7 +353,7 @@ def create_task_tab():
                 "info": "(the owner of the dataset if available)",
             }
         ],
-        initial_count=0,
+        initial_count=1,
         layout="column"
    )
 
@@ -421,7 +451,7 @@ def create_measures_tab():
                 "info": "(the date when the measurement began, in format YYYY-MM-DD HH:MM:SS)",
             }
         ],
-        initial_count=0,
+        initial_count=1,
         layout="column"
    )
 
@@ -520,7 +550,7 @@ def create_infrastructure_tab():
                 "info": "(the percentage of the physical equipment used by the task, this sharing property should be set to 1 by default (if no share) and otherwise to the correct percentage, e.g. 0.5 if you share half-time.)",
             }
         ],
-        initial_count=0,
+        initial_count=1,
         layout="column"
    )
 
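
Taken together, the two halves of this refactor meet in the middle: `create_dynamic_section` now exposes one list of components per field across the fixed `MAX_ROWS` rows, and `process_dynamic_component_list` in `report_builder.py` collapses the collected row values back into a list of dicts, trimming empty trailing rows. A self-contained sketch of that collapsing step (field names and values invented for illustration):

```python
from src.services.report_builder import process_dynamic_component_list

# One list of row values per field, as gathered from the five fixed rows.
field_data = {
    "algorithmName": ["randomForest", "xgboost", "", "", ""],
    "framework": ["scikit-learn", "", "", "", ""],
}

print(process_dynamic_component_list(field_data))
# [{'algorithmName': 'randomForest', 'framework': 'scikit-learn'},
#  {'algorithmName': 'xgboost'}]
```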