huckiyang committed · Commit 9e02dae · 1 parent: c358966

[node] estimation

Files changed (2):
  1. README.md +19 -0
  2. app.py +4 -4
README.md CHANGED

@@ -89,6 +89,8 @@ python app.py
 | Qwen2-VL-7B | 1024/256 | 1 | Inference | FP16 | 1 |
 | VILA-1.5-13B | 2048/512 | 2 | Inference | BF16 | 1 |
 | Qwen2-Audio-7B | 1024/256 | 1 | Inference | FP16 | 1 |
+| PhysicsNeMo-FNO-Large | 512/128 | 8 | Training | FP32 | 1 |
+| PhysicsNeMo-GraphCast-Medium | 1024/256 | 4 | Training | FP16 | 1 |
 
 ## CUDA Recommendations
 
@@ -133,6 +135,12 @@ The application provides tailored CUDA version recommendations:
 - **Qwen-Audio**: Base, Chat variants
 - **Qwen2-Audio**: 7B
 
+#### Physics-ML Models (NVIDIA PhysicsNeMo)
+- **Fourier Neural Operators (FNO)**: Small (1M), Medium (10M), Large (50M)
+- **Physics-Informed Neural Networks (PINN)**: Small (0.5M), Medium (5M), Large (20M)
+- **GraphCast**: Small (50M), Medium (200M), Large (1B) - for weather/climate modeling
+- **Spherical FNO (SFNO)**: Small (25M), Medium (100M), Large (500M) - for global simulations
+
 ### Precision Impact
 - **FP32**: Full precision (4 bytes per parameter)
 - **FP16/BF16**: Half precision (2 bytes per parameter)
 
@@ -145,6 +153,17 @@ The application provides tailored CUDA version recommendations:
 - **Memory Overhead**: Additional memory for vision/audio encoders and cross-modal attention
 - **Token Estimation**: Consider multimodal inputs when calculating token counts
 
+### PhysicsNeMo Considerations
+- **Grid-Based Data**: Physics models work with spatial/temporal grids rather than text tokens
+- **Batch Training**: Physics-ML models typically require larger batch sizes for stable training
+- **Memory Patterns**: Different from LLMs - less KV cache, more gradient memory for PDE constraints
+- **Precision Requirements**: Many physics simulations require FP32 for numerical stability
+- **Use Cases**:
+  - **FNO**: Solving PDEs on regular grids (fluid dynamics, heat transfer)
+  - **PINN**: Physics-informed training with PDE constraints
+  - **GraphCast**: Weather prediction and climate modeling
+  - **SFNO**: Global atmospheric and oceanic simulations
+
 ## Limitations
 
 - Estimates are approximate and may vary based on:
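The "bytes per parameter" rule in the Precision Impact section reduces to simple arithmetic on the parameter count. A minimal sketch of that weight-memory estimate (the helper name and table are illustrative, not functions from app.py):

```python
# Bytes per parameter, per the README's Precision Impact section.
BYTES_PER_PARAM = {
    "FP32": 4,  # full precision
    "FP16": 2,  # half precision
    "BF16": 2,  # half precision
}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for model weights alone, in GB (decimal)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# e.g. the 50M-parameter PhysicsNeMo-FNO-Large in FP32:
print(weight_memory_gb(50e6, "FP32"))  # 0.2 GB of weights
```

Note this covers weights only; as the surrounding sections point out, training runs add optimizer state and gradient memory on top, and LLM inference adds KV cache.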
app.py CHANGED

@@ -272,10 +272,10 @@ def estimate_nodes_interface(
 
     # Validate inputs
     if input_tokens <= 0 or output_tokens <= 0:
-        return "Please enter valid token counts (> 0)", "", None, ""
+        return "Please enter valid token counts (> 0)", "", None, "## ⚠️ <span style='color: #E74C3C;'>**Invalid Input: Token counts must be > 0**</span>"
 
     if batch_size <= 0:
-        return "Please enter a valid batch size (> 0)", "", None, ""
+        return "Please enter a valid batch size (> 0)", "", None, "## ⚠️ <span style='color: #E74C3C;'>**Invalid Input: Batch size must be > 0**</span>"
 
     # Calculate node requirements
     nodes_needed, explanation, breakdown = estimate_h100_nodes(

@@ -288,7 +288,7 @@
     # Create performance chart
     fig = create_performance_chart(breakdown)
 
-    return explanation, cuda_rec, fig, f"**Estimated H100 Nodes Required: {nodes_needed}**"
+    return explanation, cuda_rec, fig, f"## 🖥️ <span style='color: #4A90E2;'>**Estimated H100 Nodes Required: {nodes_needed}**</span>"
 
 # Create Gradio interface
 def create_interface():

@@ -345,7 +345,7 @@ def create_interface():
         with gr.Column(scale=2):
             gr.Markdown("## Results")
 
-            node_count = gr.Markdown("**Ready to estimate...**")
+            node_count = gr.Markdown("## 🖥️ <span style='color: #4A90E2;'>**Ready to estimate...**</span>")
 
             with gr.Tab("📊 Detailed Analysis"):
                 detailed_output = gr.Markdown()
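The app.py changes above all follow one pattern: the fourth return value is a styled Markdown/HTML snippet that Gradio renders in the `node_count` box, replacing the previously empty error strings. A sketch of that formatting as standalone helpers (these helper names are hypothetical; app.py inlines the strings):

```python
# Hypothetical helpers mirroring the status strings this commit adds to
# estimate_nodes_interface; app.py writes these literals inline instead.

def format_node_status(nodes_needed: int) -> str:
    """Blue heading shown on a successful estimate."""
    return (
        "## 🖥️ <span style='color: #4A90E2;'>"
        f"**Estimated H100 Nodes Required: {nodes_needed}**</span>"
    )

def format_invalid_input(reason: str) -> str:
    """Red heading shown when validation fails (e.g. 'Batch size must be > 0')."""
    return (
        "## ⚠️ <span style='color: #E74C3C;'>"
        f"**Invalid Input: {reason}**</span>"
    )

print(format_node_status(2))
print(format_invalid_input("Batch size must be > 0"))
```

Rendering the status through `gr.Markdown` (rather than a plain label) is what makes the inline `<span>` color styling possible, since Gradio's Markdown component accepts embedded HTML.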