OGAI-STEM-7B: AI-Powered Engineering Model for Oil & Gas Calculations

Model Description

OGAI-STEM-7B is a LoRA fine-tuned Mathstral-7B model, designed specifically for oil and gas engineering, scientific computing, and technical problem-solving. It is optimized for numerical accuracy, complex engineering calculations, and technical document understanding.

The model is an integral part of GainEnergy's Upstrima AI Platform, enhancing workflows with pragmatic AI agents, scientific computing tools, and retrieval-augmented generation (RAG)-based document analysis.

Technical Architecture

Base Model Specifications

  • Architecture: Mathstral-7B (a Mistral-7B variant fine-tuned for advanced mathematical reasoning)
  • Parameters: 7B
  • Context Length: 32,768 tokens for long-form scientific queries
  • Mathematical Precision: Enhanced for oil & gas engineering computations

Fine-tuning Approach

  • Method: Low-Rank Adaptation (LoRA) with rank 64 (see the configuration sketch after this list)
  • Training Dataset: 3.2M datapoints from specialized oil & gas engineering sources
  • Hardware: Trained on 8x NVIDIA A100 80GB GPUs
  • Training Time: 2,200 GPU hours
  • Special Features: Improved accuracy in fluid mechanics, pressure drop, and geomechanics calculations
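
The LoRA setup described above can be expressed with the peft library. The sketch below is illustrative only: the rank matches the value stated here, while the base-model repository id, alpha, dropout, and target modules are assumptions rather than the published training recipe.

# Illustrative rank-64 LoRA configuration with peft (not the exact training recipe)
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base-model repository id is assumed; adjust to the Mathstral checkpoint you use
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mathstral-7B-v0.1")

lora_config = LoraConfig(
    r=64,                      # rank 64, as stated above
    lora_alpha=128,            # assumed scaling factor
    lora_dropout=0.05,         # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only the low-rank adapter weights are trainable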

Performance Optimizations

  • Quantization: 4-bit and 8-bit versions optimized for low-memory inference (see the loading sketch after this list)
  • Inference Speed: Tuned KV cache management for real-time engineering computations
  • Memory Footprint: Runs efficiently on 12GB VRAM with 4-bit quantization
  • Reduced Hallucinations: Domain-specific fine-tuning minimizes incorrect scientific results
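
A minimal sketch of loading the 4-bit variant through transformers' BitsAndBytesConfig follows; the NF4 quantization type and compute dtype shown are common defaults and should be treated as assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization keeps the model within roughly 12GB of VRAM
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumed quantization type
    bnb_4bit_compute_dtype=torch.float16,   # assumed compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "GainEnergy/ogai-stem-7b",
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("GainEnergy/ogai-stem-7b")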

Deployment-Optimized Versions

Version | Memory Requirement | Performance
OGAI-STEM-7B-GGUF | CPU optimized | Suitable for edge computing
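
For the GGUF build, any llama.cpp-compatible runtime can serve the model on CPU-only hardware. The sketch below uses the llama-cpp-python bindings; the GGUF file name and context size are assumptions, so check the GGUF repository for the exact artifact names.

from llama_cpp import Llama

# Load a locally downloaded GGUF quantization for CPU/edge inference
# (file name is assumed; use the actual file published in the GGUF repo)
llm = Llama(model_path="ogai-stem-7b.Q4_K_M.gguf", n_ctx=8192)

output = llm(
    "Estimate the Reynolds number for crude oil flowing at 1.5 m/s in a 0.3 m pipe.",
    max_tokens=128,
)
print(output["choices"][0]["text"])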

Local Deployment with vLLM

# Serve the model with an OpenAI-compatible API (2-way tensor parallelism across GPUs)
python -m vllm.entrypoints.openai.api_server \
  --model GainEnergy/ogai-stem-7b \
  --tensor-parallel-size 2
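
The server exposes an OpenAI-compatible API (on port 8000 by default). A minimal client-side sketch with the openai Python package is shown below; the placeholder API key and sampling settings are assumptions.

from openai import OpenAI

# Point the OpenAI client at the local vLLM server; "EMPTY" is a placeholder key
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.completions.create(
    model="GainEnergy/ogai-stem-7b",
    prompt="Calculate the pressure drop in a 500m pipeline with a 10,000 BPD flow rate.",
    max_tokens=200,
    temperature=0.2,  # low temperature keeps numeric answers focused
)
print(response.choices[0].text)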

How to Use

Run Inference in Python

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GainEnergy/ogai-stem-7b"

# Load the tokenizer and model; device_map="auto" places weights on the available GPU(s)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Pose an engineering question and move the inputs to the same device as the model
prompt = "Calculate the pressure drop in a 500m pipeline with a 10,000 BPD flow rate."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate and decode the answer
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Citing OGAI-STEM-7B

@misc{ogai_stem_7b_2025,
  title={OGAI-STEM-7B: AI Model for Oil & Gas Scientific Computing},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}

Evaluation results

Metric | Dataset | Score (self-reported)
Engineering Calculations Accuracy | GainEnergy Oil & Gas Corpus | 94.5
Scientific Computation Precision | GainEnergy Oil & Gas Corpus | 92.3
Context Retention | GainEnergy Oil & Gas Corpus | High