Model Card for ldp72/Test-SmolLM-Marcel

This model was finetuned by instruction tuning on Telco domain datasets.

Model Details

Model Description

  • Developed by: Orange
  • Funded by [optional]: [More Information Needed]
  • Shared by [optional]: [More Information Needed]
  • Model type: [More Information Needed]
  • Language(s) (NLP): English
  • License: [More Information Needed]
  • Finetuned from model [optional]: HuggingFaceTB/SmolLM-135M-Instruct
  • Date [optional]: 2025-07-18 09:48:27

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Direct Use

This model can be used with the transformers library through its pipeline abstraction, as follows:

import torch
from transformers import pipeline

model_id = "ldp72/Test-SmolLM-Marcel"

# Build a text-generation pipeline; bfloat16 matches the published weights.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style input: a system prompt followed by the user question.
messages = [
    {"role": "system", "content": "You are chatbot specialized on Telco domain."},
    {"role": "user", "content": "Can you give a sample of your specialized knowledge?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)

# The last message of the returned conversation is the model's reply.
print(outputs[0]["generated_text"][-1])
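With chat-style input, the pipeline returns the whole conversation, and the element printed above is a dictionary rather than a plain string. A minimal sketch of a helper that extracts only the reply text follows. The name `last_assistant_text` and the sample data are hypothetical, and the output layout is assumed to match recent transformers releases.

```python
def last_assistant_text(outputs):
    """Return the reply text from a chat-style text-generation pipeline result.

    Assumes `outputs` looks like [{"generated_text": [message, ...]}], where
    each message is a dict with "role" and "content" keys.
    """
    conversation = outputs[0]["generated_text"]
    reply = conversation[-1]
    if reply.get("role") != "assistant":
        raise ValueError("last message is not an assistant reply")
    return reply["content"]


# Hand-written stand-in for a pipeline result (not real model output).
sample_outputs = [
    {
        "generated_text": [
            {"role": "user", "content": "Hello"},
            {"role": "assistant", "content": "placeholder reply"},
        ]
    }
]
print(last_assistant_text(sample_outputs))
```

In the snippet above, `last_assistant_text(outputs)` could replace the final indexing expression when only the generated text is needed.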

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
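One possible starting point, sketched under the assumption that the checkpoint loads with the standard `AutoTokenizer` and `AutoModelForCausalLM` classes and ships the chat template of its base model. The helper names `build_messages` and `generate_reply` are hypothetical, and the system prompt is copied verbatim from the Direct Use example.

```python
MODEL_ID = "ldp72/Test-SmolLM-Marcel"
SYSTEM_PROMPT = "You are chatbot specialized on Telco domain."  # copied from Direct Use


def build_messages(question, system_prompt=SYSTEM_PROMPT):
    """Wrap a user question in the chat format expected by the chat template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def generate_reply(question, model_id=MODEL_ID, max_new_tokens=256):
    """Load the model, generate a reply, and return only the new text."""
    # Imported lazily so build_messages stays usable without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    # Render the conversation with the chat template and tokenize it.
    inputs = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the generated continuation is decoded.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_reply("Can you give a sample of your specialized knowledge?")` downloads the weights on first use and runs on CPU unless the model is moved to another device.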

Training Details

This model was finetuned with Orange's internal fine-tuning tools, using the Docker image tagged 0.1.1 in the registry and the following configuration file:

data:
  dataset_name:
    train:
    -   path: telco-lm/arxiv-abstract-generation-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-dsp.stackexchange.com-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-networkengineering.stackexchange.com-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-security.stackexchange.com-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-technical-3gpp-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-technical-5gamericas-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-technical-huawei-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-technical-itu-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-technical-mef-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-technical-ngmn-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/synthetic-technical-rfc-multi-task-telco-instructions
        revision: legacy
    -   path: telco-lm/teleqna-mcqa-cot-telco-instructions
        revision: legacy
    -   path: telco-lm/tii-huawei-qa-open-qa-telco-instructions
        revision: legacy
    validation_abstract_generation:
    -   path: telco-lm/arxiv-abstract-generation-telco-instructions
        revision: legacy
        split: validation
    validation_general:
    -   path: telco-lm/slim-orca-multi-task-general-instructions
        revision: legacy
        split: validation
    validation_synthetic:
    -   path: telco-lm/synthetic-dsp.stackexchange.com-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-security.stackexchange.com-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-networkengineering.stackexchange.com-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-technical-rfc-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-technical-3gpp-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-technical-5gamericas-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-technical-itu-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-technical-mef-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-technical-huawei-multi-task-telco-instructions
        revision: legacy
        split: validation
    -   path: telco-lm/synthetic-technical-ngmn-multi-task-telco-instructions
        revision: legacy
        split: validation
    validation_telco_qa:
    -   path: telco-lm/tii-huawei-qa-open-qa-telco-instructions
        revision: legacy
        split: validation
    validation_telco_qcm:
    -   path: telco-lm/teleqna-mcqa-cot-telco-instructions
        revision: legacy
        split: validation
  debug: true
  implementation_name: instructions
description:
  contributors:
  -   email: [email protected]
      first_name: Loïc
      last_name: Fosse
  -   email: [email protected]
      first_name: Lionel
      last_name: Delphin-Poulat
  -   email: [email protected]
      first_name: Ismaël
      last_name: Rousseau
  domain: Telco
  languages:
  - en
  model_name: ldp72/Test-SmolLM-Marcel
image:
  version: 0.1.1
model:
  attn_implementation: flash_attention_2
  chat_template_tokenizer: HuggingFaceTB/SmolLM-135M-Instruct
  model_name_or_path: HuggingFaceTB/SmolLM-135M-Instruct
  trust_remote_code: true
training:
  bf16: true
  dataloader_num_workers: 4
  dataloader_persistent_workers: true
  dataloader_pin_memory: true
  dataloader_prefetch_factor: 2
  deepspeed: /config/zero3.json
  disable_tqdm: true
  eval_accumulation_steps: 1
  eval_steps: 10
  eval_strategy: steps
  fp16: false
  gradient_accumulation_steps: 2
  gradient_checkpointing: true
  group_by_length: false
  learning_rate: 2.0e-05
  log_level: debug
  logging_dir: /outputs/Telco-SmolLM-135-Instruct-it-non-reg/logs
  logging_steps: 10
  lr_scheduler_type: cosine
  max_grad_norm: 1.0
  max_steps: -1
  num_train_epochs: 2
  optim: paged_adamw_32bit
  output_dir: /outputs/Telco-SmolLM-135-Instruct-it-non-reg
  per_device_eval_batch_size: 2
  per_device_train_batch_size: 2
  push_to_hub: false
  report_to: tensorboard
  save_steps: 0
  save_strategy: epoch
  save_total_limit: 1
  seed: 42
  torch_compile: false
  training_type: instruct-tuning
  use_liger_kernel: false
  warmup_ratio: 0.05
  weight_decay: 0.1
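The batch-related settings in this configuration combine into an effective batch size per optimizer step. The sketch below shows the arithmetic; the number of GPUs is not stated in this card, so `world_size` is a hypothetical parameter and the values passed for it are illustrative only.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, world_size):
    """Number of training examples contributing to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * world_size


# per_device_train_batch_size=2 and gradient_accumulation_steps=2 come from
# the configuration; the GPU counts below are assumptions.
for assumed_gpus in (1, 8):
    print(assumed_gpus, effective_batch_size(2, 2, world_size=assumed_gpus))
```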

Training Data

This model was trained on the following datasets:

-   path: telco-lm/arxiv-abstract-generation-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-dsp.stackexchange.com-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-networkengineering.stackexchange.com-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-security.stackexchange.com-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-technical-3gpp-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-technical-5gamericas-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-technical-huawei-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-technical-itu-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-technical-mef-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-technical-ngmn-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/synthetic-technical-rfc-multi-task-telco-instructions
    revision: legacy
-   path: telco-lm/teleqna-mcqa-cot-telco-instructions
    revision: legacy
-   path: telco-lm/tii-huawei-qa-open-qa-telco-instructions
    revision: legacy
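Each entry pairs a dataset path with a pinned revision. A minimal sketch of how such entries could be loaded with the `datasets` library follows, assuming the repositories are accessible to the caller (they may be private) and that the training split is named `train`, which this card does not state. The helper names are hypothetical and only three of the listed paths are shown for brevity.

```python
# Subset of the datasets listed above, for brevity.
TRAIN_DATASETS = [
    "telco-lm/arxiv-abstract-generation-telco-instructions",
    "telco-lm/teleqna-mcqa-cot-telco-instructions",
    "telco-lm/tii-huawei-qa-open-qa-telco-instructions",
]


def load_kwargs(path, split="train", revision="legacy"):
    """Keyword arguments for datasets.load_dataset, pinned to one revision."""
    return {"path": path, "split": split, "revision": revision}


def load_training_sets(paths=TRAIN_DATASETS):
    """Load every listed dataset; requires network access and permissions."""
    # Imported lazily so load_kwargs can be inspected without the library.
    from datasets import load_dataset

    return {path: load_dataset(**load_kwargs(path)) for path in paths}
```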

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: This model was trained with the following hyperparameters for SFTTrainer; all other parameters were left at their defaults:
bf16: true
dataloader_num_workers: 4
dataloader_persistent_workers: true
dataloader_pin_memory: true
dataloader_prefetch_factor: 2
deepspeed: /config/zero3.json
disable_tqdm: true
eval_accumulation_steps: 1
eval_steps: 10
eval_strategy: steps
fp16: false
gradient_accumulation_steps: 2
gradient_checkpointing: true
group_by_length: false
learning_rate: 2.0e-05
log_level: debug
logging_dir: /outputs/Telco-SmolLM-135-Instruct-it-non-reg/logs
logging_steps: 10
lr_scheduler_type: cosine
max_grad_norm: 1.0
max_steps: -1
num_train_epochs: 2
optim: paged_adamw_32bit
output_dir: /outputs/Telco-SmolLM-135-Instruct-it-non-reg
per_device_eval_batch_size: 2
per_device_train_batch_size: 2
push_to_hub: false
report_to: tensorboard
save_steps: 0
save_strategy: epoch
save_total_limit: 1
seed: 42
torch_compile: false
use_liger_kernel: false
warmup_ratio: 0.05
weight_decay: 0.1
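The combination of `learning_rate`, `lr_scheduler_type: cosine`, and `warmup_ratio` describes a linear warmup followed by a cosine decay. The sketch below reproduces that shape in plain Python as an approximation of the schedule, not the training code itself. The total step count is not given in this card, so `TOTAL_STEPS` is a hypothetical value, and `warmup_steps` and `lr_at` are illustrative names.

```python
import math

LEARNING_RATE = 2.0e-05  # learning_rate from the hyperparameters
WARMUP_RATIO = 0.05      # warmup_ratio from the hyperparameters


def warmup_steps(total_steps, warmup_ratio=WARMUP_RATIO):
    """Number of warmup steps implied by the ratio, rounded up."""
    return math.ceil(total_steps * warmup_ratio)


def lr_at(step, total_steps, peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    warmup = warmup_steps(total_steps, warmup_ratio)
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 1000  # hypothetical; the real value depends on dataset size
for step in (0, 25, 50, 525, 1000):
    print(step, lr_at(step, TOTAL_STEPS))
```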

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

Thanks to Loïc Fosse, Lionel Delphin-Poulat, Ismaël Rousseau for adding this model.
