SmolLM2 - Sentence Completeness Detection

This model is a fine-tuned version of SmolLM2 for a custom sentence completeness classification task.
The goal of the model is to determine whether a given sentence is complete or has been cut off mid-thought, for example when a user accidentally presses Enter while still typing.


Use Case

This model is intended for applications where users enter free-form text and you want to:

  • Detect accidental sentence cutoffs
  • Improve UX in chatbots, note-taking apps, writing assistants, etc.
  • Prompt users to finish their thought when necessary

Task Definition

Given a single sentence, the model predicts:

  • label=1 → The sentence is complete
  • label=0 → The sentence is incomplete / cut off
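
The label convention can be sketched as a small standalone helper. This is a minimal illustration, assuming the model emits a single raw logit per sentence and that 0.5 is the decision threshold (both are suggested by the snippet in How to Use, but not otherwise documented). The names `logit_to_label` and `stable_sigmoid` are hypothetical and not part of the released code.

```python
import math
from typing import Tuple

LABELS = {0: "Incomplete", 1: "Complete"}


def stable_sigmoid(x: float) -> float:
    """Sigmoid that avoids overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logit_to_label(logit: float, threshold: float = 0.5) -> Tuple[int, str, float]:
    """Map a raw logit to (label id, label name, probability of 'complete').

    Assumption: a single logit per sentence, where a higher value means
    the sentence is more likely complete (label=1).
    """
    prob_complete = stable_sigmoid(logit)
    label = 1 if prob_complete > threshold else 0
    return label, LABELS[label], prob_complete
```

Under these assumptions, a positive logit maps to label=1 (complete) and a negative logit to label=0 (incomplete); a logit of exactly 0 falls on the threshold and is treated as incomplete here.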

Example Inputs

Sentence                                           Prediction
"I think we should meet tomorrow around 3pm at"    Incomplete (label=0)
"Let's catch up after the meeting tomorrow."       Complete (label=1)
"The best thing about this project is"             Incomplete (label=0)

How to Use

# pip install transformers torch lightning peft torchmetrics
from huggingface_hub import hf_hub_download
from torch import no_grad, sigmoid
import importlib.util
import sys

repo_id = "ZivK/smollm2-end-of-sentence"
model_name = "token_model.ckpt"
model_src_name = "model.py"
checkpoint_path = hf_hub_download(repo_id=repo_id, filename=model_name)
model_src_path = hf_hub_download(repo_id=repo_id, filename=model_src_name)

# Load the model class from the downloaded model.py
# (alternatively, download model.py yourself and import it directly)
spec = importlib.util.spec_from_file_location("SmolLM", model_src_path)
smollm_model = importlib.util.module_from_spec(spec)
sys.modules["smollm_model"] = smollm_model
spec.loader.exec_module(smollm_model)

device = "cuda"  # set to "cpu" if no GPU is available
label_map = {0: "Incomplete", 1: "Complete"}
model = smollm_model.SmolLM.load_from_checkpoint(checkpoint_path).to(device)
inputs = model.tokenizer("Gravity is", return_tensors="pt").to(device)
model.eval()
with no_grad():
    logits = model(inputs)
    probs = sigmoid(logits)
    prediction = (probs > 0.5).int().item()
    label = label_map[prediction]
    conf = probs.item() if probs.item() > 0.5 else 1 - probs.item()
print(f"Sentence is {label}, Confidence: {conf * 100:.1f}%")

Tip: Use the result to prompt users with "Do you want to finish your sentence?" or a similar message.
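
One way to act on this tip is to prompt only when the model is fairly sure the sentence was cut off. The sketch below is a hedged illustration rather than part of the released code: `completion_prompt` is a hypothetical helper, and the `min_confidence` default of 0.8 is an arbitrary assumption that should be tuned for your application.

```python
from typing import Optional


def completion_prompt(prob_complete: float, min_confidence: float = 0.8) -> Optional[str]:
    """Return a message for the user, or None when no prompt is needed.

    prob_complete is the sigmoid output of the model, i.e. the estimated
    probability that the sentence is complete. min_confidence is an
    assumed cutoff, not a value taken from the model card.
    """
    if not 0.0 <= prob_complete <= 1.0:
        raise ValueError("prob_complete must be in [0, 1]")
    prob_incomplete = 1.0 - prob_complete
    if prob_incomplete >= min_confidence:
        return "Do you want to finish your sentence?"
    return None
```

Requiring a higher confidence than the bare 0.5 threshold trades some missed cutoffs for fewer interruptions on sentences that were actually finished.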


Model Details

  • Base model: SmolLM2 (360M)
  • Fine-tuned on: A custom binary classification dataset of complete/incomplete sentences
  • Labels:
    • LABEL_0: Incomplete
    • LABEL_1: Complete

Limitations

  • The model assumes that an incomplete sentence will not end with a period or question mark. A sentence that ends with either will probably be classified as complete, even if it is not.
  • The limitations stated in SmolLM2's model card also apply:
    SmolLM2 models primarily understand and generate content in English. They can produce text on a variety of topics, but the generated content may not always be factually accurate, logically consistent, or free from biases present in the training data. These models should be used as assistive tools rather than definitive sources of information. Users should always verify important information and critically evaluate any generated content.
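
Because of the punctuation limitation, an application may want to treat a "Complete" prediction with less trust when the input ends in a period or question mark. The check below is a minimal sketch of how such inputs could be flagged. `ends_with_terminal_punctuation` is a hypothetical helper, and it covers only the two characters named in the limitation.

```python
# Only the characters named in the limitation; extend at your own discretion.
TERMINAL_PUNCTUATION = (".", "?")


def ends_with_terminal_punctuation(text: str) -> bool:
    """Return True if the text, ignoring trailing whitespace, ends with '.' or '?'."""
    return text.rstrip().endswith(TERMINAL_PUNCTUATION)
```

A caller could log or review predictions for which this returns True, since the model's output on such inputs is less informative.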