This repository provides the HuggingFaceTB/SmolVLM-256M-Instruct model in TFLite format. You can run it with a custom C++ pipeline or with a Python pipeline (see the Colab example below). Please note that, at the moment, AI Edge Torch VLMs are not supported by the MediaPipe LLM Inference API; this includes the qwen_vl model, which served as the reference for the SmolVLM-256M-Instruct conversion scripts.
## Use the models

### Colab

### C++ inference
```shell
mkdir cache
bazel run --verbose_failures -c opt //ai_edge_torch/generative/examples/cpp_image:text_generator_main -- \
  --tflite_model="/home/dragynir/ai_vlm/ai-edge-torch-smalvlm/ai_edge_torch/generative/examples/smalvlm/models/SmolVLM-256M-Instruct-tflite-single/smalvlm-256m-instruct_q8_ekv2048.tflite" \
  --sentencepiece_model="/home/dragynir/ai_vlm/ai-edge-torch-smalvlm/ai_edge_torch/generative/examples/smalvlm/models/SmolVLM-256M-Instruct-tflite/tokenizer.model" \
  --start_token="<|im_start|>" --stop_token="<end_of_utterance>" --num_threads=16 \
  --prompt="User:<image>What in the image?<end_of_utterance>\nAssistant:" \
  --weight_cache_path="/home/dragynir/llm/ai-edge-torch/ai_edge_torch/generative/examples/cpp/cache/model.xnnpack_cache" \
  --use_single_image=true --image_path="/home/dragynir/ai_vlm/car.jpg" --max_generated_tokens=64
```
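The `--prompt` flag above follows SmolVLM's chat layout: a `User:` turn containing the `<image>` placeholder and the question, closed by the `<end_of_utterance>` stop token, then an `Assistant:` cue for generation. A minimal sketch of that layout (`build_prompt` is a hypothetical helper for illustration, not part of the pipeline):

```python
# Hypothetical helper illustrating the SmolVLM prompt layout used by the
# C++ runner; the special tokens come from the command-line flags above.
EOU = "<end_of_utterance>"  # --stop_token


def build_prompt(question: str) -> str:
    """Build a single-image SmolVLM prompt.

    The <image> placeholder is expanded into image tokens inside the
    pipeline; here it stays as a literal marker.
    """
    return f"User:<image>{question}{EOU}\nAssistant:"


print(build_prompt("What in the image?"))
# User:<image>What in the image?<end_of_utterance>
# Assistant:
```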
## TFLite conversion

To fine-tune SmolVLM on a specific task, you can follow the fine-tuning tutorial. Then you can convert the model to TFLite using the custom smalvlm scripts (see Readme.md). You can also check the official ai-edge-torch generative documentation.
## Details

The model was converted with the following parameters:
```shell
python convert_to_tflite.py --quantize="dynamic_int8" \
  --checkpoint_path='./models/SmolVLM-256M-Instruct' --output_path="./models/SmolVLM-256M-Instruct-tflite" \
  --mask_as_input=True --prefill_seq_lens=256 --kv_cache_max_len=2048
```
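`dynamic_int8` stores the weights as 8-bit integers while activations stay in floating point at runtime. A rough back-of-envelope size estimate, assuming about 256M weights and ignoring quantization scales, activations, and the KV cache:

```python
# Back-of-envelope weight-storage estimate for dynamic int8 quantization
# (assumes ~256M parameters; real file sizes also include scales and metadata).
params = 256_000_000

fp32_mb = params * 4 / 1e6  # 4 bytes per float32 weight
int8_mb = params * 1 / 1e6  # 1 byte per int8 weight

print(f"fp32 ~{fp32_mb:.0f} MB, int8 ~{int8_mb:.0f} MB")
# fp32 ~1024 MB, int8 ~256 MB  -> roughly 4x smaller on disk
```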
## Model tree for litert-community/SmolVLM-256M-Instruct

Base model lineage: HuggingFaceTB/SmolLM2-135M → HuggingFaceTB/SmolLM2-135M-Instruct → HuggingFaceTB/SmolVLM-256M-Instruct (quantized here to TFLite).