# 🖼️ Image Captioning Model
This is a deep learning image captioning model built on a CNN Encoder + LSTM Decoder architecture: a Convolutional Neural Network (CNN) extracts visual features from the input image, and an LSTM decoder generates a caption from those features.
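A typical encoder/decoder pair for this kind of model looks roughly like the sketch below. The class names (`EncoderCNN`, `DecoderRNN`), the ResNet-50 backbone, and sizes such as `embed_size` and `hidden_size` are illustrative assumptions, not the exact modules or hyperparameters of this checkpoint.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    """Extracts a fixed-size feature vector from an image with a pretrained CNN."""
    def __init__(self, embed_size):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Drop the final classification layer; keep the convolutional feature extractor.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        self.fc = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):
        with torch.no_grad():                    # the backbone is typically frozen
            features = self.backbone(images)     # (B, 2048, 1, 1)
        return self.fc(features.flatten(1))      # (B, embed_size)

class DecoderRNN(nn.Module):
    """Generates a caption token-by-token from the image feature vector."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Prepend the image feature as the first "token" of the sequence.
        embeddings = torch.cat([features.unsqueeze(1), self.embed(captions)], dim=1)
        hiddens, _ = self.lstm(embeddings)
        return self.fc(hiddens)                  # (B, T+1, vocab_size)
```

At inference time the decoder is run autoregressively, feeding each predicted token back in as the next input (see the usage sketch further below).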
## Model Details
- Model Type: Image Captioning
- Architecture: CNN Encoder + LSTM Decoder
- Framework: PyTorch
- Input: Image (`.jpg`, `.png`, etc.)
- Output: Generated caption (text)
- Vocabulary: Pre-trained vocabulary file (see the loading sketch after this list)
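If the vocabulary is distributed as a pickle file in this repository, it can be fetched and loaded roughly as follows. The repo id and the `vocab.pkl` filename are placeholders; use the actual names shown in the repository's file listing.

```python
import pickle
from huggingface_hub import hf_hub_download

# Hypothetical repo id and file name; substitute the ones from this repository's "Files" tab.
vocab_path = hf_hub_download(repo_id="<user>/<this-model>", filename="vocab.pkl")
with open(vocab_path, "rb") as f:
    vocab = pickle.load(f)   # e.g. an object mapping tokens <-> integer ids
```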
## How to Use
### 1️⃣ Install Dependencies
```bash
pip install torch torchvision transformers huggingface_hub pickle5
```
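Beyond installing the dependencies, a minimal greedy-decoding inference loop might look like the sketch below. It assumes the `encoder`, `decoder`, and `vocab` objects from the earlier sketches have been instantiated and loaded with this repository's weights, and that the vocabulary exposes an `idx2word` mapping; these names, like the preprocessing transform, are assumptions rather than the card's documented API.

```python
import torch
from PIL import Image
from torchvision import transforms

# Standard ImageNet preprocessing; the exact transform used during training may differ.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)  # (1, 3, 224, 224)

# `encoder`, `decoder`, and `vocab` are the hypothetical objects from the sketches above,
# with weights loaded from this repository's checkpoint files.
encoder.eval()
decoder.eval()

with torch.no_grad():
    features = encoder(image)                 # (1, embed_size)
    inputs = features.unsqueeze(1)            # (1, 1, embed_size)
    states, caption_ids = None, []
    for _ in range(20):                       # cap the caption length at 20 tokens
        hiddens, states = decoder.lstm(inputs, states)
        token = decoder.fc(hiddens.squeeze(1)).argmax(dim=-1)   # greedy choice
        caption_ids.append(token.item())
        inputs = decoder.embed(token).unsqueeze(1)              # feed prediction back in

# Map ids back to words with the vocabulary (the attribute name is an assumption).
caption = " ".join(vocab.idx2word[i] for i in caption_ids)
print(caption)
```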