πŸ–ΌοΈ Image Captioning Model

This is a deep learning-based image captioning model trained using a CNN Encoder + LSTM Decoder architecture. The model generates captions for input images based on visual features extracted by a Convolutional Neural Network (CNN).

πŸ“Œ Model Details

  • Model Type: Image Captioning
  • Architecture: CNN Encoder + LSTM Decoder
  • Framework: PyTorch
  • Input: Image (.jpg, .png, etc.)
  • Output: Generated caption (text)
  • Vocabulary: Pre-trained vocabulary file

πŸš€ How to Use

1️⃣ Install Dependencies

pip install torch torchvision transformers huggingface_hub pickle5
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support