metadata
title: VLA Data Generator
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Generate VLA training data from videos using AI

VLA Data Generator

A TypeScript/React application that generates vision-language-action (VLA) training data using Google's Gemini AI. Users upload videos, and the app generates matching action sequences and descriptions for training VLA models.
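
To make the idea of "action sequences and descriptions" concrete, here is a minimal sketch of what one generated training sample could look like. The `VlaStep` and `VlaSample` types, their field names, and the `toJsonl` helper are hypothetical illustrations, not the schema this app actually produces.

```typescript
// Hypothetical record shape. Field names are assumptions for illustration,
// not the app's actual output schema.
interface VlaStep {
  startSec: number;    // when the action begins in the video
  endSec: number;      // when the action ends
  action: string;      // short action label, e.g. "click" or "scroll"
  description: string; // natural-language description of the step
}

interface VlaSample {
  videoFile: string;
  steps: VlaStep[];
}

// Serialize samples as JSON Lines: one sample per line.
function toJsonl(samples: VlaSample[]): string {
  return samples.map((s) => JSON.stringify(s)).join("\n");
}

const sample: VlaSample = {
  videoFile: "demo.mp4",
  steps: [
    { startSec: 0, endSec: 1.5, action: "click", description: "Open the settings menu" },
    { startSec: 1.5, endSec: 4, action: "scroll", description: "Scroll to the bottom of the page" },
  ],
};
```

Calling `toJsonl([sample])` yields a single line of JSON. One sample per line is a common layout for training data because files can be appended to and streamed without parsing the whole file.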

Technology Stack

  • TypeScript - Type-safe JavaScript development
  • React 19 - Modern React with latest features
  • Vite - Fast build tool and development server
  • Google Gemini AI - AI-powered video analysis and action generation

Run Locally

Prerequisites: Node.js (v20 or higher)

  1. Install dependencies:

    npm install
    
  2. Set up environment variables: Create a .env.local file and add your Gemini API key:

    GEMINI_API_KEY=your_gemini_api_key_here
    
  3. Run the development server:

    npm run dev
    
  4. Build for production:

    npm run build
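
The environment setup in step 2 can be backed by a small startup check so that a missing key fails early with a clear message. The sketch below assumes the key is read from an environment-like object. `requireGeminiApiKey` is a hypothetical helper, not a function from this repository, and how the app actually reads the key is not shown here.

```typescript
// Hypothetical helper, not part of this repository.
// Accepts any env-like object so it works with process.env or a test double.
function requireGeminiApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();

  // Reject a missing or empty key, and the placeholder value from the README.
  if (!key || key === "your_gemini_api_key_here") {
    throw new Error(
      "GEMINI_API_KEY is not set. Add it to .env.local (local) or to the Space secrets (Hugging Face)."
    );
  }
  return key;
}
```

In a Node.js backend this could be called once at startup as `requireGeminiApiKey(process.env)`.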
    

Deploy to Hugging Face Spaces

This application can be deployed to Hugging Face Spaces using Docker.

Using the Dockerfile

  1. Ensure your GEMINI_API_KEY is set as a secret in your Hugging Face Space settings
  2. The included Dockerfile will handle the build and deployment process
  3. The app will be accessible on port 7860, the Hugging Face Spaces default, which matches the app_port value in the Space metadata
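
Because the Space expects the container to serve on port 7860, a backend can resolve its port in one place and fall back to that value. This is a minimal sketch. `resolvePort` and the `PORT` variable name are assumptions for illustration, and the repository's actual server code may handle this differently.

```typescript
// Default port expected by the Space (see app_port in the metadata).
const DEFAULT_PORT = 7860;

// Hypothetical helper: read an optional PORT override, else use the default.
// The PORT variable name is an assumption, not something this README defines.
function resolvePort(env: Record<string, string | undefined>): number {
  const raw = env["PORT"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_PORT;
  }

  const port = Number(raw);
  // Reject non-numeric, fractional, and out-of-range values.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }
  return port;
}
```

Inside a Docker container the server generally also needs to bind to `0.0.0.0` rather than `localhost`, otherwise the port is not reachable from outside the container.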

Manual Deployment Steps

  1. Fork or upload this repository to Hugging Face Spaces
  2. Select "Docker" as the SDK
  3. Add your GEMINI_API_KEY as a secret in the Space settings
  4. The Space will automatically build and deploy using the provided Dockerfile