metadata
title: VLA Data Generator
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Generate VLA training data from videos using AI
VLA Data Generator
A TypeScript/React application for generating vision-language-action (VLA) training data using Google's Gemini AI. Users upload videos, and the app generates the corresponding action sequences and descriptions for training VLA models.
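To make the idea of "action sequences and descriptions" concrete, here is a minimal sketch of what one generated sample could look like and how it might be sanity-checked. The `VlaSample` and `ActionStep` types, their field names, and `validateSample` are hypothetical illustrations; this README does not specify the app's actual output schema.

```typescript
// Hypothetical shape of one generated sample; the real schema may differ.
interface ActionStep {
  startSec: number; // step start time within the video, in seconds
  endSec: number;   // step end time within the video, in seconds
  action: string;   // short action label, e.g. "grasp cup"
}

interface VlaSample {
  videoName: string;
  description: string;   // natural-language description of the clip
  actions: ActionStep[]; // action sequence in temporal order
}

// Returns a list of human-readable problems; an empty list means the sample looks sane.
function validateSample(sample: VlaSample): string[] {
  const problems: string[] = [];
  if (sample.description.trim() === "") {
    problems.push("description is empty");
  }
  let previousEnd = 0;
  sample.actions.forEach((step, i) => {
    if (step.startSec < 0 || step.endSec <= step.startSec) {
      problems.push(`step ${i}: invalid time range`);
    }
    if (step.startSec < previousEnd) {
      problems.push(`step ${i}: overlaps the previous step`);
    }
    previousEnd = Math.max(previousEnd, step.endSec);
  });
  return problems;
}
```

A check like this could run on each generated sample before it is exported, so that malformed sequences are caught early rather than during model training.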
Technology Stack
- TypeScript - Type-safe JavaScript development
- React 19 - Modern React with latest features
- Vite - Fast build tool and development server
- Google Gemini AI - AI-powered video analysis and action generation
Run Locally
Prerequisites: Node.js (v20 or higher)
Install dependencies:
npm install
Set up environment variables: Create a .env.local file and add your Gemini API key:
GEMINI_API_KEY=your_gemini_api_key_here
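A missing or placeholder key is an easy mistake to make at this step. The sketch below shows one way to fail fast with a clear message. `requireGeminiKey` is a hypothetical helper, not part of this project, and how the key actually reaches the app code depends on the project's Vite configuration, which is not shown here.

```typescript
// Hypothetical helper (not part of the project): validates the API key early.
type Env = Record<string, string | undefined>;

// The placeholder value from the setup instructions; treated as "not set".
const PLACEHOLDER_KEY = "your_gemini_api_key_here";

function requireGeminiKey(env: Env): string {
  const key = env.GEMINI_API_KEY?.trim();
  if (!key || key === PLACEHOLDER_KEY) {
    throw new Error(
      "GEMINI_API_KEY is missing or still set to the placeholder; add it to .env.local"
    );
  }
  return key;
}
```

In a Node context this could be called as `requireGeminiKey(process.env)` before any request to Gemini is attempted.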
Run the development server:
npm run dev
Build for production:
npm run build
Deploy to Hugging Face Spaces
This application can be deployed to Hugging Face Spaces using Docker.
Using the Dockerfile
- Ensure your GEMINI_API_KEY is set as a secret in your Hugging Face Space settings
- The included Dockerfile will handle the build and deployment process
- The app will be accessible on port 7860 (Hugging Face Spaces default)
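If the app is served by a small Node script inside the container, the port could be resolved as in the sketch below. This is an assumption for illustration only: `resolvePort` and the `PORT` environment variable are hypothetical, and the included Dockerfile may serve the app differently. The only value taken from this README is the default, 7860.

```typescript
// Hypothetical helper: picks the listening port, defaulting to the Space's app_port.
const DEFAULT_PORT = 7860;

function resolvePort(env: Record<string, string | undefined>): number {
  const raw = env.PORT; // PORT is an assumed override, not defined by this project
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_PORT;
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }
  return port;
}
```

Keeping the default equal to the `app_port` in the Space metadata means the container works on Hugging Face Spaces without extra configuration.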
Manual Deployment Steps
- Fork or upload this repository to Hugging Face Spaces
- Select "Docker" as the SDK
- Add your GEMINI_API_KEY as a secret in the Space settings
- The Space will automatically build and deploy using the provided Dockerfile