metadata

title: VLA Data Generator
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Generate VLA training data from videos using AI

VLA Data Generator

A TypeScript/React application for generating vision-language-action (VLA) training data with Google's Gemini AI. Users upload a video, and the app generates the corresponding action sequences and descriptions for training VLA models.
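To make the idea of "action sequences and descriptions" concrete, here is a minimal sketch of what one generated sample could look like and how a model response might be validated before it is saved. The field names (`startSec`, `endSec`, `action`, `description`), the `VlaSample` shape, and the JSON Lines export are all assumptions for illustration; the app's actual output format may differ.

```typescript
// Hypothetical record shape; not taken from the app's source code.
interface ActionStep {
  startSec: number; // start of the step within the video, in seconds
  endSec: number; // end of the step, in seconds
  action: string; // short action label
  description: string; // natural-language description of the step
}

interface VlaSample {
  videoName: string;
  steps: ActionStep[];
}

// Narrow an unknown value (for example, parsed model output) to an ActionStep.
function isActionStep(value: unknown): value is ActionStep {
  if (typeof value !== "object" || value === null) return false;
  const { startSec, endSec, action, description } = value as Record<string, unknown>;
  return (
    typeof startSec === "number" &&
    typeof endSec === "number" &&
    startSec >= 0 &&
    endSec >= startSec &&
    typeof action === "string" &&
    typeof description === "string"
  );
}

// Parse a JSON array of steps and reject anything malformed.
function parseSteps(videoName: string, json: string): VlaSample {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed) || !parsed.every(isActionStep)) {
    throw new Error("Response is not a valid array of action steps");
  }
  return { videoName, steps: parsed };
}

// Serialize samples as JSON Lines, one sample per line.
function toJsonl(samples: VlaSample[]): string {
  return samples.map((s) => JSON.stringify(s)).join("\n");
}
```

Validating the response before export keeps malformed steps (missing fields, an end time before the start time) out of the training set.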

Technology Stack

TypeScript - Type-safe JavaScript development
React 19 - Component-based UI library
Vite - Fast build tool and development server
Google Gemini AI - AI-powered video analysis and action generation

Run Locally

Prerequisites: Node.js (v20 or higher)

Install dependencies:
```
npm install
```
Set up environment variables: Create a .env.local file and add your Gemini API key:
```
GEMINI_API_KEY=your_gemini_api_key_here
```
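A missing or placeholder key usually shows up only as a failed API request later on, so it can help to check it up front. The sketch below is a hypothetical helper (`requireEnv` is not part of this repository) that fails fast when the variable is absent, blank, or still set to the placeholder value shown above.

```typescript
// Hypothetical helper; the app may read its configuration differently.
type Env = Record<string, string | undefined>;

// Placeholder value from the .env.local example; treated as "not configured".
const PLACEHOLDER = "your_gemini_api_key_here";

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value || value === PLACEHOLDER) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

In a Node.js script this could be called as `requireEnv(process.env, "GEMINI_API_KEY")`. Passing the environment in as an argument keeps the helper easy to test.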
Run the development server:
```
npm run dev
```
Build for production:
```
npm run build
```

Deploy to Hugging Face Spaces

This application can be deployed to Hugging Face Spaces using Docker.

Using the Dockerfile

Ensure your GEMINI_API_KEY is set as a secret in your Hugging Face Space settings
The included Dockerfile will handle the build and deployment process
The app will be accessible on port 7860 (Hugging Face Spaces default)
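If the server that runs inside the container reads its port from the environment, the port handling could be sketched as follows. This is an assumption for illustration: the `PORT` variable and the `resolvePort` helper are hypothetical, and the included Dockerfile may simply hard-code 7860.

```typescript
// Default port for this Space, matching app_port in the Space metadata.
const DEFAULT_PORT = 7860;

// Hypothetical helper: use the given value if it is a valid TCP port,
// fall back to the default when it is unset or blank.
function resolvePort(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_PORT;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}
```

Whatever port the server ends up listening on must match the `app_port` value in the Space metadata, otherwise the Space will build but not serve the app.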

Manual Deployment Steps

Fork or upload this repository to Hugging Face Spaces
Select "Docker" as the SDK
Add your GEMINI_API_KEY as a secret in the Space settings
The Space will automatically build and deploy using the provided Dockerfile