---
title: VLA Data Generator
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Generate VLA training data from videos using AI
---

# VLA Data Generator

A TypeScript/React application that generates vision-language-action (VLA) training data with Google's Gemini AI. Upload a video, and the app produces the corresponding action sequences and descriptions for training VLA models.
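To make "action sequences and descriptions" concrete, here is a minimal sketch of what one generated sample might look like. The `VlaSample` and `ActionStep` types, their field names, and the JSON Lines export are illustrative assumptions, not the app's actual schema.

```typescript
// Hypothetical shape of one generated training sample.
// Field names are assumptions for illustration only.
interface ActionStep {
  startSec: number; // when the action begins in the video
  endSec: number;   // when the action ends
  action: string;   // short label for the action
}

interface VlaSample {
  videoName: string;
  description: string;
  actions: ActionStep[];
}

// Return a list of problems; an empty list means the sample looks well-formed.
function validateSample(sample: VlaSample): string[] {
  const problems: string[] = [];
  if (sample.description.trim() === "") {
    problems.push("empty description");
  }
  let previousEnd = 0;
  sample.actions.forEach((step, i) => {
    if (step.endSec <= step.startSec) {
      problems.push(`step ${i}: end must be after start`);
    }
    if (step.startSec < previousEnd) {
      problems.push(`step ${i}: overlaps previous step`);
    }
    previousEnd = Math.max(previousEnd, step.endSec);
  });
  return problems;
}

// Serialize samples as JSON Lines: one JSON object per line.
function toJsonl(samples: VlaSample[]): string {
  return samples.map((s) => JSON.stringify(s)).join("\n");
}
```

Validating timestamps before export is one way to catch malformed model output early; whether the app does this is not stated here.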

## Technology Stack

- **TypeScript** - Type-safe JavaScript development
- **React 19** - Component-based UI library
- **Vite** - Fast build tool and development server
- **Google Gemini AI** - AI-powered video analysis and action generation

## Run Locally

**Prerequisites:** Node.js (v20 or higher)

1. Install dependencies:
   ```bash
   npm install
   ```

2. Set up environment variables:
   Create a `.env.local` file in the project root and add your Gemini API key:
   ```
   GEMINI_API_KEY=your_gemini_api_key_here
   ```

3. Run the development server:
   ```bash
   npm run dev
   ```

4. Build for production:
   ```bash
   npm run build
   ```
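A missing or placeholder API key is an easy setup mistake to make in step 2. The sketch below shows one way to fail fast with a clear message. `requireGeminiKey` is a hypothetical helper, not part of this repository, and how the app actually reads the key is not shown here.

```typescript
// Hypothetical helper: returns the Gemini API key or throws a clear error.
type Env = Record<string, string | undefined>;

function requireGeminiKey(env: Env): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  // Reject both a missing value and the placeholder from the example file.
  if (!key || key === "your_gemini_api_key_here") {
    throw new Error("GEMINI_API_KEY is missing; add it to .env.local");
  }
  return key;
}
```

In a Node.js context this could be called as `requireGeminiKey(process.env)`. Note that Vite exposes only `VITE_`-prefixed variables to client code by default, so a key named `GEMINI_API_KEY` has to reach the app some other way, for example through the build configuration.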

## Deploy to Hugging Face Spaces

This application can be deployed to Hugging Face Spaces using Docker.

### Using the Dockerfile

1. Set your `GEMINI_API_KEY` as a secret in your Hugging Face Space settings
2. The included Dockerfile handles the build and deployment
3. The app is served on port 7860 (the Hugging Face Spaces default, matching `app_port` in the front matter of this README)
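For the Space to reach the app, the server inside the container has to listen on all interfaces at port 7860. The sketch below builds such settings; `listenOptions` and the `PORT` override are hypothetical, and the option names merely mirror Vite's `server`/`preview` options (`host`, `port`, `strictPort`). How the included Dockerfile actually serves the app is not described here.

```typescript
// Hypothetical helper producing listen settings for a containerized server.
interface ListenOptions {
  host: string;
  port: number;
  strictPort: boolean;
}

function listenOptions(env: Record<string, string | undefined>): ListenOptions {
  // Assumed optional override; falls back to the Spaces port when unset or invalid.
  const parsed = Number.parseInt(env["PORT"] ?? "", 10);
  const valid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  return {
    host: "0.0.0.0",          // listen on all interfaces, not just localhost
    port: valid ? parsed : 7860,
    strictPort: true,         // fail instead of silently picking another port
  };
}
```

Setting `strictPort` avoids a situation where the server quietly starts on a different port than the one declared in `app_port`.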

### Manual Deployment Steps

1. Fork or upload this repository to Hugging Face Spaces
2. Select "Docker" as the SDK
3. Add your `GEMINI_API_KEY` as a secret in the Space settings
4. The Space will automatically build and deploy using the provided Dockerfile