---
title: VLA Data Generator
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Generate VLA training data from videos using AI
---

# VLA Data Generator

A TypeScript/React application for generating vision-language-action (VLA) training data using Google's Gemini AI. Users upload videos, and the app generates the corresponding action sequences and descriptions for training VLA models.

## Technology Stack

- **TypeScript** - Type-safe JavaScript development
- **React 19** - Modern React with latest features
- **Vite** - Fast build tool and development server
- **Google Gemini AI** - AI-powered video analysis and action generation

## Run Locally

**Prerequisites:** Node.js (v20 or higher)

1. Install dependencies:
   ```bash
   npm install
   ```

2. Set up environment variables:
   Create a `.env.local` file and add your Gemini API key:
   ```
   GEMINI_API_KEY=your_gemini_api_key_here
   ```

3. Run the development server:
   ```bash
   npm run dev
   ```

4. Build for production:
   ```bash
   npm run build
   ```

## Deploy to Hugging Face Spaces

This application can be deployed to Hugging Face Spaces using Docker.

### Using the Dockerfile

1. Ensure your `GEMINI_API_KEY` is set as a secret in your Hugging Face Space settings
2. The included Dockerfile will handle the build and deployment process
3. The app will be accessible on port 7860 (Hugging Face Spaces default)

### Manual Deployment Steps

1. Fork or upload this repository to Hugging Face Spaces
2. Select "Docker" as the SDK
3. Add your `GEMINI_API_KEY` as a secret in the Space settings
4. The Space will automatically build and deploy using the provided Dockerfile
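
### Checking the API Key (Sketch)

Both the local `.env.local` setup and the Space secret supply the same `GEMINI_API_KEY` value, so a missing or placeholder key is a likely first failure. The following is a minimal sketch of how that could be checked before starting the app. The helper names `parseEnv` and `requireGeminiKey` are hypothetical and not part of this repository, and the parser only handles simple `KEY=value` lines, not the full dotenv syntax.

```typescript
// Hypothetical helper: parse simple KEY=value lines such as those in .env.local.
// Blank lines and lines starting with "#" are skipped; quoting is not handled.
function parseEnv(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" or empty key: ignore the line
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim(); // keep any later "=" in the value
    result[key] = value;
  }
  return result;
}

// Hypothetical helper: return the key, or throw if it is absent or still
// the placeholder from the setup instructions.
function requireGeminiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"];
  if (!key || key === "your_gemini_api_key_here") {
    throw new Error("GEMINI_API_KEY is missing or still set to the placeholder value");
  }
  return key;
}

// Usage: validate the contents of an env file that has already been read.
const exampleEnv = parseEnv("# local settings\nGEMINI_API_KEY=example-key\n");
const geminiKey = requireGeminiKey(exampleEnv);
```

In a Node.js script the same check could be applied to `process.env` instead of parsed file contents, which would also cover the Space secret case, assuming the secret is exposed to the container as an environment variable.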