metadata
title: Talking Head Backend
emoji: 🗣️
colorFrom: green
colorTo: blue
sdk: docker
app_port: 7860

Talking Head Backend

This Space hosts the backend for the Talking Head application. The frontend for this application can be found here (update this link if it does not point to your own frontend).

Setup for Hugging Face Space

To run this backend successfully on Hugging Face Spaces, you need to configure a few things:

  1. API Keys: This backend requires API keys for OpenAI and ElevenLabs. These must be set as Secrets in your Hugging Face Space settings. Navigate to your Space > Settings > Repository secrets (scroll down) and add the following secrets:

    • OPENAI_API_KEY: Your OpenAI API key.
    • ELEVENLABS_API_KEY: Your ElevenLabs API key.

    The application reads these from environment variables.

  2. Rhubarb Lip Sync: The application uses Rhubarb Lip Sync to generate lip sync data, so the rhubarb executable must be present in the bin/ directory of this repository. The Dockerfile copies the contents of the backend/bin/ directory. If rhubarb was at /Users/marcos/Documents/projects/talkinghead/backend/bin/rhubarb before the copy commands were run, it should be included in the Docker image at /home/node/app/bin/rhubarb.

    If you haven't already, download the Rhubarb Lip Sync binary for the target OS from here and place it in mineru_space/backend/bin/. Spaces typically run Linux x86-64, so you will most likely want the corresponding Linux binary. Make sure your local git commit includes the binary in that location, and re-copy it if necessary.

Local Development (Reminder from original backend/README.md)

For local development, remember to:

  1. Create a .env file in the backend sub-directory with your OPENAI_API_KEY and ELEVENLABS_API_KEY.
  2. Place the Rhubarb binary in backend/bin/.
  3. Run yarn install and yarn dev in the backend sub-directory.

This Space is configured to use the PORT environment variable, defaulting to 7860. Your index.js should respect process.env.PORT.
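
The configuration described above (two required API keys, and a PORT that falls back to 7860) can be checked before deploying. The following is a minimal preflight sketch in Python; it is separate from the Node backend, which reads process.env in index.js. The function names missing_secrets and resolve_port are hypothetical and not part of this repository.

```python
import os

# Names taken from the setup instructions above.
REQUIRED_SECRETS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")
DEFAULT_PORT = 7860


def missing_secrets(env):
    """Return the names of required secrets that are unset or blank."""
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]


def resolve_port(env):
    """Use PORT when it parses as an integer, otherwise fall back to 7860."""
    raw = env.get("PORT", "")
    try:
        return int(raw)
    except ValueError:
        return DEFAULT_PORT


if __name__ == "__main__":
    missing = missing_secrets(os.environ)
    if missing:
        print("Missing secrets:", ", ".join(missing))
    print("Port:", resolve_port(os.environ))
```

Running the script locally (with your .env values exported) or inside the container reports which secrets still need to be set and which port would be used.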

Endpoints

  • /chat: Handles text-based chat interactions.
  • /voice-chat: Handles voice-based chat interactions.
  • /voices: Lists available voices from ElevenLabs.


Features

  • Web interface for uploading and converting PDF files
  • API endpoint for programmatic access
  • High-quality PDF extraction with support for tables, formulas, and complex layouts
  • Output in both Markdown and structured JSON formats

API Usage

The service exposes a dedicated API endpoint for programmatic access:

PDF Conversion Endpoint

POST /api/convert

Request:

  • Content-Type: multipart/form-data
  • Body: form field 'file' containing the PDF file

Response:

{
  "success": true,
  "message": "PDF conversion successful",
  "job_id": "uuid",
  "base_filename": "filename",
  "markdown": "# Converted markdown content...",
  "json": { 
    "title": "Document Title",
    "sections": [...]
  },
  "log": "Processing log..."
}
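
A client will usually want to check the success flag before using the converted content. The sketch below shows one way to do that in Python, using only the field names from the response above. The failure shape is an assumption: the README documents only the successful response, so the code treats any falsy "success" as an error and reports "message" if present. The function name parse_conversion_response is hypothetical.

```python
import json


def parse_conversion_response(text):
    """Decode an /api/convert response body and return its main fields.

    Assumption: a failed conversion sets "success" to false and may
    include a "message"; only the success shape is documented above.
    """
    payload = json.loads(text)
    if not payload.get("success"):
        raise RuntimeError(payload.get("message", "PDF conversion failed"))
    return {
        "job_id": payload.get("job_id"),
        "base_filename": payload.get("base_filename"),
        "markdown": payload.get("markdown", ""),
        "structured": payload.get("json", {}),  # the "json" field of the response
    }
```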

Client Example

A Python client script (api_client.py) is included in this repository for easy integration:

# Example usage
python api_client.py path/to/your/document.pdf --api-url https://marcosremar2-mineru.hf.space

You can also use curl:

curl -X POST -F "file=@path/to/your/document.pdf" https://marcosremar2-mineru.hf.space/api/convert
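
If you prefer to call the endpoint from your own Python code, the request can be built with the standard library alone. This is a minimal sketch and not the contents of api_client.py, which are not shown here. The helper names build_multipart and convert_pdf are hypothetical, and the application/pdf part type is an assumption.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field, filename, data, content_type="application/pdf"):
    """Encode one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def convert_pdf(pdf_path, base_url):
    """POST the PDF to /api/convert and return the decoded JSON response."""
    path = Path(pdf_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/convert",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, convert_pdf("path/to/your/document.pdf", "https://marcosremar2-mineru.hf.space") sends the same request as the curl command above and returns the response as a dictionary.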

Web Interface

The Space also provides a web interface where you can:

  • Upload PDF files for conversion
  • View the generated Markdown and JSON
  • Download the converted files
  • View processing logs

Implementation Details

This service uses:

  • MinerU for high-quality PDF extraction
  • Flask web server for the interface and API
  • Docker container for deployment on Hugging Face Spaces

Learn More

For more information about MinerU, visit the MinerU repository.