title: Talking Head Backend
emoji: 🗣️
colorFrom: green
colorTo: blue
sdk: docker
app_port: 7860
Talking Head Backend
This Space hosts the backend for the Talking Head application. The frontend for this application can be found here. (Please update if this link is not for your specific frontend).
Setup for Hugging Face Space
To run this backend successfully on Hugging Face Spaces, you need to configure a few things:
API Keys: This backend requires API keys for OpenAI and ElevenLabs. These must be set as Secrets in your Hugging Face Space settings. Navigate to your Space > Settings > Repository secrets (scroll down) and add the following secrets:
- OPENAI_API_KEY: Your OpenAI API key.
- ELEVENLABS_API_KEY: Your ElevenLabs API key.
The application reads these from environment variables.
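A startup check can catch a missing secret before the first request fails. The sketch below is only an illustration: the helper name missingKeys is hypothetical and not taken from this repository, and it assumes an empty or whitespace-only value should count as missing.

```javascript
// Names taken from the two secrets listed above.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "ELEVENLABS_API_KEY"];

// Hypothetical helper: returns the names of required variables that are
// unset, empty, or whitespace-only in the given environment object.
function missingKeys(env) {
  return REQUIRED_KEYS.filter((name) => !env[name] || env[name].trim() === "");
}

// Warn at startup instead of failing later inside a request handler.
const missing = missingKeys(process.env);
if (missing.length > 0) {
  console.warn(`Missing required secrets: ${missing.join(", ")}`);
}
```

Whether to warn or to exit at this point is a design choice; a warning keeps the Space booting so the logs stay reachable.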
Rhubarb Lip Sync: The application uses Rhubarb Lip Sync for generating lip sync data. Ensure the rhubarb executable is present in the bin/ directory of this repository. The Dockerfile copies the contents of the backend/bin/ directory, so if you placed rhubarb in /Users/marcos/Documents/projects/talkinghead/backend/bin/rhubarb before running the copy commands, it should be included in the Docker image at /home/node/app/bin/rhubarb.
If you haven't already, download the Rhubarb-Lip-Sync binary for your OS (likely Linux for the Space environment) from here and place it into mineru_space/backend/bin/. You might need to re-copy or ensure your local git commit includes this binary in the correct location. For a typical Linux x86-64 environment on Spaces, you'd want the corresponding Linux binary.
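Since a missing or non-executable binary only shows up once lip sync is requested, it can help to verify it at startup. This is a minimal sketch, assuming the process is started from the directory that contains bin/ (for example /home/node/app in the image); the helper name isExecutable is hypothetical.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: true when `file` exists and the current user may
// execute it, false otherwise.
function isExecutable(file) {
  try {
    fs.accessSync(file, fs.constants.X_OK);
    return true;
  } catch {
    return false;
  }
}

// Assumption: the working directory is the app root that holds bin/.
const rhubarbPath = path.join(process.cwd(), "bin", "rhubarb");
if (!isExecutable(rhubarbPath)) {
  console.warn(`Rhubarb not found or not executable at ${rhubarbPath}`);
}
```

If the check fails inside the container even though the file was committed, the executable bit may have been lost on the way into the image.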
Local Development (Reminder from original backend/README.md)
For local development, remember to:
- Create a .env file in the backend sub-directory with your OPENAI_API_KEY and ELEVENLABS_API_KEY.
- Place the Rhubarb binary in backend/bin/.
- Run yarn install and yarn dev in the backend sub-directory.
This Space is configured to use the PORT environment variable, defaulting to 7860. Your index.js should respect process.env.PORT.
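One way index.js could resolve the port is sketched below. The helper name resolvePort is hypothetical, and treating an unparsable or out-of-range value as "use the default" is an assumption rather than something this repository specifies.

```javascript
// Hypothetical helper: parse a PORT value, falling back to 7860 (the
// app_port of this Space) when it is unset or not a valid TCP port.
function resolvePort(value, fallback = 7860) {
  const port = Number.parseInt(value, 10);
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : fallback;
}

const port = resolvePort(process.env.PORT);
console.log(`Server should bind to port ${port}`);
```

The resulting value would then be passed to whatever server index.js creates, for example server.listen(port) on a Node http server.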
Endpoints
- /chat: Handles text-based chat interactions.
- /voice-chat: Handles voice-based chat interactions.
- /voices: Lists available voices from ElevenLabs.
(You can add more details about your API, how to use it, etc.)
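As a rough client-side sketch, the endpoint paths above can be joined with the Space URL as shown here. Only the three paths come from this README; the helper names are hypothetical, and the idea that /voices answers a plain GET with a JSON body is an assumption, since methods and payloads are not documented here. It relies on the global fetch available in Node 18 and later.

```javascript
// Paths from the list above; HTTP methods and payloads are not documented.
const ENDPOINTS = { chat: "/chat", voiceChat: "/voice-chat", voices: "/voices" };

// Hypothetical helper: join a Space base URL with one of the endpoint paths.
function endpointUrl(baseUrl, name) {
  if (!Object.prototype.hasOwnProperty.call(ENDPOINTS, name)) {
    throw new Error(`Unknown endpoint: ${name}`);
  }
  return new URL(ENDPOINTS[name], baseUrl).toString();
}

// Assumption: /voices responds to GET with JSON. Adjust if the backend differs.
async function listVoices(baseUrl) {
  const res = await fetch(endpointUrl(baseUrl, "voices"));
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}
```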
Features
- Web interface for uploading and converting PDF files
- API endpoint for programmatic access
- High-quality PDF extraction with support for tables, formulas, and complex layouts
- Output in both Markdown and structured JSON formats
API Usage
The service exposes a dedicated API endpoint for programmatic access:
PDF Conversion Endpoint
POST /api/convert
Request:
- Content-Type: multipart/form-data
- Body: form field 'file' containing the PDF file
Response:
{
  "success": true,
  "message": "PDF conversion successful",
  "job_id": "uuid",
  "base_filename": "filename",
  "markdown": "# Converted markdown content...",
  "json": {
    "title": "Document Title",
    "sections": [...]
  },
  "log": "Processing log..."
}
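The request and response described above could be handled from Node.js roughly as follows. This is a sketch, not the included client: the function names are hypothetical, it relies on the global fetch, FormData, and Blob available in Node 18 and later, and it assumes a failed conversion comes back with success set to something other than true, which this README does not confirm.

```javascript
// Build the multipart body described above: one form field named "file".
function buildConvertForm(pdfBytes, filename) {
  const form = new FormData();
  form.append("file", new Blob([pdfBytes], { type: "application/pdf" }), filename);
  return form;
}

// Hypothetical helper: pick out the fields shown in the sample response.
// Assumption: a failed conversion does not have success === true.
function readConversion(body) {
  if (!body || body.success !== true) {
    throw new Error(body && body.message ? body.message : "PDF conversion failed");
  }
  return { jobId: body.job_id, markdown: body.markdown, json: body.json, log: body.log };
}

// baseUrl is whichever Space URL hosts the service.
async function convertPdf(baseUrl, pdfBytes, filename) {
  const res = await fetch(new URL("/api/convert", baseUrl), {
    method: "POST",
    body: buildConvertForm(pdfBytes, filename),
  });
  return readConversion(await res.json());
}
```

The Content-Type header is deliberately left unset so that fetch can add the multipart boundary itself.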
Client Example
A Python client script (api_client.py) is included in this repository for easy integration:
# Example usage
python api_client.py path/to/your/document.pdf --api-url https://marcosremar2-mineru.hf.space
You can also use curl:
curl -X POST -F "file=@path/to/your/document.pdf" https://marcosremar2-mineru.hf.space/api/convert
Web Interface
The Space also provides a web interface where you can:
- Upload PDF files for conversion
- View the generated Markdown and JSON
- Download the converted files
- View processing logs
Implementation Details
This service uses:
- MinerU for high-quality PDF extraction
- Flask web server for the interface and API
- Docker container for deployment on Hugging Face Spaces
Learn More
For more information about MinerU, visit the MinerU repository.