NuMarkdown-8B-Thinking-Demo / README.md

William Mattingly

init

f2533ce 3 months ago


	---
	title: NuMarkdown 8B Thinking Demo
	emoji: 🤖
	colorFrom: indigo
	colorTo: blue
	sdk: gradio
	sdk_version: 4.44.0
	app_file: app.py
	pinned: false
	short_description: Demo for NuMarkdown-8B-Thinking with reasoning capabilities
	---

	# 🤖 NuMarkdown-8B Reasoning Demo

	A Gradio-based web application that demonstrates the reasoning capabilities of the NuMarkdown-8B model from NuMind. Users upload an image and see both the model's detailed thinking process and its final analysis.

	## 🌟 Features

	- Visual Analysis: Upload any image for AI analysis
	- Reasoning Transparency: See the model's step-by-step thinking process
	- Clean Interface: Side-by-side layout with tall frames for better visibility
	- Zero GPU Integration: Optimized for HuggingFace Spaces with GPU acceleration
	- Real-time Processing: Automatic analysis when images are uploaded

	## 🚀 Quick Start

	### Option 1: Deploy to HuggingFace Spaces (Recommended)

	1. Create a new Space on [HuggingFace Spaces](https://huggingface.co/spaces)
	2. Choose:
	   - SDK: `Gradio`
	   - Hardware: `Zero GPU` (for best performance)
	3. Upload these files to your Space:
	   - `app.py`
	   - `requirements.txt`
	   - `README.md`
	4. Your Space will automatically build and deploy!

	### Option 2: Run Locally

	```bash
	# Clone this repository
	git clone <your-repo-url>
	cd NuMarkdown-8B-Thinking-Demo

	# Install dependencies
	pip install -r requirements.txt

	# Run the application
	python app.py
	```

	The app will be available at `http://localhost:7860` (Gradio's default port).

	## 🔧 Technical Details

	### Model Information
	- Model: `numind/NuMarkdown-8B-reasoning`
	- Type: Vision-Language Model with reasoning capabilities
	- Architecture: Qwen2.5-VL
	- Features: Structured thinking with `<think>` and `<answer>` tags
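
As a rough sketch of how output in this format could be separated into the two panels, the helper below splits a raw generation on the `<think>` and `<answer>` tags. The function name `split_reasoning` and the fallback behaviour for missing tags are assumptions for illustration, not code taken from `app.py`.

```python
import re


def split_reasoning(raw_output: str) -> tuple[str, str]:
    """Split raw model text into (reasoning, answer).

    Hypothetical helper: assumes the model wraps its reasoning in
    <think>...</think> and its final result in <answer>...</answer>.
    """
    think = re.search(r"<think>(.*?)</think>", raw_output, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", raw_output, re.DOTALL)

    reasoning_text = think.group(1).strip() if think else ""
    # If no <answer> tag is present, fall back to the whole output.
    answer_text = answer.group(1).strip() if answer else raw_output.strip()
    return reasoning_text, answer_text


sample = "<think>The image shows a table.</think><answer>| a | b |</answer>"
reasoning, final = split_reasoning(sample)
print(reasoning)
print(final)
```

The non-greedy `.*?` with `re.DOTALL` keeps each match confined to its own tag pair even when the reasoning spans several lines.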

	### Dependencies
	- Gradio 4.44.0: Web interface framework
	- PyTorch: Deep learning framework
	- Transformers: HuggingFace model library
	- Flash Attention 2: Optimized attention mechanism
	- Spaces: HuggingFace Zero GPU integration
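
On Zero GPU hardware, the `spaces` package is typically used by decorating the GPU-bound function with `spaces.GPU`. The sketch below shows that pattern with a fallback so the same file also runs where the package is not installed. The function `analyze_image` and its placeholder body are hypothetical and stand in for the real inference code in `app.py`.

```python
try:
    import spaces  # available on Hugging Face Spaces

    gpu = spaces.GPU
except ImportError:
    # Local fallback (assumption): a no-op decorator.
    def gpu(func):
        return func


@gpu
def analyze_image(image_description: str) -> str:
    """Hypothetical stand-in for the model call that needs a GPU."""
    return f"analysed: {image_description}"


print(analyze_image("invoice.png"))
```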

	## 📱 How to Use

	1. Upload an Image: Click on the image upload area on the left side
	2. Wait for Processing: The model will automatically analyze your image
	3. View Results:
	   - Reasoning Panel: See the model's detailed thinking process
	   - Answer Panel: Get the final conclusion or analysis

	## 🎯 Use Cases

	- Document Analysis: Analyze text, tables, charts, and diagrams
	- Educational Content: Understand complex visual information
	- Research: Extract insights from academic papers and figures
	- General Vision: Describe and analyze any visual content

	## 🛠️ Customization

	### Modify Generation Parameters

	In `app.py`, you can adjust:

	```python
	model_output = model.generate(
	    **model_input,
	    temperature=0.7,      # Sampling temperature: lower is more deterministic (typical range 0.1-1.0)
	    max_new_tokens=5000,  # Upper bound on generated tokens, reasoning included
	)
	```
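
One way to keep these settings in one place is a small helper that builds the keyword arguments passed to `model.generate`. This is a sketch rather than the code in `app.py`: the name `build_generation_kwargs` and the validation rules are assumptions. It also sets `do_sample`, since in Transformers the `temperature` argument only affects output when sampling is enabled.

```python
def build_generation_kwargs(temperature: float = 0.7, max_new_tokens: int = 5000) -> dict:
    """Return keyword arguments for model.generate (hypothetical helper)."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")

    if temperature == 0:
        # Greedy decoding: no sampling, so temperature is omitted.
        return {"do_sample": False, "max_new_tokens": max_new_tokens}
    return {
        "do_sample": True,
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
    }


generation_kwargs = build_generation_kwargs(temperature=0.3, max_new_tokens=2000)
print(generation_kwargs)
# Usage (assuming `model` and `model_input` exist as in app.py):
# model_output = model.generate(**model_input, **generation_kwargs)
```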

	### UI Customization

	The interface uses custom CSS for tall frames. Modify the `css` parameter in `gr.Blocks()` to adjust the layout.
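
As an illustration of that `css` parameter, the sketch below attaches a stylesheet to `gr.Blocks()` and targets components through `elem_classes`. The class name `tall-frame`, the 600px height, and the component labels are illustrative assumptions, not the values used in `app.py`.

```python
# Hypothetical CSS: the class name "tall-frame" and the 600px height are
# illustrative, not the values used in app.py.
TALL_FRAME_CSS = """
.tall-frame textarea {
    min-height: 600px;
}
"""


def build_demo():
    """Build a minimal side-by-side layout (sketch; requires gradio)."""
    import gradio as gr  # imported lazily so the CSS string is usable without gradio

    with gr.Blocks(css=TALL_FRAME_CSS) as demo:
        with gr.Row():
            with gr.Column():
                gr.Image(type="pil", label="Input image")
            with gr.Column():
                gr.Textbox(label="Reasoning", lines=20, elem_classes=["tall-frame"])
                gr.Textbox(label="Answer", lines=20, elem_classes=["tall-frame"])
    return demo


# To try it locally: build_demo().launch()
```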

	## 📊 Performance Notes

	- Zero GPU Spaces: Provides the best performance for this model
	- Memory Requirements: ~16GB VRAM recommended for optimal performance
	- Processing Time: Typically 10-30 seconds depending on image complexity
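
The ~16GB figure is consistent with a back-of-the-envelope estimate for the weights alone: roughly 8 billion parameters at 2 bytes each in 16-bit precision. The sketch below does that arithmetic. The parameter count is read off the "8B" in the model name and the 16-bit precision is an assumption; activations, the KV cache, and image tokens add to the total.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumptions: ~8e9 parameters ("8B") stored in a 16-bit dtype (2 bytes each).
print(weight_memory_gb(8e9))
```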

	## 🤝 Contributing

	Feel free to submit issues and enhancement requests!

	## 📄 License

	This project is open source. Please check the license of the underlying model (`numind/NuMarkdown-8B-reasoning`) for commercial use restrictions.

	## 🙏 Acknowledgments

	- NuMind for the amazing NuMarkdown-8B-reasoning model
	- HuggingFace for the Transformers library and Spaces platform
	- Gradio for the easy-to-use web interface framework

	---

	Built with ❤️ for the AI community