MonaHamid committed on
Commit 0cafd22 · verified · 1 Parent(s): 6d08633

Update README.md

Files changed (1)
  1. README.md +46 -31
README.md CHANGED
@@ -12,53 +12,68 @@ pinned: true
- 📚 ExplainAnything.AI
- ExplainAnything.AI is an interactive multimodal science explainer built with Gradio. It allows users to:

- Ask science questions or upload diagrams/PDFs

- Get clear explanations using Google Gemini or Mistral

- Generate visual diagrams via FLUX (Stable Diffusion)

- Take auto-generated quizzes to test understanding

- Download personalized science reports in PDF or Markdown
 
- How It Works
- Ask a Question
- Type a science question (e.g., How do volcanoes erupt?)
- → Choose level: Kid, Beginner, or Advanced

- Upload a File (Optional)
- You can upload a PDF, image, or diagram to get contextual explanations

- View Explanation + Diagram
- The app generates a clear explanation and a visual illustration using generative models

- Take the Quiz
- Answer auto-generated multiple choice questions to reinforce learning

- Download Report
- Export everything to a PDF or Markdown file for later review
 
- 🔧 Technologies Used
- Gradio (UI)

- Google Gemini API (explanation from images + PDFs)

- Mistral via OpenRouter (text-based science explanations)

- Stable Diffusion FLUX (diagram generation)

- PDFPlumber + FPDF (report creation)

- Hugging Face Spaces (deployment)

- Environment Variables (set under Settings > Secrets)
- Make sure to add the following API keys as secrets in your Space:

- GEMINI_API_KEY
- OPENROUTER_API_KEY
- HF_TOKEN
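
The removed section above names three secrets the Space expected. Hugging Face Spaces exposes secrets to the app as environment variables, so a startup check could look roughly like the sketch below. This is not the app's actual code: the helper name `missing_secrets` and the printed messages are hypothetical, chosen only for illustration.

```python
import os

# Secret names taken from the README's "Environment Variables" section.
REQUIRED_SECRETS = ("GEMINI_API_KEY", "OPENROUTER_API_KEY", "HF_TOKEN")


def missing_secrets(env=None):
    """Return the required secret names that are unset or empty.

    Hypothetical helper: `env` defaults to os.environ, but any mapping
    can be passed in, which keeps the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing secrets:", ", ".join(missing))
    else:
        print("All secrets are set.")
```

Running a check like this before any model call would surface a missing key as one clear message rather than as a failed API request later on.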
 
+ # ExplainAnything.AI

+ **Track:** agent-demo-track

+ **ExplainAnything.AI** is a multimodal agent that helps users understand any science topic, either by **asking a question** or **uploading an image or PDF**. It then builds an interactive explainer package with:
+ - 🧠 Easy-to-understand explanation (via Mistral)
+ - 🖼️ Auto-generated visual diagram (via Flux)
+ - ❓ Quiz questions to test understanding (via Mistral)
+ - 📄 Downloadable report summarizing everything

+ ---
+
+ ## 🚀 How It Works
+
+ You have two options to get started:
+ 1. **Ask a question**, e.g. *"How do solar panels work?"*
+ 2. **Upload an image or PDF**, like a diagram or worksheet.
+
+ The agent then:
+ - Uses **Gemini Vision** (for image/PDF) or **your question** as input
+ - Generates an explanation using **Mistral**
+ - Creates a visual diagram with **Flux**
+ - Generates quiz questions using **Mistral**
+ - Compiles everything into a downloadable **learning report**
+
+ ---
 
+ ## 🚧 Build Status

+ This Space is currently **building**, but all logic, tools, and functionality were submitted **before the deadline**.

+ ---

+ ## 🎥 Video Overview

+ **Video Overview:** [Coming Soon – will be added post-deadline]

+ *A short walkthrough of the app’s flow and learning experience will be uploaded here.*

+ ---
 
+ ## 🛠️ Tech Stack

+ - **Mistral** – for science explanations + quiz generation
+ - **Flux** – for diagram/image generation
+ - **Gemini Vision** – for reading image and PDF content
+ - **Gradio** – chat + upload interface
+ - Manual orchestration (no MCP yet)

+ ---
 
+ ## 📘 Use Cases

+ - 🧑‍🎓 Students learning STEM with visual + interactive help
+ - 👩‍🏫 Teachers turning textbook pages into visual lessons
+ - 🧠 Self-learners asking "why/how" questions and getting full reports

+ ---
 
+ ## 🧠 Future Plans

+ - Integrate Hugging Face MCP for agent orchestration
+ - Add TTS narration for accessibility
+ - Generate downloadable PDF learning packs
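
The flow described under "How It Works" in the updated README (pick an input, explain, draw, quiz, compile a report) can be sketched as plain manual orchestration. The sketch below is an illustration under stated assumptions, not the Space's implementation: `ExplainerPackage`, `build_package`, and the four injected callables are hypothetical names, the model calls are replaced by stand-in functions, and feeding the explanation text into the diagram and quiz steps is a guess the README does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExplainerPackage:
    """Everything produced for one topic (hypothetical container)."""
    source_text: str
    explanation: str
    diagram_ref: str  # e.g. a file path or URL returned by the image step
    quiz: List[str]

    def to_markdown(self) -> str:
        """Compile the pieces into a simple Markdown learning report."""
        quiz_lines = "\n".join(
            f"{i}. {q}" for i, q in enumerate(self.quiz, start=1)
        )
        return (
            "# Learning Report\n\n"
            f"## Explanation\n\n{self.explanation}\n\n"
            f"## Diagram\n\n{self.diagram_ref}\n\n"
            f"## Quiz\n\n{quiz_lines}\n"
        )


def build_package(
    question: Optional[str],
    file_path: Optional[str],
    *,
    read_file: Callable[[str], str],        # stand-in for the Gemini Vision step
    explain: Callable[[str], str],          # stand-in for the Mistral explanation step
    draw: Callable[[str], str],             # stand-in for the Flux diagram step
    make_quiz: Callable[[str], List[str]],  # stand-in for the Mistral quiz step
) -> ExplainerPackage:
    """Run the steps in order; an uploaded file takes priority over a question."""
    if file_path is not None:
        source = read_file(file_path)
    elif question:
        source = question
    else:
        raise ValueError("Provide a question or a file.")
    explanation = explain(source)
    diagram_ref = draw(explanation)   # assumption: diagram is derived from the explanation
    quiz = make_quiz(explanation)     # assumption: quiz is derived from the explanation
    return ExplainerPackage(source, explanation, diagram_ref, quiz)


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without any API keys.
    package = build_package(
        "How do solar panels work?",
        None,
        read_file=lambda path: f"(text extracted from {path})",
        explain=lambda text: f"Explanation of: {text}",
        draw=lambda text: "diagram.png",
        make_quiz=lambda text: ["What does a solar cell convert light into?"],
    )
    print(package.to_markdown())
```

Passing the model steps in as callables keeps the orchestration testable offline and would let each stand-in be swapped for a real API client without changing the control flow.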