Update README.md
README.md
@@ -12,53 +12,68 @@ pinned: true
-ExplainAnything.AI is an interactive multimodal science explainer built with Gradio. It allows users to:
-Ask a Question
-Type a science question (e.g., How do volcanoes erupt?)
-Choose level: Kid, Beginner, or Advanced

-You can upload a PDF, image, or diagram to get contextual explanations

-The app generates a clear explanation and a visual illustration using generative models

-Answer auto-generated multiple choice questions to reinforce learning

-Export everything to a PDF or Markdown file for later review

-Gradio (UI)
-Make sure to add the following API keys as secrets in your Space:

-GEMINI_API_KEY
-OPENROUTER_API_KEY
-HF_TOKEN
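
The removed setup note names three secrets. Hugging Face Spaces expose secrets to the running app as environment variables, so a startup check might look like the sketch below. `REQUIRED_KEYS` and `load_api_keys` are hypothetical names for illustration and are not taken from the app's code.

```python
import os

# Secret names listed in the README; the helper itself is a hypothetical sketch.
REQUIRED_KEYS = ("GEMINI_API_KEY", "OPENROUTER_API_KEY", "HF_TOKEN")


def load_api_keys(env=None):
    """Return the required keys, or raise an error naming every missing one."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Failing at startup with the full list of missing names is usually easier to debug than an authentication error on the first model call.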
+# ExplainAnything.AI

+**Track:** agent-demo-track

+**ExplainAnything.AI** is a multimodal agent that helps users understand any science topic, either by **asking a question** or **uploading an image or PDF**. It then builds an interactive explainer package with:
+- Easy-to-understand explanation (via Mistral)
+- Auto-generated visual diagram (via Flux)
+- Quiz questions to test understanding (via Mistral)
+- Downloadable report summarizing everything

+---
+
+## How It Works
+
+You have two options to get started:
+1. **Ask a question**, e.g. *"How do solar panels work?"*
+2. **Upload an image or PDF**, like a diagram or worksheet.
+
+The agent then:
+- Uses **Gemini Vision** (for image/PDF) or **your question** as input
+- Generates an explanation using **Mistral**
+- Creates a visual diagram with **Flux**
+- Generates quiz questions using **Mistral**
+- Compiles everything into a downloadable **learning report**
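
The five steps listed above can be sketched as a manually orchestrated pipeline. This is a minimal sketch and not the app's actual code: `LearningReport`, `build_report`, `to_markdown`, and the four injected callables are hypothetical stand-ins for the Gemini Vision, Mistral, and Flux calls. It also assumes that an upload takes precedence over a typed question and that the report is rendered as Markdown.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LearningReport:
    topic: str
    explanation: str
    diagram: str  # e.g. a file path or URL returned by the image step
    quiz: List[str]


def build_report(
    question: Optional[str],
    upload_path: Optional[str],
    *,
    read_upload: Callable[[str], str],      # hypothetical vision wrapper
    explain: Callable[[str], str],          # hypothetical explanation wrapper
    draw: Callable[[str], str],             # hypothetical diagram wrapper
    make_quiz: Callable[[str], List[str]],  # hypothetical quiz wrapper
) -> LearningReport:
    """Run the steps in order and collect their outputs."""
    if upload_path:
        topic = read_upload(upload_path)  # assumption: upload wins over question
    elif question and question.strip():
        topic = question.strip()
    else:
        raise ValueError("Provide a question or an uploaded file.")
    explanation = explain(topic)
    diagram = draw(topic)
    quiz = make_quiz(explanation)
    return LearningReport(topic, explanation, diagram, quiz)


def to_markdown(report: LearningReport) -> str:
    """Render the report as a small Markdown document."""
    lines = [
        f"# {report.topic}",
        "",
        report.explanation,
        "",
        f"![diagram]({report.diagram})",
        "",
        "## Quiz",
    ]
    lines += [f"{i}. {q}" for i, q in enumerate(report.quiz, start=1)]
    return "\n".join(lines)
```

Passing the model calls in as plain callables keeps the orchestration testable with stubs, which matters while the real services need API keys.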
+
+---

+## Build Status

+This Space is currently **building**, but all logic, tools, and functionality were submitted **before the deadline**.

+---

+## Video Overview

+**Video Overview:** [Coming soon; will be added post-deadline]

+*A short walkthrough of the app's flow and learning experience will be uploaded here.*

+---

+## Tech Stack

+- **Mistral**: science explanations + quiz generation
+- **Flux**: diagram/image generation
+- **Gemini Vision**: reading image and PDF content
+- **Gradio**: chat + upload interface
+- Manual orchestration (no MCP yet)

+---

+## Use Cases

+- Students learning STEM with visual + interactive help
+- Teachers turning textbook pages into visual lessons
+- Self-learners asking "why/how" questions and getting full reports

+---

+## Future Plans

+- Integrate Hugging Face MCP for agent orchestration
+- Add TTS narration for accessibility
+- Generate downloadable PDF learning packs