singarajusaiteja committed on
Commit a31d3a4 · verified · 1 Parent(s): 84e0286
Files changed (1)
  1. README.md +221 -0
README.md CHANGED
@@ -12,3 +12,224 @@ short_description: AI-powered platform for preserving Indian cultural heritage
  ---
 
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ # 🇮🇳 Corpus Collection Engine
+
+ ## Team Information
+ - **Team Name**: Heritage Collectors
+ - **Team Members**:
+   - Member 1: Singaraju Saiteja (Role: Streamlit app development)
+   - Member 2: Muthyapu Sudeepthi (Role: AI integration)
+   - Member 3: Rithika Sadhu (Role: Documentation)
+   - Member 4: Golla Bharath Kumar (Role: Development strategy)
+   - Member 5: K. Vamshi Kumar (Role: App design and user experience)
+
+ **AI-powered platform for preserving Indian cultural heritage through interactive data collection**
+
+ ## 📋 Setup & Installation
+
+ ### Prerequisites
+ - Python 3.8 or higher
+ - pip package manager
+ - Git (for cloning the repository)
+
+ ### Quick Start
+
+ 1. **Clone the Repository**
+    ```bash
+    git clone [repository-url]
+    cd corpus-collection-engine
+    ```
+
+ 2. **Create Virtual Environment**
+    ```bash
+    python -m venv venv
+
+    # On Windows
+    venv\Scripts\activate
+
+    # On macOS/Linux
+    source venv/bin/activate
+    ```
+
+ 3. **Install Dependencies**
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 4. **Run the Application**
+    ```bash
+    streamlit run corpus_collection_engine/main.py
+    ```
+
+ 5. **Access the App**
+    Open your browser and navigate to http://localhost:8501
+
+ ### Alternative Installation Methods
+
+ #### Using Docker
+ ```bash
+ docker build -t corpus-collection-engine .
+ docker run -p 8501:8501 corpus-collection-engine
+ ```
+
+ #### Using the Smart Installer
+ ```bash
+ python install_dependencies.py
+ python start_app.py
+ ```
+
+ ## 🌟 What is this?
+
+ The Corpus Collection Engine is a Streamlit application for collecting and preserving data about Indian languages, history, and culture. Through interactive activities, users contribute content that can help build culturally aware AI systems while preserving India's heritage.
+
+ ## 🎯 Features
+
+ ### 🎭 Interactive Cultural Activities
+ - **Meme Creator**: Generate culturally relevant memes in Indian languages
+ - **Recipe Collector**: Share traditional recipes with cultural context
+ - **Folklore Archive**: Preserve stories, legends, and oral traditions
+ - **Landmark Identifier**: Document historical and cultural landmarks
+
+ ### 🌍 Multi-language Support
+ - Hindi, Bengali, Tamil, Telugu, Marathi, Gujarati, Kannada, Malayalam, Punjabi, Odia, Assamese
+ - Native script support and cultural context preservation
+
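The eleven languages listed above can be represented as a small lookup table. The sketch below is illustrative only: the `SUPPORTED_LANGUAGES` mapping and the `normalize_language` helper are hypothetical names, not taken from the project's code. The keys are the standard ISO 639-1 codes for each language.

```python
# Hypothetical registry of the 11 languages named in this README,
# keyed by ISO 639-1 code. The project's real data model may differ.
SUPPORTED_LANGUAGES = {
    "hi": "Hindi",
    "bn": "Bengali",
    "ta": "Tamil",
    "te": "Telugu",
    "mr": "Marathi",
    "gu": "Gujarati",
    "kn": "Kannada",
    "ml": "Malayalam",
    "pa": "Punjabi",
    "or": "Odia",
    "as": "Assamese",
}


def normalize_language(value: str) -> str:
    """Return the ISO 639-1 code for a language code or English name.

    Raises ValueError for anything outside the supported set.
    """
    candidate = value.strip().lower()
    if candidate in SUPPORTED_LANGUAGES:
        return candidate
    for code, name in SUPPORTED_LANGUAGES.items():
        if name.lower() == candidate:
            return code
    raise ValueError(f"Unsupported language: {value!r}")
```

Accepting either a code or a name keeps a language picker forgiving about user input while storing one canonical value per contribution.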
+ ### 📊 Real-time Analytics
+ - Contribution tracking and cultural impact metrics
+ - Language diversity and regional distribution analysis
+ - User engagement and platform growth insights
+
+ ### 🔒 Privacy-First Design
+ - No authentication required - start contributing immediately
+ - Minimal data collection with full transparency
+ - User-controlled privacy settings
+
+ ## 🚀 How to Use
+
+ 1. **Choose an Activity**: Select from meme creation, recipe sharing, folklore collection, or landmark documentation
+ 2. **Select Your Language**: Pick from 11 supported Indian languages
+ 3. **Contribute Content**: Share your cultural knowledge and creativity
+ 4. **Add Context**: Provide cultural significance and regional information
+ 5. **Submit**: Your contribution helps build culturally-aware AI!
+
+ ## 🎨 Activities Overview
+
+ ### 🎭 Meme Creator
+ Create humorous content that reflects Indian culture, festivals, traditions, and daily life. Perfect for capturing contemporary cultural expressions.
+
+ ### 🍛 Recipe Collector
+ Share traditional family recipes, regional specialties, and festival foods. Include cultural significance, occasions, and regional variations.
+
+ ### 📚 Folklore Archive
+ Preserve oral traditions, folk tales, legends, and cultural stories. Help maintain the rich narrative heritage of India.
+
+ ### 🏛️ Landmark Identifier
+ Document historical sites, cultural landmarks, and places of significance. Share stories and cultural importance of locations.
+
+ ## 🛠️ Technical Architecture
+
+ ### Built With
+ - **Frontend**: Streamlit with custom components
+ - **Backend**: Python with modular service architecture
+ - **AI Integration**: Fallback text generation for public deployment
+ - **Storage**: SQLite for local development, extensible for production
+ - **Analytics**: Real-time metrics and reporting
+ - **PWA**: Progressive Web App features for offline access
+
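As an illustration of the SQLite storage layer listed above, here is a minimal sketch using Python's standard `sqlite3` module. The `contributions` table, its columns, and the `open_store`, `save_contribution`, and `count_by_language` helpers are assumptions made for this example; the project's actual schema is not shown in this README.

```python
import sqlite3

# Hypothetical schema: one row per user contribution.
SCHEMA = """
CREATE TABLE IF NOT EXISTS contributions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    activity TEXT NOT NULL,
    language TEXT NOT NULL,
    content TEXT NOT NULL,
    region TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_contribution(conn, activity, language, content, region=None) -> int:
    """Insert one contribution and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO contributions (activity, language, content, region) "
            "VALUES (?, ?, ?, ?)",
            (activity, language, content, region),
        )
    return cur.lastrowid


def count_by_language(conn) -> dict:
    """Return a {language: number_of_contributions} mapping."""
    rows = conn.execute(
        "SELECT language, COUNT(*) FROM contributions GROUP BY language"
    )
    return dict(rows.fetchall())
```

Parameterized queries (`?` placeholders) keep user-submitted text from being interpreted as SQL, which matters for a platform that accepts free-form contributions.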
+ ### Project Structure
+ ```
+ corpus_collection_engine/
+ ├── main.py              # Application entry point
+ ├── config.py            # Configuration settings
+ ├── activities/          # Activity implementations
+ │   ├── meme_creator.py
+ │   ├── recipe_collector.py
+ │   ├── folklore_collector.py
+ │   └── landmark_identifier.py
+ ├── services/            # Core services
+ │   ├── ai_service.py
+ │   ├── analytics_service.py
+ │   ├── engagement_service.py
+ │   └── privacy_service.py
+ ├── models/              # Data models
+ ├── utils/               # Utility functions
+ └── pwa/                 # Progressive Web App files
+ ```
+
+ ## 🧪 Testing
+
+ Run the test suite:
+ ```bash
+ python -m pytest tests/
+ ```
+
+ Run specific tests:
+ ```bash
+ python test_app_startup.py
+ ```
+
+ ## 🚀 Deployment
+
+ ### Hugging Face Spaces
+ 1. Upload files to your Hugging Face Space
+ 2. Use `app.py` as the entry point
+ 3. Ensure `requirements.txt` and `.streamlit/config.toml` are included
+
+ ### Local Production
+ ```bash
+ streamlit run corpus_collection_engine/main.py --server.port 8501
+ ```
+
+ ## 🤝 Contributing
+
+ We welcome contributions! Please see CONTRIBUTING.md for guidelines.
+
+ ## 📝 License
+
+ This project is licensed under the MIT License - see the LICENSE file for details.
+
+ ## 🌟 Why Contribute?
+
+ - **Preserve Culture**: Help maintain India's diverse cultural heritage for future generations
+ - **Build Better AI**: Contribute to creating more culturally-aware and inclusive AI systems
+ - **Share Knowledge**: Connect with others who value cultural preservation
+ - **Make Impact**: See real-time analytics of your cultural preservation impact
+
+ ## 📈 Platform Impact
+
+ Track the collective impact of cultural preservation efforts:
+ - Total contributions across all languages
+ - Geographic distribution of cultural content
+ - Language diversity metrics
+ - Cultural significance scoring
+
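One way to make "language diversity metrics" concrete is normalized Shannon entropy over the per-language contribution counts. This is only a sketch of one possible metric, not necessarily what the platform computes; the `language_diversity` function and its input format (a list of language codes, one per contribution) are assumptions.

```python
import math
from collections import Counter


def language_diversity(languages):
    """Return the normalized Shannon entropy, in [0, 1], of language codes.

    0.0 means every contribution is in a single language (or there are none);
    1.0 means contributions are spread evenly across the languages present.
    """
    counts = Counter(languages)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    # Divide by the maximum possible entropy for this many languages.
    return entropy / math.log(len(counts))
```

Normalizing by the maximum entropy keeps the score comparable as the number of languages with at least one contribution grows.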
+ ## 🔧 Development
+
+ ### Environment Setup
+ ```bash
+ # Install development dependencies
+ pip install -r requirements-dev.txt
+
+ # Run linting
+ flake8 corpus_collection_engine/
+
+ # Run type checking
+ mypy corpus_collection_engine/
+ ```
+
+ ### Configuration
+ - Copy `.env.example` to `.env` and configure your settings
+ - Modify `corpus_collection_engine/config.py` for application settings
+
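To illustrate the `.env` step above, the following sketch parses simple `KEY=VALUE` lines with the standard library only. The `load_env_file` helper is hypothetical and is not the project's own loader; the keys the project actually expects are whatever `.env.example` defines.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into a dict.

    Blank lines and lines starting with '#' are skipped. Variables already
    set in the process environment take precedence over file values.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop optional surrounding quotes around the value.
        value = value.strip().strip('"').strip("'")
        values[key] = os.environ.get(key, value)
    return values
```

For example, `load_env_file().get("DATABASE_PATH", "corpus.db")` would read a setting named `DATABASE_PATH` (an invented name used only for illustration) and fall back to a default when it is absent.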
+ ## 📞 Support
+
+ - **Issues**: Report bugs and request features via GitHub Issues
+ - **Documentation**: Check our comprehensive guides in the docs folder
+ - **Community**: Join our discussions via GitHub Discussions
+
+ ---
+
+ **Start preserving Indian culture today! 🇮🇳✨**
+
+ *Every contribution matters in building a more culturally-aware digital future.*