gardarjuto committed on
Commit 1d31670 · 1 Parent(s): 4da0dd8

major refactor

This view is limited to 50 files because the commit contains too many changes.
Files changed (50)
  1. DEPLOYMENT.md +66 -0
  2. Dockerfile_ +62 -0
  3. LICENSE +201 -0
  4. README.md +66 -6
  5. backend/app/__init__.py +1 -0
  6. backend/app/__pycache__/__init__.cpython-311.pyc +0 -0
  7. backend/app/__pycache__/__init__.cpython-312.pyc +0 -0
  8. backend/app/__pycache__/asgi.cpython-312.pyc +0 -0
  9. backend/app/api/__init__.py +3 -0
  10. backend/app/api/__pycache__/__init__.cpython-312.pyc +0 -0
  11. backend/app/api/__pycache__/router.cpython-312.pyc +0 -0
  12. backend/app/api/endpoints/__init__.py +3 -0
  13. backend/app/api/endpoints/__pycache__/__init__.cpython-312.pyc +0 -0
  14. backend/app/api/endpoints/__pycache__/leaderboard.cpython-312.pyc +0 -0
  15. backend/app/api/endpoints/leaderboard.py +49 -0
  16. backend/app/api/router.py +7 -0
  17. backend/app/asgi.py +111 -0
  18. backend/app/config/__init__.py +21 -0
  19. backend/app/config/__pycache__/__init__.cpython-311.pyc +0 -0
  20. backend/app/config/__pycache__/__init__.cpython-312.pyc +0 -0
  21. backend/app/config/__pycache__/hf_config.cpython-311.pyc +0 -0
  22. backend/app/config/__pycache__/hf_config.cpython-312.pyc +0 -0
  23. backend/app/config/hf_config.py +41 -0
  24. backend/app/core/__pycache__/cache.cpython-311.pyc +0 -0
  25. backend/app/core/__pycache__/cache.cpython-312.pyc +0 -0
  26. backend/app/core/__pycache__/fastapi_cache.cpython-312.pyc +0 -0
  27. backend/app/core/cache.py +33 -0
  28. backend/app/core/fastapi_cache.py +63 -0
  29. backend/app/services/__init__.py +3 -0
  30. backend/app/services/__pycache__/__init__.cpython-311.pyc +0 -0
  31. backend/app/services/__pycache__/__init__.cpython-312.pyc +0 -0
  32. backend/app/services/__pycache__/leaderboard.cpython-311.pyc +0 -0
  33. backend/app/services/__pycache__/leaderboard.cpython-312.pyc +0 -0
  34. backend/app/services/leaderboard.py +277 -0
  35. backend/original_src/__pycache__/about.cpython-310.pyc +0 -0
  36. backend/original_src/__pycache__/about.cpython-311.pyc +0 -0
  37. backend/original_src/__pycache__/about.cpython-312.pyc +0 -0
  38. backend/original_src/__pycache__/envs.cpython-310.pyc +0 -0
  39. backend/original_src/__pycache__/envs.cpython-312.pyc +0 -0
  40. backend/original_src/__pycache__/populate.cpython-310.pyc +0 -0
  41. backend/original_src/__pycache__/populate.cpython-311.pyc +0 -0
  42. backend/original_src/__pycache__/populate.cpython-312.pyc +0 -0
  43. backend/original_src/__pycache__/populate.cpython-38.pyc +0 -0
  44. backend/original_src/about.py +75 -0
  45. backend/original_src/display/__pycache__/css_html_js.cpython-310.pyc +0 -0
  46. backend/original_src/display/__pycache__/css_html_js.cpython-311.pyc +0 -0
  47. backend/original_src/display/__pycache__/css_html_js.cpython-312.pyc +0 -0
  48. backend/original_src/display/__pycache__/formatting.cpython-310.pyc +0 -0
  49. backend/original_src/display/__pycache__/formatting.cpython-311.pyc +0 -0
  50. backend/original_src/display/__pycache__/formatting.cpython-312.pyc +0 -0
DEPLOYMENT.md ADDED
@@ -0,0 +1,66 @@
+ # Deployment Guide for Hugging Face Spaces
+
+ This repository is structured for deployment on Hugging Face Spaces using Docker.
+
+ ## Repository Structure
+
+ ```
+ icelandic-llm-leaderboard-hf/
+ ├── README.md          # HF Spaces metadata and description
+ ├── Dockerfile         # Multi-stage Docker build
+ ├── LICENSE            # Apache 2.0 license
+ ├── .gitignore         # Git ignore patterns
+ ├── DEPLOYMENT.md      # This file
+ ├── backend/           # FastAPI backend
+ │   ├── app/           # Application code
+ │   │   ├── api/       # API routes
+ │   │   ├── config/    # Configuration
+ │   │   ├── core/      # Core functionality
+ │   │   └── services/  # Business logic
+ │   ├── original_src/  # Original Icelandic leaderboard logic
+ │   └── pyproject.toml # Python dependencies
+ └── frontend/          # React frontend
+     ├── build/         # Production build artifacts
+     ├── src/           # Source code
+     ├── public/        # Static assets
+     ├── package.json   # Node.js dependencies
+     └── server.js      # Express production server
+ ```
+
+ ## HF Spaces Configuration
+
+ The README.md contains the required HF Spaces metadata:
+ - SDK: docker
+ - OAuth: enabled for HF authentication
+ - Tags: leaderboard, icelandic, language evaluation
+ - License: Apache 2.0
+
+ ## Deployment Process
+
+ 1. **Upload to HF Spaces**: Upload this entire directory structure to your HF Space
+ 2. **Environment Variables**: Set HF_TOKEN in your Space settings if needed
+ 3. **Build**: HF Spaces will automatically build using the Dockerfile
+ 4. **Access**: Your leaderboard will be available at your Space URL
+
+ ## Architecture
+
+ - **Frontend**: React SPA served by Express on port 7860
+ - **Backend**: FastAPI server on port 7861
+ - **Proxy**: Express proxies `/api/*` requests to FastAPI
+ - **Data**: Pulls from HF repositories (mideind/icelandic-llm-leaderboard-*)
+
+ ## Key Features
+
+ - Real-time leaderboard with Icelandic benchmarks
+ - Interactive filtering and search
+ - Model comparison and pinning
+ - Responsive design with dark/light themes
+ - Automatic data synchronization from HF repositories
+
+ ## Environment Variables
+
+ - `HF_TOKEN`: Hugging Face API token (optional, can use HF OAuth)
+ - `PORT`: Frontend server port (default: 7860)
+ - `INTERNAL_API_PORT`: Backend server port (default: 7861)
+
+ The application will automatically use HF OAuth for authentication when deployed on HF Spaces.
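
The two-port layout described above (Express on `PORT=7860`, FastAPI on `INTERNAL_API_PORT=7861`, with `/api/*` proxied to the backend) can be smoke-tested from outside the container. The sketch below is illustrative only; the script name `check_ports.py` is hypothetical, and it assumes the documented default ports and the backend's `/health` endpoint.

```python
# check_ports.py - hypothetical smoke test for the two-server layout (assumes default ports)
import json
import os
import urllib.request

FRONTEND_PORT = int(os.environ.get("PORT", 7860))               # Express + React build
BACKEND_PORT = int(os.environ.get("INTERNAL_API_PORT", 7861))   # FastAPI

def fetch(url: str) -> str:
    """Return the response body of a simple GET request."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")

if __name__ == "__main__":
    # The backend exposes /health directly; the frontend proxies /api/* to it.
    backend_health = json.loads(fetch(f"http://localhost:{BACKEND_PORT}/health"))
    rows = json.loads(fetch(f"http://localhost:{FRONTEND_PORT}/api/leaderboard"))
    print("backend:", backend_health)
    print("rows via proxy:", len(rows))
```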
Dockerfile_ ADDED
@@ -0,0 +1,62 @@
+ # Build frontend
+ FROM node:18 as frontend-build
+ WORKDIR /app
+ COPY frontend/package*.json ./
+ RUN npm install
+ COPY frontend/ ./
+
+ RUN npm run build
+
+ # Build backend
+ FROM python:3.12-slim
+ WORKDIR /app
+
+ # Create non-root user
+ RUN useradd -m -u 1000 user
+
+ # Install poetry
+ RUN pip install poetry
+
+ # Create and configure cache directory
+ RUN mkdir -p /app/.cache && \
+     chown -R user:user /app
+
+ # Copy and install backend dependencies
+ COPY backend/pyproject.toml backend/poetry.lock* ./
+ RUN poetry config virtualenvs.create false \
+     && poetry install --no-interaction --no-ansi --no-root --only main
+
+ # Copy backend code
+ COPY backend/ .
+
+ # Install Node.js and npm
+ RUN apt-get update && apt-get install -y \
+     curl \
+     netcat-openbsd \
+     && curl -fsSL https://deb.nodesource.com/setup_18.x | bash - \
+     && apt-get install -y nodejs \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy frontend server and build
+ COPY --from=frontend-build /app/build ./frontend/build
+ COPY --from=frontend-build /app/package*.json ./frontend/
+ COPY --from=frontend-build /app/server.js ./frontend/
+
+ # Install frontend production dependencies
+ WORKDIR /app/frontend
+ RUN npm install --production
+ WORKDIR /app
+
+ # Environment variables
+ ENV HF_HOME=/app/.cache \
+     HF_DATASETS_CACHE=/app/.cache \
+     INTERNAL_API_PORT=7861 \
+     PORT=7860 \
+     NODE_ENV=production
+
+ # Note: HF_TOKEN should be provided at runtime, not build time
+ USER user
+ EXPOSE 7860
+
+ # Start both servers with wait-for
+ CMD ["sh", "-c", "uvicorn app.asgi:app --host 0.0.0.0 --port 7861 & while ! nc -z localhost 7861; do sleep 1; done && cd frontend && npm run serve"]
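
The CMD above starts uvicorn in the background, polls port 7861 with `nc` until the backend accepts connections, and only then launches the Express server. For reference, the same wait-for idea is sketched below in Python; this is illustrative only and is not what the image runs.

```python
# wait_for_port.py - rough Python equivalent of the `while ! nc -z ...` loop in the CMD above.
# Illustrative only; the image itself uses the shell one-liner, not this script.
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> None:
    """Block until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return  # port is accepting connections
        except OSError:
            time.sleep(1)  # backend not up yet, retry
    raise TimeoutError(f"{host}:{port} did not become reachable within {timeout}s")

if __name__ == "__main__":
    wait_for_port("localhost", 7861)  # FastAPI backend port from the ENV block above
    print("backend is up; frontend can start")
```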
LICENSE ADDED
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship covered by this License,
+ whether in Source or Object form, made available under the License,
+ as indicated by a copyright notice that is included in or attached
+ to the work. (Such copyright notice may also be included in a file
+ accompanying the work.)
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based upon (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and derivative works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control
+ systems, and issue tracking systems that are managed by, or on behalf
+ of, the Licensor for the purpose of discussing and improving the Work,
+ but excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to use, reproduce, modify, display, perform,
+ sublicense, and distribute the Work and in such Derivative Works
+ in Source and Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, trademark, patent,
+ attribution and other notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright notice and/or license for Your
+ own additions to the Work, and may provide additional or different
+ license terms and conditions for use, reproduction, or distribution
+ of Your additions to the Work, or for any such Derivative Works as a
+ whole, provided Your use, reproduction, and distribution of the
+ Work otherwise complies with the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Support. When redistributing the Work or
+ Derivative Works thereof, You may choose to offer, and charge a fee
+ for, acceptance of support, warranty, indemnity, or other liability
+ obligations and/or rights consistent with this License. However, in
+ accepting such obligations, You may act only on Your own behalf and
+ on Your sole responsibility, not on behalf of any other Contributor,
+ and only if You agree to indemnify, defend, and hold each Contributor
+ harmless for any liability incurred by, or claims asserted against,
+ such Contributor by reason of your accepting any such warranty or
+ support.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same page as the copyright notice for easier identification within
+ third-party archives.
+
+ Copyright [2024] [Mideind ehf.]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
README.md CHANGED
@@ -1,10 +1,70 @@
  ---
- title: Icelandic Llm Leaderboard Hf
- emoji: 😻
- colorFrom: green
- colorTo: indigo
+ title: Icelandic LLM Leaderboard
+ emoji: 🇮🇸
+ colorFrom: blue
+ colorTo: green
  sdk: docker
- pinned: false
+ hf_oauth: true
+ pinned: true
+ license: apache-2.0
+ tags:
+ - leaderboard
+ - modality:text
+ - submission:automatic
+ - test:public
+ - language:icelandic
+ - eval:language
+ short_description: Track, rank and evaluate LLMs on Icelandic language tasks
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Icelandic LLM Leaderboard 🇮🇸
+
+ A comprehensive leaderboard for evaluating Large Language Models (LLMs) on Icelandic language tasks. This leaderboard tracks model performance across various Icelandic benchmarks including WinoGrande-IS, GED, Inflection, Belebele-IS, ARC-Challenge-IS, and WikiQA-IS.
+
+ ## Features
+
+ - 📊 Interactive table with advanced sorting and filtering
+ - 🔍 Semantic model search with regex support
+ - 📌 Pin models for easy comparison
+ - 📱 Responsive and modern React interface
+ - 🎨 Dark/Light mode support
+ - ⚡️ Optimized performance with virtualization
+ - 🇮🇸 Specialized for Icelandic language evaluation
+
+ ## Benchmarks
+
+ ### Core Icelandic Tasks
+ - **WinoGrande-IS (3-shot)**: Icelandic common sense reasoning
+ - **GED**: Grammatical error detection in Icelandic
+ - **Inflection (1-shot)**: Icelandic morphological inflection
+ - **Belebele-IS**: Icelandic reading comprehension
+ - **ARC-Challenge-IS**: Icelandic science questions
+ - **WikiQA-IS**: Icelandic question answering
+
+ ## Architecture
+
+ The leaderboard uses a modern React frontend with a FastAPI backend, containerized with Docker for seamless deployment on Hugging Face Spaces.
+
+ ### Frontend (React)
+ - Material-UI components
+ - TanStack Table for advanced data handling
+ - Real-time filtering and search capabilities
+
+ ### Backend (FastAPI)
+ - Integration with Hugging Face repositories
+ - Automatic data synchronization
+ - RESTful API endpoints
+
+ ## Data Sources
+
+ The leaderboard pulls evaluation results from:
+ - **Results Repository**: `mideind/icelandic-llm-leaderboard-results`
+ - **Requests Repository**: `mideind/icelandic-llm-leaderboard-requests`
+
+ ## Contributing
+
+ To submit a model for evaluation, please follow the submission guidelines in the leaderboard interface.
+
+ ## License
+
+ Apache 2.0 License - see LICENSE file for details.
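
The data-source repositories listed in the README can be fetched directly, which mirrors what the backend service does when it needs data. The snippet below assumes the datasets are readable with the token (or publicly) and is only a sketch of that download step.

```python
# Pull the evaluation results the leaderboard is built from.
# Assumes the repository is public or that HF_TOKEN grants read access.
import os
from huggingface_hub import snapshot_download

local_results = snapshot_download(
    repo_id="mideind/icelandic-llm-leaderboard-results",
    repo_type="dataset",
    token=os.environ.get("HF_TOKEN"),  # optional for public data
)
print("results downloaded to:", local_results)
```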
backend/app/__init__.py ADDED
@@ -0,0 +1 @@
+ # Icelandic LLM Leaderboard Backend
backend/app/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (218 Bytes). View file
 
backend/app/__pycache__/__init__.cpython-312.pyc ADDED
Binary file (117 Bytes). View file
 
backend/app/__pycache__/asgi.cpython-312.pyc ADDED
Binary file (3.52 kB). View file
 
backend/app/api/__init__.py ADDED
@@ -0,0 +1,3 @@
+ from .router import router
+
+ __all__ = ["router"]
backend/app/api/__pycache__/__init__.cpython-312.pyc ADDED
Binary file (180 Bytes). View file
 
backend/app/api/__pycache__/router.cpython-312.pyc ADDED
Binary file (380 Bytes). View file
 
backend/app/api/endpoints/__init__.py ADDED
@@ -0,0 +1,3 @@
+ from .leaderboard import router as leaderboard_router
+
+ __all__ = ["leaderboard_router"]
backend/app/api/endpoints/__pycache__/__init__.cpython-312.pyc ADDED
Binary file (224 Bytes). View file
 
backend/app/api/endpoints/__pycache__/leaderboard.cpython-312.pyc ADDED
Binary file (3.12 kB). View file
 
backend/app/api/endpoints/leaderboard.py ADDED
@@ -0,0 +1,49 @@
+ from fastapi import APIRouter
+ from typing import List, Dict, Any
+ import logging
+
+ from app.services.leaderboard import IcelandicLeaderboardService
+ from app.core.fastapi_cache import cached, build_cache_key
+
+ logger = logging.getLogger(__name__)
+ router = APIRouter()
+ leaderboard_service = IcelandicLeaderboardService()
+
+ def leaderboard_key_builder(func, namespace: str = "icelandic_leaderboard", **kwargs):
+     """Build cache key for Icelandic leaderboard data"""
+     key_type = "raw" if func.__name__ == "get_leaderboard" else "formatted"
+     key = build_cache_key(namespace, key_type)
+     logger.debug(f"Built Icelandic leaderboard cache key: {key}")
+     return key
+
+ @router.get("")
+ @cached(expire=300, key_builder=leaderboard_key_builder)
+ async def get_leaderboard() -> List[Dict[str, Any]]:
+     """
+     Get raw Icelandic leaderboard data
+     Response will be automatically GZIP compressed if size > 500 bytes
+     """
+     try:
+         logger.info("Fetching raw Icelandic leaderboard data")
+         data = await leaderboard_service.fetch_raw_data()
+         logger.info(f"Retrieved {len(data)} Icelandic leaderboard entries")
+         return data
+     except Exception as e:
+         logger.error(f"Failed to fetch raw Icelandic leaderboard data: {e}")
+         raise
+
+ @router.get("/formatted")
+ @cached(expire=300, key_builder=leaderboard_key_builder)
+ async def get_formatted_leaderboard() -> List[Dict[str, Any]]:
+     """
+     Get formatted Icelandic leaderboard data with restructured objects
+     Response will be automatically GZIP compressed if size > 500 bytes
+     """
+     try:
+         logger.info("Fetching formatted Icelandic leaderboard data")
+         data = await leaderboard_service.get_formatted_data()
+         logger.info(f"Retrieved {len(data)} formatted Icelandic entries")
+         return data
+     except Exception as e:
+         logger.error(f"Failed to fetch formatted Icelandic leaderboard data: {e}")
+         raise
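
The two routes above serve the raw and the frontend-ready leaderboard payloads, each cached server-side for 300 seconds. A minimal client-side sketch is below, assuming the backend is running locally on port 7861 and mounted under `/api` as in `asgi.py`; it is not part of the repository.

```python
# Hypothetical client check of the two leaderboard endpoints (assumes a local backend on 7861).
import asyncio
import httpx

async def main() -> None:
    async with httpx.AsyncClient(base_url="http://localhost:7861/api") as client:
        raw = (await client.get("/leaderboard")).json()
        formatted = (await client.get("/leaderboard/formatted")).json()
        # A second call within the same 300 s window is answered from the in-memory cache.
        print(f"{len(raw)} raw rows, {len(formatted)} formatted rows")

asyncio.run(main())
```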
backend/app/api/router.py ADDED
@@ -0,0 +1,7 @@
+ from fastapi import APIRouter
+ from app.api.endpoints import leaderboard_router
+
+ router = APIRouter()
+
+ # Include all endpoint routers
+ router.include_router(leaderboard_router, prefix="/leaderboard", tags=["leaderboard"])
backend/app/asgi.py ADDED
@@ -0,0 +1,111 @@
+ """
+ ASGI entry point for the Icelandic LLM Leaderboard API.
+ """
+ import os
+ import logging
+ import logging.config
+ from fastapi import FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.middleware.gzip import GZipMiddleware
+
+ from app.api.router import router
+ from app.core.fastapi_cache import setup_cache
+ from app.config import hf_config
+
+ # Configure logging
+ LOGGING_CONFIG = {
+     "version": 1,
+     "disable_existing_loggers": True,
+     "formatters": {
+         "default": {
+             "format": "%(name)s - %(levelname)s - %(message)s",
+         }
+     },
+     "handlers": {
+         "default": {
+             "formatter": "default",
+             "class": "logging.StreamHandler",
+             "stream": "ext://sys.stdout",
+         }
+     },
+     "loggers": {
+         "uvicorn": {
+             "handlers": ["default"],
+             "level": "WARNING",
+             "propagate": False,
+         },
+         "uvicorn.error": {
+             "level": "WARNING",
+             "handlers": ["default"],
+             "propagate": False,
+         },
+         "uvicorn.access": {
+             "handlers": ["default"],
+             "level": "WARNING",
+             "propagate": False,
+         },
+         "app": {
+             "handlers": ["default"],
+             "level": "INFO",
+             "propagate": False,
+         }
+     },
+     "root": {
+         "handlers": ["default"],
+         "level": "INFO",
+     }
+ }
+
+ # Apply logging configuration
+ logging.config.dictConfig(LOGGING_CONFIG)
+ logger = logging.getLogger("app")
+
+ # Create FastAPI application
+ app = FastAPI(
+     title="Icelandic LLM Leaderboard",
+     version="1.0.0",
+     docs_url="/docs",
+ )
+
+ # Add CORS middleware
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ # Add GZIP compression
+ app.add_middleware(GZipMiddleware, minimum_size=500)
+
+ # Include API router
+ app.include_router(router, prefix="/api")
+
+ @app.on_event("startup")
+ async def startup_event():
+     """Initialize services on startup"""
+     logger.info("🇮🇸 ICELANDIC LLM LEADERBOARD STARTING UP")
+
+     # Log HF configuration
+     logger.info(f"Organization: {hf_config.HF_ORGANIZATION}")
+     logger.info(f"Token Status: {'Present' if hf_config.HF_TOKEN else 'Missing'}")
+     logger.info("Using repositories:")
+     logger.info(f"  - Queue: {hf_config.QUEUE_REPO}")
+     logger.info(f"  - Results: {hf_config.RESULTS_REPO}")
+
+     # Setup cache
+     setup_cache()
+     logger.info("FastAPI Cache initialized")
+
+     logger.info("🚀 Icelandic LLM Leaderboard ready!")
+
+ @app.get("/")
+ async def root():
+     """Root endpoint"""
+     return {"message": "Icelandic LLM Leaderboard API", "status": "running"}
+
+ @app.get("/health")
+ async def health_check():
+     """Health check endpoint"""
+     return {"status": "healthy"}
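
The ASGI module wires up logging, CORS, GZIP compression, and the `/api` router. For local development it can be served with uvicorn; the sketch below assumes `backend/` is the working directory so that `app.asgi` is importable, and the file name `run_dev.py` is hypothetical.

```python
# run_dev.py - one way to serve the ASGI app locally (assumes backend/ is the working directory).
import uvicorn

if __name__ == "__main__":
    # Mirrors the port used in the Dockerfile CMD; reload is for local development only.
    uvicorn.run("app.asgi:app", host="0.0.0.0", port=7861, reload=True)
```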
backend/app/config/__init__.py ADDED
@@ -0,0 +1,21 @@
+ from .hf_config import (
+     HF_TOKEN,
+     HF_ORGANIZATION,
+     REPO_ID,
+     QUEUE_REPO,
+     RESULTS_REPO,
+     EVAL_REQUESTS_PATH,
+     EVAL_RESULTS_PATH,
+     API
+ )
+
+ __all__ = [
+     "HF_TOKEN",
+     "HF_ORGANIZATION",
+     "REPO_ID",
+     "QUEUE_REPO",
+     "RESULTS_REPO",
+     "EVAL_REQUESTS_PATH",
+     "EVAL_RESULTS_PATH",
+     "API"
+ ]
backend/app/config/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (554 Bytes). View file
 
backend/app/config/__pycache__/__init__.cpython-312.pyc ADDED
Binary file (364 Bytes). View file
 
backend/app/config/__pycache__/hf_config.cpython-311.pyc ADDED
Binary file (1.95 kB). View file
 
backend/app/config/__pycache__/hf_config.cpython-312.pyc ADDED
Binary file (1.66 kB). View file
 
backend/app/config/hf_config.py ADDED
@@ -0,0 +1,41 @@
+ import os
+ from pathlib import Path
+ from huggingface_hub import HfApi
+
+ # Load environment variables from .env file
+ try:
+     from dotenv import load_dotenv
+     # Look for .env file in the project root
+     env_path = Path(__file__).parent.parent.parent / ".env"
+     if env_path.exists():
+         load_dotenv(env_path)
+         print(f"Loaded .env from: {env_path}")
+     else:
+         # Try loading from current directory
+         load_dotenv()
+         print("Loaded .env from current directory")
+ except ImportError:
+     print("python-dotenv not available, using system environment only")
+
+ # Configuration for Icelandic LLM Leaderboard
+ HF_TOKEN = os.environ.get("HF_TOKEN")
+ HF_ORGANIZATION = "mideind"
+
+ # Debug: Print token status (first 10 chars only for security)
+ if HF_TOKEN:
+     print(f"HF_TOKEN loaded: {HF_TOKEN[:10]}...")
+ else:
+     print("HF_TOKEN not found in environment")
+
+ # Repository configuration
+ REPO_ID = f"{HF_ORGANIZATION}/icelandic-llm-leaderboard"
+ QUEUE_REPO = f"{HF_ORGANIZATION}/icelandic-llm-leaderboard-requests"
+ RESULTS_REPO = f"{HF_ORGANIZATION}/icelandic-llm-leaderboard-results"
+
+ # Local cache paths
+ HF_HOME = os.getenv("HF_HOME", ".")
+ EVAL_REQUESTS_PATH = os.path.join(HF_HOME, "eval-queue")
+ EVAL_RESULTS_PATH = os.path.join(HF_HOME, "eval-results")
+
+ # API instance
+ API = HfApi(token=HF_TOKEN)
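
The module above resolves the repository names and builds a shared `HfApi` client from `HF_TOKEN`. A quick, hypothetical sanity check of that configuration is sketched below; the `whoami()` call only succeeds when a valid token is set, so treat it as an optional diagnostic rather than part of the application.

```python
# Hypothetical sanity check for the configuration above; requires a valid HF_TOKEN to succeed.
from app.config import API, QUEUE_REPO, RESULTS_REPO

print("requests repo:", QUEUE_REPO)   # mideind/icelandic-llm-leaderboard-requests
print("results repo:", RESULTS_REPO)  # mideind/icelandic-llm-leaderboard-results
print("authenticated as:", API.whoami()["name"])  # raises if the token is missing or invalid
```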
backend/app/core/__pycache__/cache.cpython-311.pyc ADDED
Binary file (2.46 kB). View file
 
backend/app/core/__pycache__/cache.cpython-312.pyc ADDED
Binary file (2.02 kB). View file
 
backend/app/core/__pycache__/fastapi_cache.cpython-312.pyc ADDED
Binary file (3.35 kB). View file
 
backend/app/core/cache.py ADDED
@@ -0,0 +1,33 @@
+ import os
+ from pathlib import Path
+ from datetime import timedelta
+
+ class CacheConfig:
+     def __init__(self):
+         self.base_path = Path(os.getenv("HF_HOME", "."))
+         self.cache_ttl = timedelta(minutes=5)  # 5 minute cache TTL
+
+     def get_cache_path(self, cache_type: str = "datasets") -> Path:
+         """Get cache path for different cache types"""
+         cache_path = self.base_path / "cache" / cache_type
+         cache_path.mkdir(parents=True, exist_ok=True)
+         return cache_path
+
+     def flush_cache(self, cache_type: str = None):
+         """Flush specific cache or all caches"""
+         if cache_type:
+             cache_path = self.get_cache_path(cache_type)
+             for file in cache_path.glob("*"):
+                 if file.is_file():
+                     file.unlink()
+         else:
+             cache_base = self.base_path / "cache"
+             if cache_base.exists():
+                 for cache_dir in cache_base.iterdir():
+                     if cache_dir.is_dir():
+                         for file in cache_dir.glob("*"):
+                             if file.is_file():
+                                 file.unlink()
+
+ # Global cache configuration
+ cache_config = CacheConfig()
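
`CacheConfig` derives per-type cache directories under `$HF_HOME/cache` and can flush them individually or all at once. A short usage sketch of the module-level `cache_config` instance:

```python
# Usage sketch for CacheConfig above; directories are created under $HF_HOME/cache on demand.
from app.core.cache import cache_config

datasets_dir = cache_config.get_cache_path("datasets")  # e.g. ./cache/datasets
results_dir = cache_config.get_cache_path("results")    # any cache_type string works
print(datasets_dir, results_dir, cache_config.cache_ttl)

cache_config.flush_cache("datasets")  # remove files for one cache type
cache_config.flush_cache()            # or flush every cache directory
```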
backend/app/core/fastapi_cache.py ADDED
@@ -0,0 +1,63 @@
+ import asyncio
+ import json
+ import hashlib
+ from typing import Any, Callable, Dict, Optional
+ from functools import wraps
+ import logging
+
+ logger = logging.getLogger(__name__)
+
+ # Simple in-memory cache
+ _cache: Dict[str, Any] = {}
+ _cache_lock = asyncio.Lock()
+
+ def build_cache_key(namespace: str, *args, **kwargs) -> str:
+     """Build a cache key from namespace and parameters"""
+     key_data = f"{namespace}:{args}:{sorted(kwargs.items())}"
+     return hashlib.md5(key_data.encode()).hexdigest()
+
+ def cached(expire: int = 300, key_builder: Optional[Callable] = None):
+     """
+     Cache decorator for FastAPI endpoints
+
+     Args:
+         expire: Cache expiration time in seconds
+         key_builder: Function to build cache key
+     """
+     def decorator(func: Callable) -> Callable:
+         @wraps(func)
+         async def wrapper(*args, **kwargs):
+             # Build cache key
+             if key_builder:
+                 cache_key = key_builder(func, **kwargs)
+             else:
+                 cache_key = build_cache_key(func.__name__, *args, **kwargs)
+
+             # Check cache
+             async with _cache_lock:
+                 if cache_key in _cache:
+                     cached_data, timestamp = _cache[cache_key]
+                     import time
+                     if time.time() - timestamp < expire:
+                         logger.debug(f"Cache hit for key: {cache_key}")
+                         return cached_data
+                     else:
+                         # Expired, remove from cache
+                         del _cache[cache_key]
+
+             # Cache miss, execute function
+             logger.debug(f"Cache miss for key: {cache_key}")
+             result = await func(*args, **kwargs)
+
+             # Store in cache
+             async with _cache_lock:
+                 import time
+                 _cache[cache_key] = (result, time.time())
+
+             return result
+         return wrapper
+     return decorator
+
+ def setup_cache():
+     """Setup cache configuration"""
+     logger.info("FastAPI cache initialized with in-memory backend")
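
The `@cached` decorator above keys results into a module-level dict guarded by an asyncio lock and re-runs the wrapped coroutine only after the expiry window has passed. A toy usage sketch (not part of the application code):

```python
# Toy usage of the @cached decorator above: the second call inside the expiry
# window returns the stored value instead of re-running the coroutine.
import asyncio
from app.core.fastapi_cache import cached

@cached(expire=10)
async def slow_lookup(name: str) -> str:
    await asyncio.sleep(1)  # pretend this is an expensive call
    return f"result for {name}"

async def main() -> None:
    first = await slow_lookup(name="mideind")   # cache miss, takes ~1 s
    second = await slow_lookup(name="mideind")  # cache hit, returns immediately
    assert first == second

asyncio.run(main())
```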
backend/app/services/__init__.py ADDED
@@ -0,0 +1,3 @@
+ from .leaderboard import IcelandicLeaderboardService
+
+ __all__ = ["IcelandicLeaderboardService"]
backend/app/services/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (330 Bytes). View file
 
backend/app/services/__pycache__/__init__.cpython-312.pyc ADDED
Binary file (220 Bytes). View file
 
backend/app/services/__pycache__/leaderboard.cpython-311.pyc ADDED
Binary file (12.3 kB). View file
 
backend/app/services/__pycache__/leaderboard.cpython-312.pyc ADDED
Binary file (11.3 kB). View file
 
backend/app/services/leaderboard.py ADDED
@@ -0,0 +1,277 @@
+ import os
+ import json
+ import logging
+ from typing import List, Dict, Any
+ from pathlib import Path
+ from huggingface_hub import snapshot_download
+ from fastapi import HTTPException
+
+ from app.config import (
+     QUEUE_REPO,
+     RESULTS_REPO,
+     EVAL_REQUESTS_PATH,
+     EVAL_RESULTS_PATH,
+     HF_TOKEN
+ )
+ from app.core.cache import cache_config
+
+ logger = logging.getLogger(__name__)
+
+ # Import original processing logic
+ import sys
+ import os
+
+ # Add the original Icelandic leaderboard source to Python path
+ original_src_path = os.path.join(os.path.dirname(__file__), '..', '..', 'original_src')
+ if original_src_path not in sys.path:
+     sys.path.insert(0, original_src_path)
+
+ # Also add the parent directory so imports like 'src.display.utils' work
+ backend_path = os.path.join(os.path.dirname(__file__), '..', '..')
+ if backend_path not in sys.path:
+     sys.path.insert(0, backend_path)
+
+ try:
+     from leaderboard.read_evals import get_raw_eval_results
+     from populate import get_leaderboard_df
+     from display.utils import COLS, BENCHMARK_COLS, Tasks
+ except ImportError as e:
+     # Fallback for development without mounted volume
+     logger.warning(f"Could not import original modules: {e}")
+     # Define minimal fallbacks
+     COLS = ["Model", "Average ⬆️", "Type", "Precision", "Architecture", "Hub License", "Hub ❤️", "#Params (B)", "Available on the hub", "Model sha"]
+     BENCHMARK_COLS = ["WinoGrande-IS (3-shot)", "GED", "Inflection (1-shot)", "Belebele (IS)", "ARC-Challenge-IS", "WikiQA-IS"]
+
+     class MockTask:
+         def __init__(self, name, col_name):
+             self.name = name
+             self.col_name = col_name
+
+     class Tasks:
+         task0 = MockTask("winogrande_is", "WinoGrande-IS (3-shot)")
+         task1 = MockTask("ged", "GED")
+         task2 = MockTask("inflection", "Inflection (1-shot)")
+         task5 = MockTask("belebele_is", "Belebele (IS)")
+         task6 = MockTask("arc_challenge_is", "ARC-Challenge-IS")
+         task7 = MockTask("wiki_qa_is", "WikiQA-IS")
+
+ class IcelandicLeaderboardService:
+     def __init__(self):
+         self.results_path = EVAL_RESULTS_PATH
+         self.requests_path = EVAL_REQUESTS_PATH
+
+     async def _ensure_data_available(self):
+         """Ensure evaluation data is available locally"""
+         try:
+             # Download results if not exists or empty
+             if not os.path.exists(self.results_path) or not os.listdir(self.results_path):
+                 logger.info(f"Downloading results to {self.results_path}")
+                 snapshot_download(
+                     repo_id=RESULTS_REPO,
+                     local_dir=self.results_path,
+                     repo_type="dataset",
+                     token=HF_TOKEN,
+                     tqdm_class=None,
+                     etag_timeout=30
+                 )
+
+             # Download requests if not exists or empty
+             if not os.path.exists(self.requests_path) or not os.listdir(self.requests_path):
+                 logger.info(f"Downloading requests to {self.requests_path}")
+                 snapshot_download(
+                     repo_id=QUEUE_REPO,
+                     local_dir=self.requests_path,
+                     repo_type="dataset",
+                     token=HF_TOKEN,
+                     tqdm_class=None,
+                     etag_timeout=30
+                 )
+
+         except Exception as e:
+             logger.error(f"Failed to download data: {e}")
+             raise HTTPException(status_code=500, detail=f"Failed to download data: {str(e)}")
+
+     async def fetch_raw_data(self) -> List[Dict[str, Any]]:
+         """Fetch raw leaderboard data using original Icelandic processing logic"""
+         try:
+             await self._ensure_data_available()
+
+             logger.info("Processing Icelandic leaderboard data")
+
+             # Try to use original processing logic if available
+             try:
+                 raw_data, df = get_leaderboard_df(
+                     self.results_path,
+                     self.requests_path,
+                     COLS,
+                     BENCHMARK_COLS
+                 )
+
+                 # Convert DataFrame to list of dictionaries
+                 data = df.to_dict('records')
+
+                 logger.info(f"Processed {len(data)} Icelandic leaderboard entries")
+                 return data
+
+             except NameError:
+                 # Fallback: return mock data for testing
+                 logger.warning("Using mock data - original processing modules not available")
+                 return self._generate_mock_data()
+
+         except Exception as e:
+             logger.error(f"Failed to fetch Icelandic leaderboard data: {e}")
+             raise HTTPException(status_code=500, detail=str(e))
+
+     def _generate_mock_data(self) -> List[Dict[str, Any]]:
+         """Generate mock data for testing when original modules aren't available"""
+         return [
+             {
+                 "Model": "test-model/icelandic-gpt-7b",
+                 "Average ⬆️": 85.5,
+                 "Type": "fine-tuned",
+                 "T": "🔶",
+                 "Precision": "bfloat16",
+                 "Architecture": "LlamaForCausalLM",
+                 "Hub License": "apache-2.0",
+                 "Hub ❤️": 42,
+                 "#Params (B)": 7.0,
+                 "Available on the hub": True,
+                 "Model sha": "abc123def456",
+                 "WinoGrande-IS (3-shot)": 78.5,
+                 "GED": 92.3,
+                 "Inflection (1-shot)": 85.1,
+                 "Belebele (IS)": 80.7,
+                 "ARC-Challenge-IS": 76.2,
+                 "WikiQA-IS": 89.4
+             },
+             {
+                 "Model": "test-model/icelandic-llama-13b",
+                 "Average ⬆️": 88.2,
+                 "Type": "instruction-tuned",
+                 "T": "⭕",
+                 "Precision": "float16",
+                 "Architecture": "LlamaForCausalLM",
+                 "Hub License": "mit",
+                 "Hub ❤️": 156,
+                 "#Params (B)": 13.0,
+                 "Available on the hub": True,
+                 "Model sha": "def456abc789",
+                 "WinoGrande-IS (3-shot)": 82.1,
+                 "GED": 94.8,
+                 "Inflection (1-shot)": 87.9,
+                 "Belebele (IS)": 85.3,
+                 "ARC-Challenge-IS": 79.8,
+                 "WikiQA-IS": 91.2
+             }
+         ]
+
+     async def get_formatted_data(self) -> List[Dict[str, Any]]:
+         """Get formatted leaderboard data compatible with React frontend"""
+         try:
+             raw_data = await self.fetch_raw_data()
+             formatted_data = []
+
+             for item in raw_data:
+                 try:
+                     formatted_item = await self.transform_data(item)
+                     formatted_data.append(formatted_item)
+                 except Exception as e:
+                     logger.error(f"Failed to format entry: {e}")
+                     continue
+
+             logger.info(f"Formatted {len(formatted_data)} entries for frontend")
+             return formatted_data
+
+         except Exception as e:
+             logger.error(f"Failed to format leaderboard data: {e}")
+             raise HTTPException(status_code=500, detail=str(e))
+
+     async def transform_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
+         """Transform Icelandic leaderboard data into format expected by React frontend"""
+
+         # Create unique ID and clean model name
+         raw_model_name = data.get("Model", "Unknown")
+
+         # Extract clean model name from HTML if present
+         if '<a target="_blank" href=' in raw_model_name:
+             # Parse HTML to extract clean model name
+             import re
+             match = re.search(r'>([^<]+)</a>', raw_model_name)
+             model_name = match.group(1) if match else raw_model_name
+         else:
+             model_name = raw_model_name
+
+         precision = data.get("Precision", "Unknown")
+         revision = data.get("Model sha", "Unknown")
+         unique_id = f"{model_name}_{precision}_{revision}"
+
+         # Map Icelandic tasks to evaluations format
+         evaluations = {}
+         task_mapping = {
+             "WinoGrande-IS (3-shot)": "winogrande_is",
+             "GED": "ged",
+             "Inflection (1-shot)": "inflection",
+             "Belebele (IS)": "belebele_is",
+             "ARC-Challenge-IS": "arc_challenge_is",
+             "WikiQA-IS": "wiki_qa_is"
+         }
+
+         for task_display_name, task_key in task_mapping.items():
+             if task_display_name in data:
+                 evaluations[task_key] = {
+                     "name": task_display_name,
+                     "value": data.get(task_display_name, 0),
+                     "normalized_score": data.get(task_display_name, 0)
+                 }
+
+         # Extract model type and clean it
+         model_type_symbol = data.get("T", "")
+         model_type_name = data.get("Type", "Unknown")
+
+         # Map Icelandic model types to frontend format
+         type_mapping = {
+             "pretrained": "pretrained",
+             "fine-tuned": "fine-tuned",
+             "instruction-tuned": "instruction-tuned",
+             "RL-tuned": "RL-tuned"
+         }
+
+         clean_model_type = type_mapping.get(model_type_name, model_type_name)
+
+         features = {
+             "is_not_available_on_hub": not data.get("Available on the hub", True),
+             "is_merged": False,  # Not tracked in Icelandic leaderboard
+             "is_moe": False,  # Not tracked in Icelandic leaderboard
+             "is_flagged": False,  # Not tracked in Icelandic leaderboard
+             "is_official_provider": False  # Not tracked in Icelandic leaderboard
+         }
+
+         metadata = {
+             "upload_date": None,  # Not available in Icelandic data
+             "submission_date": None,  # Not available in Icelandic data
+             "generation": None,  # Not available in Icelandic data
+             "base_model": None,  # Not available in Icelandic data
+             "hub_license": data.get("Hub License", ""),
+             "hub_hearts": data.get("Hub ❤️", 0),
+             "params_billions": data.get("#Params (B)", 0),
+             "co2_cost": 0  # Not tracked in Icelandic leaderboard
+         }
+
+         transformed_data = {
+             "id": unique_id,
+             "model": {
+                 "name": model_name,
+                 "sha": revision,
+                 "precision": precision,
+                 "type": clean_model_type,
+                 "weight_type": None,  # Not available in Icelandic data
+                 "architecture": data.get("Architecture", "Unknown"),
+                 "average_score": data.get("Average ⬆️", 0),
+                 "has_chat_template": False  # Not tracked in Icelandic leaderboard
+             },
+             "evaluations": evaluations,
+             "features": features,
+             "metadata": metadata
+         }
+
+         return transformed_data
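
`transform_data` reshapes one flat leaderboard row (keyed by display columns) into the nested `model`/`evaluations`/`features`/`metadata` object the React frontend consumes. The sketch below feeds it one row from the service's own mock generator, so no HF download is needed; it assumes the backend package is importable (e.g. run from `backend/`).

```python
# Sketch: reshape one mock row with transform_data to see the structure the frontend consumes.
import asyncio
from app.services.leaderboard import IcelandicLeaderboardService

async def main() -> None:
    service = IcelandicLeaderboardService()
    row = service._generate_mock_data()[0]        # flat dict keyed by display columns
    entry = await service.transform_data(row)     # nested: model / evaluations / features / metadata
    print(entry["model"]["name"], entry["model"]["average_score"])
    print(sorted(entry["evaluations"].keys()))

asyncio.run(main())
```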
backend/original_src/__pycache__/about.cpython-310.pyc ADDED
Binary file (4.26 kB). View file
 
backend/original_src/__pycache__/about.cpython-311.pyc ADDED
Binary file (4.76 kB). View file
 
backend/original_src/__pycache__/about.cpython-312.pyc ADDED
Binary file (4.49 kB). View file
 
backend/original_src/__pycache__/envs.cpython-310.pyc ADDED
Binary file (777 Bytes). View file
 
backend/original_src/__pycache__/envs.cpython-312.pyc ADDED
Binary file (1.19 kB). View file
 
backend/original_src/__pycache__/populate.cpython-310.pyc ADDED
Binary file (1.06 kB). View file
 
backend/original_src/__pycache__/populate.cpython-311.pyc ADDED
Binary file (1.6 kB). View file
 
backend/original_src/__pycache__/populate.cpython-312.pyc ADDED
Binary file (1.33 kB). View file
 
backend/original_src/__pycache__/populate.cpython-38.pyc ADDED
Binary file (1.04 kB). View file
 
backend/original_src/about.py ADDED
@@ -0,0 +1,75 @@
+ from dataclasses import dataclass
+ from enum import Enum
+
+ @dataclass
+ class Task:
+     benchmark: str
+     metric: str
+     col_name: str
+
+
+ # Select your tasks here
+ # ---------------------------------------------------
+ class Tasks(Enum):
+     # task_key in the json file, metric_key in the json file, name to display in the leaderboard
+     task0 = Task("icelandic_winogrande_stringmatch", "exact_match,get-answer", "WinoGrande-IS (3-shot)")
+     task1 = Task("icelandic_sentences_ged_stringmatch", "exact_match,get-answer", "GED")
+     task2 = Task("icelandic_inflection_all", "exact_match,get-answer", "Inflection (1-shot)")
+     task5 = Task("icelandic_belebele", "exact_match,get-answer", "Belebele (IS)")
+     task6 = Task("icelandic_arc_challenge", "exact_match,get-answer", "ARC-Challenge-IS")
+     task7 = Task("icelandic_wiki_qa", "lm_judge_score,get-answer", "WikiQA-IS")
+
+ # ---------------------------------------------------
+
+
+
+ # Your leaderboard name
+ TITLE = """<h1 align="center" id="space-title">Icelandic LLM leaderboard</h1>"""
+
+ # What does your leaderboard evaluate?
+ INTRODUCTION_TEXT = """
+ """
+
+ # Which evaluations are you running? how can people reproduce what you have?
+ LLM_BENCHMARKS_TEXT = f"""
+ ## New submissions
+ Do you want your model to be included on the leaderboard? Open a discussion on this repository with the details of your model and we will get back to you.
+
+ ## Benchmark tasks
+ The Icelandic LLM leaderboard evaluates models on several tasks. All of them are set up as generation tasks, where the model's output is compared to the expected output.
+ This means that models that have not been instruction fine-tuned might perform poorly on these tasks.
+
+ The following tasks are evaluated:
+
+ ### WinoGrande-IS
+ The Icelandic WinoGrande task is a human-translated and localized version of the ~1000 test set examples in the WinoGrande task in English.
+ Each example consists of a sentence with a blank, and two answer choices for the blank. The task is to choose the correct answer choice using coreference resolution.
+ The benchmark is designed to test the model's ability to use knowledge and common sense reasoning in Icelandic. For this benchmark, we use 3-shot evaluation.
+ The Icelandic WinoGrande dataset is described in more detail in the IceBERT paper (https://aclanthology.org/2022.lrec-1.464.pdf).
+ - Link to dataset: https://huggingface.co/datasets/mideind/icelandic-winogrande
+
+ ### GED
+ This is a benchmark for binary sentence-level Icelandic grammatical error detection, adapted from the Icelandic Error Corpus (IEC), and contains 200 examples.
+ Each example consists of a sentence that may contain one or more grammatical errors, and the task is to predict whether the sentence contains an error.
+ - Link to dataset: https://huggingface.co/datasets/mideind/icelandic-sentences-gec
+
+ ### Inflection benchmark
+ The inflection benchmark tests models' ability to generate inflected forms of 300 Icelandic adjective-noun pairs in all four cases, singular and plural.
+ - Link to dataset: https://huggingface.co/datasets/mideind/icelandic-inflection-all-flat
+
+ ### Belebele (IS)
+ This is the Icelandic subset (900 examples) of the Belebele benchmark, a multiple-choice reading comprehension task. The task is to answer questions about a given passage.
+ - Link to dataset: https://huggingface.co/datasets/facebook/belebele
+
+ ### ARC-Challenge-IS
+ A machine-translated version of the ARC-Challenge multiple-choice question-answering dataset. For this benchmark, we use the test set, which contains 1.23k examples.
+ - Link to dataset: https://huggingface.co/datasets/mideind/icelandic-arc-challenge
+
+ ### WikiQA-IS
+ The Icelandic WikiQA dataset is a collection of 1.9k question-answer pairs from the Icelandic Wikipedia, meant to evaluate models' knowledge of Icelandic culture and history.
+ They were collected by prompting GPT-4o to generate questions and answers
+ given Icelandic Wikipedia articles as context. All examples were then manually verified and corrected where necessary. For evaluation, we prompt GPT-4o to
+ compare the generated answer to the original answer for semantic similarity and rate the answer on the following scale: (0, "poor"), (1, "fair"), (2, "excellent").
+ - Link to dataset: https://huggingface.co/datasets/mideind/icelandic_wiki_qa
+ """
+
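
The `Tasks` enum above is the one place where benchmark keys, metric names, and display columns are tied together. A minimal sketch of deriving the display columns from it follows; the flat `from about import ...` path and the suggestion that `BENCHMARK_COLS` is built this way in the display utilities are assumptions, not confirmed by this diff.

```python
# Sketch: derive display columns and metric names from the Tasks enum above.
# Assumes original_src/ is on sys.path, as the backend service arranges at import time.
from about import Tasks

benchmark_cols = [task.value.col_name for task in Tasks]
metrics = {task.value.benchmark: task.value.metric for task in Tasks}

print(benchmark_cols)                 # ['WinoGrande-IS (3-shot)', 'GED', ..., 'WikiQA-IS']
print(metrics["icelandic_wiki_qa"])   # 'lm_judge_score,get-answer'
```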
backend/original_src/display/__pycache__/css_html_js.cpython-310.pyc ADDED
Binary file (1.94 kB). View file
 
backend/original_src/display/__pycache__/css_html_js.cpython-311.pyc ADDED
Binary file (1.96 kB). View file
 
backend/original_src/display/__pycache__/css_html_js.cpython-312.pyc ADDED
Binary file (1.95 kB). View file
 
backend/original_src/display/__pycache__/formatting.cpython-310.pyc ADDED
Binary file (1.44 kB). View file
 
backend/original_src/display/__pycache__/formatting.cpython-311.pyc ADDED
Binary file (2.01 kB). View file
 
backend/original_src/display/__pycache__/formatting.cpython-312.pyc ADDED
Binary file (1.8 kB). View file