Commit 1d31670
Parent(s): 4da0dd8
major refactor

Changed files (this view is limited to 50 files because the commit contains too many changes):
- DEPLOYMENT.md +66 -0
- Dockerfile_ +62 -0
- LICENSE +201 -0
- README.md +66 -6
- backend/app/__init__.py +1 -0
- backend/app/__pycache__/__init__.cpython-311.pyc +0 -0
- backend/app/__pycache__/__init__.cpython-312.pyc +0 -0
- backend/app/__pycache__/asgi.cpython-312.pyc +0 -0
- backend/app/api/__init__.py +3 -0
- backend/app/api/__pycache__/__init__.cpython-312.pyc +0 -0
- backend/app/api/__pycache__/router.cpython-312.pyc +0 -0
- backend/app/api/endpoints/__init__.py +3 -0
- backend/app/api/endpoints/__pycache__/__init__.cpython-312.pyc +0 -0
- backend/app/api/endpoints/__pycache__/leaderboard.cpython-312.pyc +0 -0
- backend/app/api/endpoints/leaderboard.py +49 -0
- backend/app/api/router.py +7 -0
- backend/app/asgi.py +111 -0
- backend/app/config/__init__.py +21 -0
- backend/app/config/__pycache__/__init__.cpython-311.pyc +0 -0
- backend/app/config/__pycache__/__init__.cpython-312.pyc +0 -0
- backend/app/config/__pycache__/hf_config.cpython-311.pyc +0 -0
- backend/app/config/__pycache__/hf_config.cpython-312.pyc +0 -0
- backend/app/config/hf_config.py +41 -0
- backend/app/core/__pycache__/cache.cpython-311.pyc +0 -0
- backend/app/core/__pycache__/cache.cpython-312.pyc +0 -0
- backend/app/core/__pycache__/fastapi_cache.cpython-312.pyc +0 -0
- backend/app/core/cache.py +33 -0
- backend/app/core/fastapi_cache.py +63 -0
- backend/app/services/__init__.py +3 -0
- backend/app/services/__pycache__/__init__.cpython-311.pyc +0 -0
- backend/app/services/__pycache__/__init__.cpython-312.pyc +0 -0
- backend/app/services/__pycache__/leaderboard.cpython-311.pyc +0 -0
- backend/app/services/__pycache__/leaderboard.cpython-312.pyc +0 -0
- backend/app/services/leaderboard.py +277 -0
- backend/original_src/__pycache__/about.cpython-310.pyc +0 -0
- backend/original_src/__pycache__/about.cpython-311.pyc +0 -0
- backend/original_src/__pycache__/about.cpython-312.pyc +0 -0
- backend/original_src/__pycache__/envs.cpython-310.pyc +0 -0
- backend/original_src/__pycache__/envs.cpython-312.pyc +0 -0
- backend/original_src/__pycache__/populate.cpython-310.pyc +0 -0
- backend/original_src/__pycache__/populate.cpython-311.pyc +0 -0
- backend/original_src/__pycache__/populate.cpython-312.pyc +0 -0
- backend/original_src/__pycache__/populate.cpython-38.pyc +0 -0
- backend/original_src/about.py +75 -0
- backend/original_src/display/__pycache__/css_html_js.cpython-310.pyc +0 -0
- backend/original_src/display/__pycache__/css_html_js.cpython-311.pyc +0 -0
- backend/original_src/display/__pycache__/css_html_js.cpython-312.pyc +0 -0
- backend/original_src/display/__pycache__/formatting.cpython-310.pyc +0 -0
- backend/original_src/display/__pycache__/formatting.cpython-311.pyc +0 -0
- backend/original_src/display/__pycache__/formatting.cpython-312.pyc +0 -0
DEPLOYMENT.md
ADDED
@@ -0,0 +1,66 @@
# Deployment Guide for Hugging Face Spaces

This repository is structured for deployment on Hugging Face Spaces using Docker.

## Repository Structure

```
icelandic-llm-leaderboard-hf/
├── README.md           # HF Spaces metadata and description
├── Dockerfile          # Multi-stage Docker build
├── LICENSE             # Apache 2.0 license
├── .gitignore          # Git ignore patterns
├── DEPLOYMENT.md       # This file
├── backend/            # FastAPI backend
│   ├── app/            # Application code
│   │   ├── api/        # API routes
│   │   ├── config/     # Configuration
│   │   ├── core/       # Core functionality
│   │   └── services/   # Business logic
│   ├── original_src/   # Original Icelandic leaderboard logic
│   └── pyproject.toml  # Python dependencies
└── frontend/           # React frontend
    ├── build/          # Production build artifacts
    ├── src/            # Source code
    ├── public/         # Static assets
    ├── package.json    # Node.js dependencies
    └── server.js       # Express production server
```

## HF Spaces Configuration

The README.md contains the required HF Spaces metadata:
- SDK: docker
- OAuth: enabled for HF authentication
- Tags: leaderboard, icelandic, language evaluation
- License: Apache 2.0

## Deployment Process

1. **Upload to HF Spaces**: Upload this entire directory structure to your HF Space
2. **Environment Variables**: Set HF_TOKEN in your Space settings if needed
3. **Build**: HF Spaces will automatically build using the Dockerfile
4. **Access**: Your leaderboard will be available at your Space URL

## Architecture

- **Frontend**: React SPA served by Express on port 7860
- **Backend**: FastAPI server on port 7861
- **Proxy**: Express proxies `/api/*` requests to FastAPI
- **Data**: Pulls from HF repositories (mideind/icelandic-llm-leaderboard-*)

## Key Features

- Real-time leaderboard with Icelandic benchmarks
- Interactive filtering and search
- Model comparison and pinning
- Responsive design with dark/light themes
- Automatic data synchronization from HF repositories

## Environment Variables

- `HF_TOKEN`: Hugging Face API token (optional, can use HF OAuth)
- `PORT`: Frontend server port (default: 7860)
- `INTERNAL_API_PORT`: Backend server port (default: 7861)

The application will automatically use HF OAuth for authentication when deployed on HF Spaces.
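As a complement to step 1 of the deployment process, the same directory structure can also be pushed programmatically with `huggingface_hub`. A minimal sketch, assuming a hypothetical Space id under your own account; `create_repo` and `upload_folder` are standard `huggingface_hub` calls, everything else here is illustrative:

```python
from huggingface_hub import HfApi

api = HfApi(token="hf_...")  # placeholder token; or rely on `huggingface-cli login`

# Create the Space once, using the Docker SDK declared in the README metadata.
api.create_repo(
    repo_id="your-username/icelandic-llm-leaderboard",  # hypothetical Space id
    repo_type="space",
    space_sdk="docker",
    exist_ok=True,
)

# Push the whole repository structure described above.
api.upload_folder(
    folder_path=".",
    repo_id="your-username/icelandic-llm-leaderboard",
    repo_type="space",
    ignore_patterns=["**/__pycache__/**", "frontend/node_modules/**"],
)
```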
Dockerfile_
ADDED
@@ -0,0 +1,62 @@
# Build frontend
FROM node:18 as frontend-build
WORKDIR /app
COPY frontend/package*.json ./
RUN npm install
COPY frontend/ ./

RUN npm run build

# Build backend
FROM python:3.12-slim
WORKDIR /app

# Create non-root user
RUN useradd -m -u 1000 user

# Install poetry
RUN pip install poetry

# Create and configure cache directory
RUN mkdir -p /app/.cache && \
    chown -R user:user /app

# Copy and install backend dependencies
COPY backend/pyproject.toml backend/poetry.lock* ./
RUN poetry config virtualenvs.create false \
    && poetry install --no-interaction --no-ansi --no-root --only main

# Copy backend code
COPY backend/ .

# Install Node.js and npm
RUN apt-get update && apt-get install -y \
    curl \
    netcat-openbsd \
    && curl -fsSL https://deb.nodesource.com/setup_18.x | bash - \
    && apt-get install -y nodejs \
    && rm -rf /var/lib/apt/lists/*

# Copy frontend server and build
COPY --from=frontend-build /app/build ./frontend/build
COPY --from=frontend-build /app/package*.json ./frontend/
COPY --from=frontend-build /app/server.js ./frontend/

# Install frontend production dependencies
WORKDIR /app/frontend
RUN npm install --production
WORKDIR /app

# Environment variables
ENV HF_HOME=/app/.cache \
    HF_DATASETS_CACHE=/app/.cache \
    INTERNAL_API_PORT=7861 \
    PORT=7860 \
    NODE_ENV=production

# Note: HF_TOKEN should be provided at runtime, not build time
USER user
EXPOSE 7860

# Start both servers with wait-for
CMD ["sh", "-c", "uvicorn app.asgi:app --host 0.0.0.0 --port 7861 & while ! nc -z localhost 7861; do sleep 1; done && cd frontend && npm run serve"]
LICENSE
ADDED
@@ -0,0 +1,201 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship covered by this License,
      whether in Source or Object form, made available under the License,
      as indicated by a copyright notice that is included in or attached
      to the work. (Such copyright notice may also be included in a file
      accompanying the work.)

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based upon (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and derivative works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control
      systems, and issue tracking systems that are managed by, or on behalf
      of, the Licensor for the purpose of discussing and improving the Work,
      but excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to use, reproduce, modify, display, perform,
      sublicense, and distribute the Work and in such Derivative Works
      in Source and Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, trademark, patent,
          attribution and other notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright notice and/or license for Your
      own additions to the Work, and may provide additional or different
      license terms and conditions for use, reproduction, or distribution
      of Your additions to the Work, or for any such Derivative Works as a
      whole, provided Your use, reproduction, and distribution of the
      Work otherwise complies with the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Support. When redistributing the Work or
      Derivative Works thereof, You may choose to offer, and charge a fee
      for, acceptance of support, warranty, indemnity, or other liability
      obligations and/or rights consistent with this License. However, in
      accepting such obligations, You may act only on Your own behalf and
      on Your sole responsibility, not on behalf of any other Contributor,
      and only if You agree to indemnify, defend, and hold each Contributor
      harmless for any liability incurred by, or claims asserted against,
      such Contributor by reason of your accepting any such warranty or
      support.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same page as the copyright notice for easier identification within
      third-party archives.

   Copyright [2024] [Mideind ehf.]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
README.md
CHANGED
@@ -1,10 +1,70 @@
Previous front matter (removed):

---
title: Icelandic
emoji:
colorFrom:
colorTo:
sdk: docker
---

Updated README.md:

---
title: Icelandic LLM Leaderboard
emoji: 🇮🇸
colorFrom: blue
colorTo: green
sdk: docker
hf_oauth: true
pinned: true
license: apache-2.0
tags:
- leaderboard
- modality:text
- submission:automatic
- test:public
- language:icelandic
- eval:language
short_description: Track, rank and evaluate LLMs on Icelandic language tasks
---

# Icelandic LLM Leaderboard 🇮🇸

A comprehensive leaderboard for evaluating Large Language Models (LLMs) on Icelandic language tasks. This leaderboard tracks model performance across various Icelandic benchmarks including WinoGrande-IS, GED, Inflection, Belebele-IS, ARC-Challenge-IS, and WikiQA-IS.

## Features

- 📊 Interactive table with advanced sorting and filtering
- 🔍 Semantic model search with regex support
- 📌 Pin models for easy comparison
- 📱 Responsive and modern React interface
- 🎨 Dark/Light mode support
- ⚡️ Optimized performance with virtualization
- 🇮🇸 Specialized for Icelandic language evaluation

## Benchmarks

### Core Icelandic Tasks
- **WinoGrande-IS (3-shot)**: Icelandic common sense reasoning
- **GED**: Grammatical error detection in Icelandic
- **Inflection (1-shot)**: Icelandic morphological inflection
- **Belebele-IS**: Icelandic reading comprehension
- **ARC-Challenge-IS**: Icelandic science questions
- **WikiQA-IS**: Icelandic question answering

## Architecture

The leaderboard uses a modern React frontend with a FastAPI backend, containerized with Docker for seamless deployment on Hugging Face Spaces.

### Frontend (React)
- Material-UI components
- TanStack Table for advanced data handling
- Real-time filtering and search capabilities

### Backend (FastAPI)
- Integration with Hugging Face repositories
- Automatic data synchronization
- RESTful API endpoints

## Data Sources

The leaderboard pulls evaluation results from:
- **Results Repository**: `mideind/icelandic-llm-leaderboard-results`
- **Requests Repository**: `mideind/icelandic-llm-leaderboard-requests`

## Contributing

To submit a model for evaluation, please follow the submission guidelines in the leaderboard interface.

## License

Apache 2.0 License - see LICENSE file for details.
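For reference, the backend exposes the leaderboard data over plain REST, so it can be consumed without the React UI. A minimal sketch using `requests`; the base URL is an assumption (locally, the Express server on port 7860 proxies `/api/*` to FastAPI), and the field names follow the formatted-response structure produced by the backend service:

```python
import requests

BASE_URL = "http://localhost:7860"  # or your Space URL

# Raw rows as produced by the original leaderboard processing logic.
raw = requests.get(f"{BASE_URL}/api/leaderboard", timeout=30).json()

# Restructured objects intended for the React frontend.
formatted = requests.get(f"{BASE_URL}/api/leaderboard/formatted", timeout=30).json()

for entry in formatted:
    model = entry["model"]
    print(f'{model["name"]}: average {model["average_score"]}')
```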
backend/app/__init__.py
ADDED
@@ -0,0 +1 @@
# Icelandic LLM Leaderboard Backend
backend/app/__pycache__/__init__.cpython-311.pyc
ADDED
Binary file (218 Bytes).

backend/app/__pycache__/__init__.cpython-312.pyc
ADDED
Binary file (117 Bytes).

backend/app/__pycache__/asgi.cpython-312.pyc
ADDED
Binary file (3.52 kB).
backend/app/api/__init__.py
ADDED
@@ -0,0 +1,3 @@
from .router import router

__all__ = ["router"]
backend/app/api/__pycache__/__init__.cpython-312.pyc
ADDED
Binary file (180 Bytes).

backend/app/api/__pycache__/router.cpython-312.pyc
ADDED
Binary file (380 Bytes).
backend/app/api/endpoints/__init__.py
ADDED
@@ -0,0 +1,3 @@
from .leaderboard import router as leaderboard_router

__all__ = ["leaderboard_router"]
backend/app/api/endpoints/__pycache__/__init__.cpython-312.pyc
ADDED
Binary file (224 Bytes).

backend/app/api/endpoints/__pycache__/leaderboard.cpython-312.pyc
ADDED
Binary file (3.12 kB).
backend/app/api/endpoints/leaderboard.py
ADDED
@@ -0,0 +1,49 @@
from fastapi import APIRouter
from typing import List, Dict, Any
import logging

from app.services.leaderboard import IcelandicLeaderboardService
from app.core.fastapi_cache import cached, build_cache_key

logger = logging.getLogger(__name__)
router = APIRouter()
leaderboard_service = IcelandicLeaderboardService()

def leaderboard_key_builder(func, namespace: str = "icelandic_leaderboard", **kwargs):
    """Build cache key for Icelandic leaderboard data"""
    key_type = "raw" if func.__name__ == "get_leaderboard" else "formatted"
    key = build_cache_key(namespace, key_type)
    logger.debug(f"Built Icelandic leaderboard cache key: {key}")
    return key

@router.get("")
@cached(expire=300, key_builder=leaderboard_key_builder)
async def get_leaderboard() -> List[Dict[str, Any]]:
    """
    Get raw Icelandic leaderboard data.
    Response will be automatically GZIP compressed if size > 500 bytes.
    """
    try:
        logger.info("Fetching raw Icelandic leaderboard data")
        data = await leaderboard_service.fetch_raw_data()
        logger.info(f"Retrieved {len(data)} Icelandic leaderboard entries")
        return data
    except Exception as e:
        logger.error(f"Failed to fetch raw Icelandic leaderboard data: {e}")
        raise

@router.get("/formatted")
@cached(expire=300, key_builder=leaderboard_key_builder)
async def get_formatted_leaderboard() -> List[Dict[str, Any]]:
    """
    Get formatted Icelandic leaderboard data with restructured objects.
    Response will be automatically GZIP compressed if size > 500 bytes.
    """
    try:
        logger.info("Fetching formatted Icelandic leaderboard data")
        data = await leaderboard_service.get_formatted_data()
        logger.info(f"Retrieved {len(data)} formatted Icelandic entries")
        return data
    except Exception as e:
        logger.error(f"Failed to fetch formatted Icelandic leaderboard data: {e}")
        raise
backend/app/api/router.py
ADDED
@@ -0,0 +1,7 @@
from fastapi import APIRouter
from app.api.endpoints import leaderboard_router

router = APIRouter()

# Include all endpoint routers
router.include_router(leaderboard_router, prefix="/leaderboard", tags=["leaderboard"])
backend/app/asgi.py
ADDED
@@ -0,0 +1,111 @@
"""
ASGI entry point for the Icelandic LLM Leaderboard API.
"""
import os
import logging
import logging.config
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.gzip import GZipMiddleware

from app.api.router import router
from app.core.fastapi_cache import setup_cache
from app.config import hf_config

# Configure logging
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": True,
    "formatters": {
        "default": {
            "format": "%(name)s - %(levelname)s - %(message)s",
        }
    },
    "handlers": {
        "default": {
            "formatter": "default",
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",
        }
    },
    "loggers": {
        "uvicorn": {
            "handlers": ["default"],
            "level": "WARNING",
            "propagate": False,
        },
        "uvicorn.error": {
            "level": "WARNING",
            "handlers": ["default"],
            "propagate": False,
        },
        "uvicorn.access": {
            "handlers": ["default"],
            "level": "WARNING",
            "propagate": False,
        },
        "app": {
            "handlers": ["default"],
            "level": "INFO",
            "propagate": False,
        }
    },
    "root": {
        "handlers": ["default"],
        "level": "INFO",
    }
}

# Apply logging configuration
logging.config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger("app")

# Create FastAPI application
app = FastAPI(
    title="Icelandic LLM Leaderboard",
    version="1.0.0",
    docs_url="/docs",
)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Add GZIP compression
app.add_middleware(GZipMiddleware, minimum_size=500)

# Include API router
app.include_router(router, prefix="/api")

@app.on_event("startup")
async def startup_event():
    """Initialize services on startup"""
    logger.info("🇮🇸 ICELANDIC LLM LEADERBOARD STARTING UP")

    # Log HF configuration
    logger.info(f"Organization: {hf_config.HF_ORGANIZATION}")
    logger.info(f"Token Status: {'Present' if hf_config.HF_TOKEN else 'Missing'}")
    logger.info("Using repositories:")
    logger.info(f"  - Queue: {hf_config.QUEUE_REPO}")
    logger.info(f"  - Results: {hf_config.RESULTS_REPO}")

    # Setup cache
    setup_cache()
    logger.info("FastAPI Cache initialized")

    logger.info("🚀 Icelandic LLM Leaderboard ready!")

@app.get("/")
async def root():
    """Root endpoint"""
    return {"message": "Icelandic LLM Leaderboard API", "status": "running"}

@app.get("/health")
async def health_check():
    """Health check endpoint"""
    return {"status": "healthy"}
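For local development, the same ASGI app can be started outside Docker. A minimal sketch mirroring the container's `CMD`; the helper script name is hypothetical, and it assumes `uvicorn` is installed and the script is run from the `backend/` directory:

```python
# run_backend.py (hypothetical helper, not part of this commit)
import os
import uvicorn

if __name__ == "__main__":
    # Same port resolution as the Docker image: INTERNAL_API_PORT, default 7861.
    port = int(os.getenv("INTERNAL_API_PORT", "7861"))
    uvicorn.run("app.asgi:app", host="0.0.0.0", port=port)
```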
backend/app/config/__init__.py
ADDED
@@ -0,0 +1,21 @@
from .hf_config import (
    HF_TOKEN,
    HF_ORGANIZATION,
    REPO_ID,
    QUEUE_REPO,
    RESULTS_REPO,
    EVAL_REQUESTS_PATH,
    EVAL_RESULTS_PATH,
    API
)

__all__ = [
    "HF_TOKEN",
    "HF_ORGANIZATION",
    "REPO_ID",
    "QUEUE_REPO",
    "RESULTS_REPO",
    "EVAL_REQUESTS_PATH",
    "EVAL_RESULTS_PATH",
    "API"
]
backend/app/config/__pycache__/__init__.cpython-311.pyc
ADDED
Binary file (554 Bytes).

backend/app/config/__pycache__/__init__.cpython-312.pyc
ADDED
Binary file (364 Bytes).

backend/app/config/__pycache__/hf_config.cpython-311.pyc
ADDED
Binary file (1.95 kB).

backend/app/config/__pycache__/hf_config.cpython-312.pyc
ADDED
Binary file (1.66 kB).
backend/app/config/hf_config.py
ADDED
@@ -0,0 +1,41 @@
import os
from pathlib import Path
from huggingface_hub import HfApi

# Load environment variables from .env file
try:
    from dotenv import load_dotenv
    # Look for .env file in the project root
    env_path = Path(__file__).parent.parent.parent / ".env"
    if env_path.exists():
        load_dotenv(env_path)
        print(f"Loaded .env from: {env_path}")
    else:
        # Try loading from current directory
        load_dotenv()
        print("Loaded .env from current directory")
except ImportError:
    print("python-dotenv not available, using system environment only")

# Configuration for Icelandic LLM Leaderboard
HF_TOKEN = os.environ.get("HF_TOKEN")
HF_ORGANIZATION = "mideind"

# Debug: Print token status (first 10 chars only for security)
if HF_TOKEN:
    print(f"HF_TOKEN loaded: {HF_TOKEN[:10]}...")
else:
    print("HF_TOKEN not found in environment")

# Repository configuration
REPO_ID = f"{HF_ORGANIZATION}/icelandic-llm-leaderboard"
QUEUE_REPO = f"{HF_ORGANIZATION}/icelandic-llm-leaderboard-requests"
RESULTS_REPO = f"{HF_ORGANIZATION}/icelandic-llm-leaderboard-results"

# Local cache paths
HF_HOME = os.getenv("HF_HOME", ".")
EVAL_REQUESTS_PATH = os.path.join(HF_HOME, "eval-queue")
EVAL_RESULTS_PATH = os.path.join(HF_HOME, "eval-results")

# API instance
API = HfApi(token=HF_TOKEN)
backend/app/core/__pycache__/cache.cpython-311.pyc
ADDED
Binary file (2.46 kB).

backend/app/core/__pycache__/cache.cpython-312.pyc
ADDED
Binary file (2.02 kB).

backend/app/core/__pycache__/fastapi_cache.cpython-312.pyc
ADDED
Binary file (3.35 kB).
backend/app/core/cache.py
ADDED
@@ -0,0 +1,33 @@
import os
from pathlib import Path
from datetime import timedelta

class CacheConfig:
    def __init__(self):
        self.base_path = Path(os.getenv("HF_HOME", "."))
        self.cache_ttl = timedelta(minutes=5)  # 5 minute cache TTL

    def get_cache_path(self, cache_type: str = "datasets") -> Path:
        """Get cache path for different cache types"""
        cache_path = self.base_path / "cache" / cache_type
        cache_path.mkdir(parents=True, exist_ok=True)
        return cache_path

    def flush_cache(self, cache_type: str = None):
        """Flush specific cache or all caches"""
        if cache_type:
            cache_path = self.get_cache_path(cache_type)
            for file in cache_path.glob("*"):
                if file.is_file():
                    file.unlink()
        else:
            cache_base = self.base_path / "cache"
            if cache_base.exists():
                for cache_dir in cache_base.iterdir():
                    if cache_dir.is_dir():
                        for file in cache_dir.glob("*"):
                            if file.is_file():
                                file.unlink()

# Global cache configuration
cache_config = CacheConfig()
backend/app/core/fastapi_cache.py
ADDED
@@ -0,0 +1,63 @@
import asyncio
import json
import hashlib
from typing import Any, Callable, Dict, Optional
from functools import wraps
import logging

logger = logging.getLogger(__name__)

# Simple in-memory cache
_cache: Dict[str, Any] = {}
_cache_lock = asyncio.Lock()

def build_cache_key(namespace: str, *args, **kwargs) -> str:
    """Build a cache key from namespace and parameters"""
    key_data = f"{namespace}:{args}:{sorted(kwargs.items())}"
    return hashlib.md5(key_data.encode()).hexdigest()

def cached(expire: int = 300, key_builder: Optional[Callable] = None):
    """
    Cache decorator for FastAPI endpoints

    Args:
        expire: Cache expiration time in seconds
        key_builder: Function to build cache key
    """
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Build cache key
            if key_builder:
                cache_key = key_builder(func, **kwargs)
            else:
                cache_key = build_cache_key(func.__name__, *args, **kwargs)

            # Check cache
            async with _cache_lock:
                if cache_key in _cache:
                    cached_data, timestamp = _cache[cache_key]
                    import time
                    if time.time() - timestamp < expire:
                        logger.debug(f"Cache hit for key: {cache_key}")
                        return cached_data
                    else:
                        # Expired, remove from cache
                        del _cache[cache_key]

            # Cache miss, execute function
            logger.debug(f"Cache miss for key: {cache_key}")
            result = await func(*args, **kwargs)

            # Store in cache
            async with _cache_lock:
                import time
                _cache[cache_key] = (result, time.time())

            return result
        return wrapper
    return decorator

def setup_cache():
    """Setup cache configuration"""
    logger.info("FastAPI cache initialized with in-memory backend")
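A short usage sketch of the `cached` decorator above applied to a standalone async function; the function name and values are illustrative only and not part of the repository:

```python
import asyncio
from app.core.fastapi_cache import cached

@cached(expire=60)
async def slow_lookup(model_name: str) -> dict:
    # Stand-in for an expensive call; a second invocation within 60 seconds
    # is answered from the in-memory cache instead of re-running this body.
    await asyncio.sleep(1)
    return {"model": model_name, "score": 42.0}

async def main():
    print(await slow_lookup("test-model/icelandic-gpt-7b"))  # cache miss, ~1 s
    print(await slow_lookup("test-model/icelandic-gpt-7b"))  # cache hit, immediate

asyncio.run(main())
```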
backend/app/services/__init__.py
ADDED
@@ -0,0 +1,3 @@
from .leaderboard import IcelandicLeaderboardService

__all__ = ["IcelandicLeaderboardService"]
backend/app/services/__pycache__/__init__.cpython-311.pyc
ADDED
Binary file (330 Bytes).

backend/app/services/__pycache__/__init__.cpython-312.pyc
ADDED
Binary file (220 Bytes).

backend/app/services/__pycache__/leaderboard.cpython-311.pyc
ADDED
Binary file (12.3 kB).

backend/app/services/__pycache__/leaderboard.cpython-312.pyc
ADDED
Binary file (11.3 kB).
backend/app/services/leaderboard.py
ADDED
@@ -0,0 +1,277 @@
import os
import json
import logging
from typing import List, Dict, Any
from pathlib import Path
from huggingface_hub import snapshot_download
from fastapi import HTTPException

from app.config import (
    QUEUE_REPO,
    RESULTS_REPO,
    EVAL_REQUESTS_PATH,
    EVAL_RESULTS_PATH,
    HF_TOKEN
)
from app.core.cache import cache_config

logger = logging.getLogger(__name__)

# Import original processing logic
import sys
import os

# Add the original Icelandic leaderboard source to Python path
original_src_path = os.path.join(os.path.dirname(__file__), '..', '..', 'original_src')
if original_src_path not in sys.path:
    sys.path.insert(0, original_src_path)

# Also add the parent directory so imports like 'src.display.utils' work
backend_path = os.path.join(os.path.dirname(__file__), '..', '..')
if backend_path not in sys.path:
    sys.path.insert(0, backend_path)

try:
    from leaderboard.read_evals import get_raw_eval_results
    from populate import get_leaderboard_df
    from display.utils import COLS, BENCHMARK_COLS, Tasks
except ImportError as e:
    # Fallback for development without mounted volume
    logger.warning(f"Could not import original modules: {e}")
    # Define minimal fallbacks
    COLS = ["Model", "Average ⬆️", "Type", "Precision", "Architecture", "Hub License", "Hub ❤️", "#Params (B)", "Available on the hub", "Model sha"]
    BENCHMARK_COLS = ["WinoGrande-IS (3-shot)", "GED", "Inflection (1-shot)", "Belebele (IS)", "ARC-Challenge-IS", "WikiQA-IS"]

    class MockTask:
        def __init__(self, name, col_name):
            self.name = name
            self.col_name = col_name

    class Tasks:
        task0 = MockTask("winogrande_is", "WinoGrande-IS (3-shot)")
        task1 = MockTask("ged", "GED")
        task2 = MockTask("inflection", "Inflection (1-shot)")
        task5 = MockTask("belebele_is", "Belebele (IS)")
        task6 = MockTask("arc_challenge_is", "ARC-Challenge-IS")
        task7 = MockTask("wiki_qa_is", "WikiQA-IS")

class IcelandicLeaderboardService:
    def __init__(self):
        self.results_path = EVAL_RESULTS_PATH
        self.requests_path = EVAL_REQUESTS_PATH

    async def _ensure_data_available(self):
        """Ensure evaluation data is available locally"""
        try:
            # Download results if not exists or empty
            if not os.path.exists(self.results_path) or not os.listdir(self.results_path):
                logger.info(f"Downloading results to {self.results_path}")
                snapshot_download(
                    repo_id=RESULTS_REPO,
                    local_dir=self.results_path,
                    repo_type="dataset",
                    token=HF_TOKEN,
                    tqdm_class=None,
                    etag_timeout=30
                )

            # Download requests if not exists or empty
            if not os.path.exists(self.requests_path) or not os.listdir(self.requests_path):
                logger.info(f"Downloading requests to {self.requests_path}")
                snapshot_download(
                    repo_id=QUEUE_REPO,
                    local_dir=self.requests_path,
                    repo_type="dataset",
                    token=HF_TOKEN,
                    tqdm_class=None,
                    etag_timeout=30
                )

        except Exception as e:
            logger.error(f"Failed to download data: {e}")
            raise HTTPException(status_code=500, detail=f"Failed to download data: {str(e)}")

    async def fetch_raw_data(self) -> List[Dict[str, Any]]:
        """Fetch raw leaderboard data using original Icelandic processing logic"""
        try:
            await self._ensure_data_available()

            logger.info("Processing Icelandic leaderboard data")

            # Try to use original processing logic if available
            try:
                raw_data, df = get_leaderboard_df(
                    self.results_path,
                    self.requests_path,
                    COLS,
                    BENCHMARK_COLS
                )

                # Convert DataFrame to list of dictionaries
                data = df.to_dict('records')

                logger.info(f"Processed {len(data)} Icelandic leaderboard entries")
                return data

            except NameError:
                # Fallback: return mock data for testing
                logger.warning("Using mock data - original processing modules not available")
                return self._generate_mock_data()

        except Exception as e:
            logger.error(f"Failed to fetch Icelandic leaderboard data: {e}")
            raise HTTPException(status_code=500, detail=str(e))

    def _generate_mock_data(self) -> List[Dict[str, Any]]:
        """Generate mock data for testing when original modules aren't available"""
        return [
            {
                "Model": "test-model/icelandic-gpt-7b",
                "Average ⬆️": 85.5,
                "Type": "fine-tuned",
                "T": "🔶",
                "Precision": "bfloat16",
                "Architecture": "LlamaForCausalLM",
                "Hub License": "apache-2.0",
                "Hub ❤️": 42,
                "#Params (B)": 7.0,
                "Available on the hub": True,
                "Model sha": "abc123def456",
                "WinoGrande-IS (3-shot)": 78.5,
                "GED": 92.3,
                "Inflection (1-shot)": 85.1,
                "Belebele (IS)": 80.7,
                "ARC-Challenge-IS": 76.2,
                "WikiQA-IS": 89.4
            },
            {
                "Model": "test-model/icelandic-llama-13b",
                "Average ⬆️": 88.2,
                "Type": "instruction-tuned",
                "T": "⭕",
                "Precision": "float16",
                "Architecture": "LlamaForCausalLM",
                "Hub License": "mit",
                "Hub ❤️": 156,
                "#Params (B)": 13.0,
                "Available on the hub": True,
                "Model sha": "def456abc789",
                "WinoGrande-IS (3-shot)": 82.1,
                "GED": 94.8,
                "Inflection (1-shot)": 87.9,
                "Belebele (IS)": 85.3,
                "ARC-Challenge-IS": 79.8,
                "WikiQA-IS": 91.2
            }
        ]

    async def get_formatted_data(self) -> List[Dict[str, Any]]:
        """Get formatted leaderboard data compatible with React frontend"""
        try:
            raw_data = await self.fetch_raw_data()
            formatted_data = []

            for item in raw_data:
                try:
                    formatted_item = await self.transform_data(item)
                    formatted_data.append(formatted_item)
                except Exception as e:
                    logger.error(f"Failed to format entry: {e}")
                    continue

            logger.info(f"Formatted {len(formatted_data)} entries for frontend")
            return formatted_data

        except Exception as e:
            logger.error(f"Failed to format leaderboard data: {e}")
            raise HTTPException(status_code=500, detail=str(e))

    async def transform_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
        """Transform Icelandic leaderboard data into format expected by React frontend"""

        # Create unique ID and clean model name
        raw_model_name = data.get("Model", "Unknown")

        # Extract clean model name from HTML if present
        if '<a target="_blank" href=' in raw_model_name:
            # Parse HTML to extract clean model name
            import re
            match = re.search(r'>([^<]+)</a>', raw_model_name)
            model_name = match.group(1) if match else raw_model_name
        else:
            model_name = raw_model_name

        precision = data.get("Precision", "Unknown")
        revision = data.get("Model sha", "Unknown")
        unique_id = f"{model_name}_{precision}_{revision}"

        # Map Icelandic tasks to evaluations format
        evaluations = {}
        task_mapping = {
            "WinoGrande-IS (3-shot)": "winogrande_is",
            "GED": "ged",
            "Inflection (1-shot)": "inflection",
            "Belebele (IS)": "belebele_is",
            "ARC-Challenge-IS": "arc_challenge_is",
            "WikiQA-IS": "wiki_qa_is"
        }

        for task_display_name, task_key in task_mapping.items():
            if task_display_name in data:
                evaluations[task_key] = {
                    "name": task_display_name,
                    "value": data.get(task_display_name, 0),
                    "normalized_score": data.get(task_display_name, 0)
                }

        # Extract model type and clean it
        model_type_symbol = data.get("T", "")
        model_type_name = data.get("Type", "Unknown")

        # Map Icelandic model types to frontend format
        type_mapping = {
            "pretrained": "pretrained",
            "fine-tuned": "fine-tuned",
            "instruction-tuned": "instruction-tuned",
            "RL-tuned": "RL-tuned"
        }

        clean_model_type = type_mapping.get(model_type_name, model_type_name)

        features = {
            "is_not_available_on_hub": not data.get("Available on the hub", True),
            "is_merged": False,  # Not tracked in Icelandic leaderboard
            "is_moe": False,  # Not tracked in Icelandic leaderboard
            "is_flagged": False,  # Not tracked in Icelandic leaderboard
            "is_official_provider": False  # Not tracked in Icelandic leaderboard
        }

        metadata = {
            "upload_date": None,  # Not available in Icelandic data
            "submission_date": None,  # Not available in Icelandic data
            "generation": None,  # Not available in Icelandic data
            "base_model": None,  # Not available in Icelandic data
            "hub_license": data.get("Hub License", ""),
            "hub_hearts": data.get("Hub ❤️", 0),
            "params_billions": data.get("#Params (B)", 0),
            "co2_cost": 0  # Not tracked in Icelandic leaderboard
        }

        transformed_data = {
            "id": unique_id,
            "model": {
                "name": model_name,
                "sha": revision,
                "precision": precision,
                "type": clean_model_type,
                "weight_type": None,  # Not available in Icelandic data
                "architecture": data.get("Architecture", "Unknown"),
                "average_score": data.get("Average ⬆️", 0),
                "has_chat_template": False  # Not tracked in Icelandic leaderboard
            },
            "evaluations": evaluations,
            "features": features,
            "metadata": metadata
        }

        return transformed_data
backend/original_src/__pycache__/about.cpython-310.pyc
ADDED
Binary file (4.26 kB).

backend/original_src/__pycache__/about.cpython-311.pyc
ADDED
Binary file (4.76 kB).

backend/original_src/__pycache__/about.cpython-312.pyc
ADDED
Binary file (4.49 kB).

backend/original_src/__pycache__/envs.cpython-310.pyc
ADDED
Binary file (777 Bytes).

backend/original_src/__pycache__/envs.cpython-312.pyc
ADDED
Binary file (1.19 kB).

backend/original_src/__pycache__/populate.cpython-310.pyc
ADDED
Binary file (1.06 kB).

backend/original_src/__pycache__/populate.cpython-311.pyc
ADDED
Binary file (1.6 kB).

backend/original_src/__pycache__/populate.cpython-312.pyc
ADDED
Binary file (1.33 kB).

backend/original_src/__pycache__/populate.cpython-38.pyc
ADDED
Binary file (1.04 kB).
backend/original_src/about.py
ADDED
@@ -0,0 +1,75 @@
from dataclasses import dataclass
from enum import Enum

@dataclass
class Task:
    benchmark: str
    metric: str
    col_name: str


# Select your tasks here
# ---------------------------------------------------
class Tasks(Enum):
    # task_key in the json file, metric_key in the json file, name to display in the leaderboard
    task0 = Task("icelandic_winogrande_stringmatch", "exact_match,get-answer", "WinoGrande-IS (3-shot)")
    task1 = Task("icelandic_sentences_ged_stringmatch", "exact_match,get-answer", "GED")
    task2 = Task("icelandic_inflection_all", "exact_match,get-answer", "Inflection (1-shot)")
    task5 = Task("icelandic_belebele", "exact_match,get-answer", "Belebele (IS)")
    task6 = Task("icelandic_arc_challenge", "exact_match,get-answer", "ARC-Challenge-IS")
    task7 = Task("icelandic_wiki_qa", "lm_judge_score,get-answer", "WikiQA-IS")

# ---------------------------------------------------


# Your leaderboard name
TITLE = """<h1 align="center" id="space-title">Icelandic LLM leaderboard</h1>"""

# What does your leaderboard evaluate?
INTRODUCTION_TEXT = """
"""

# Which evaluations are you running? How can people reproduce what you have?
LLM_BENCHMARKS_TEXT = f"""
## New submissions
Do you want your model to be included on the leaderboard? Open a discussion on this repository with the details of your model and we will get back to you.

## Benchmark tasks
The Icelandic LLM leaderboard evaluates models on several tasks. All of them are set up as generation tasks, where the model's output is compared to the expected output.
This means that models that have not been instruction fine-tuned might perform poorly on these tasks.

The following tasks are evaluated:

### WinoGrande-IS
The Icelandic WinoGrande task is a human-translated and localized version of the ~1000 test set examples in the WinoGrande task in English.
Each example consists of a sentence with a blank, and two answer choices for the blank. The task is to choose the correct answer choice using coreference resolution.
The benchmark is designed to test the model's ability to use knowledge and common sense reasoning in Icelandic. For this benchmark, we use 3-shot evaluation.
The Icelandic WinoGrande dataset is described in more detail in the IceBERT paper (https://aclanthology.org/2022.lrec-1.464.pdf).
- Link to dataset: https://huggingface.co/datasets/mideind/icelandic-winogrande

### GED
This is a benchmark for binary sentence-level Icelandic grammatical error detection, adapted from the Icelandic Error Corpus (IEC), and contains 200 examples.
Each example consists of a sentence that may contain one or more grammatical errors, and the task is to predict whether the sentence contains an error.
- Link to dataset: https://huggingface.co/datasets/mideind/icelandic-sentences-gec

### Inflection benchmark
The inflection benchmark tests models' ability to generate inflected forms of 300 Icelandic adjective-noun pairs for all four cases, singular and plural.
- Link to dataset: https://huggingface.co/datasets/mideind/icelandic-inflection-all-flat

### Belebele (IS)
This is the Icelandic subset (900 examples) of the Belebele benchmark, a multiple-choice reading comprehension task. The task is to answer questions about a given passage.
- Link to dataset: https://huggingface.co/datasets/facebook/belebele

### ARC-Challenge-IS
A machine-translated version of the ARC-Challenge multiple-choice question-answering dataset. For this benchmark, we use the test set, which contains 1.23k examples.
- Link to dataset: https://huggingface.co/datasets/mideind/icelandic-arc-challenge

### WikiQA-IS
The Icelandic WikiQA dataset is a collection of 1.9k question-answer pairs from the Icelandic Wikipedia, meant to evaluate models' knowledge of Icelandic culture and history.
They were collected by prompting GPT-4o to generate questions and answers given Icelandic Wikipedia articles as context. All examples were then manually verified and corrected where necessary.
For evaluation, we prompt GPT-4o to compare the generated answer to the original answer for semantic similarity and rate the answer on the following scale: (0, "poor"), (1, "fair"), (2, "excellent").
- Link to dataset: https://huggingface.co/datasets/mideind/icelandic_wiki_qa
"""
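Most of the tasks above are scored by exact match on the generated answer (the `exact_match,get-answer` metric in the `Tasks` enum). A toy sketch of that metric follows; the real scoring lives in the evaluation harness configuration, which is not part of this commit, so details such as answer normalization are assumptions:

```python
def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Percentage of generations that exactly match the reference after trimming whitespace."""
    assert len(predictions) == len(references)
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)

# Illustrative values only (Icelandic yes/no answers for the GED-style task):
print(exact_match_accuracy(["já", "nei"], ["já", "já"]))  # 50.0
```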
backend/original_src/display/__pycache__/css_html_js.cpython-310.pyc
ADDED
Binary file (1.94 kB).

backend/original_src/display/__pycache__/css_html_js.cpython-311.pyc
ADDED
Binary file (1.96 kB).

backend/original_src/display/__pycache__/css_html_js.cpython-312.pyc
ADDED
Binary file (1.95 kB).

backend/original_src/display/__pycache__/formatting.cpython-310.pyc
ADDED
Binary file (1.44 kB).

backend/original_src/display/__pycache__/formatting.cpython-311.pyc
ADDED
Binary file (2.01 kB).

backend/original_src/display/__pycache__/formatting.cpython-312.pyc
ADDED
Binary file (1.8 kB).