Changes in `app.py`

Search Model Function: Updated to search within AutoEvalColumn.model.name instead of AutoEvalColumn.dummy.name.
Select Columns Function: Removed references to AutoEvalColumn.dummy.name and adjusted to only include necessary columns.
Dataframe Component in Gradio Interface: Updated to ensure that the table no longer includes AutoEvalColumn.dummy.name.
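
As a rough illustration of the search change, here is a minimal sketch that filters rows on the model-name column. It uses plain dictionaries rather than the app's real table, and `MODEL_COLUMN`, `search_rows`, and the sample rows are hypothetical stand-ins, not code from this PR.

```python
# Hypothetical stand-in for AutoEvalColumn.model.name (an assumption, not the real value).
MODEL_COLUMN = "Model"


def search_rows(rows, query, column=MODEL_COLUMN):
    """Return the rows whose `column` value contains `query`, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        # An empty query matches everything.
        return list(rows)
    return [row for row in rows if needle in str(row.get(column, "")).lower()]


# Made-up sample data for the sketch.
rows = [
    {"Model": "org-a/alpha-7b", "Average": 61.2},
    {"Model": "org-b/beta-13b", "Average": 64.8},
]
matches = search_rows(rows, "BETA")
```

The point of the sketch is only that the search now reads the visible model column directly instead of a hidden helper column.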

Changes in `css_html_js.py`

CSS Rules: Removed CSS rules that hid the last column (which was the dummy column).

Changes in `utils.py`

Column Definitions: The main change. Refactored how columns are defined and initialized, and removed the initialization that directly added the dummy column.
Dataclass Initialization: Updated to create the AutoEvalColumn dataclass without the dummy column.
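
One way to picture building a column dataclass from a list of definitions is `dataclasses.make_dataclass`. The sketch below is an assumption about the general shape, not the actual contents of `utils.py`: the `ColumnContent` fields and the three sample columns are invented for illustration.

```python
from dataclasses import dataclass, fields, make_dataclass


@dataclass(frozen=True)
class ColumnContent:
    # Hypothetical column descriptor; frozen so instances are hashable
    # and can be used as dataclass field defaults.
    name: str
    type: str
    hidden: bool = False


# The order of this list is the order of the generated fields,
# and therefore the order in which columns would appear.
column_specs = [
    ("model", ColumnContent, ColumnContent("Model", "markdown")),
    ("average", ColumnContent, ColumnContent("Average", "number")),
    ("flagged", ColumnContent, ColumnContent("Flagged", "bool", hidden=True)),
]

# No dummy column is appended here.
AutoEvalColumn = make_dataclass("AutoEvalColumn", column_specs, frozen=True)

column_names = [f.default.name for f in fields(AutoEvalColumn)]
```

Because field order follows list order, any refactoring that changes the order in which definitions are appended also changes the column order in the table.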

Changes in `filter_models.py`

Flag Models Function: Modified to use AutoEvalColumn.model.name for flagging logic instead of using a separate "model_name_for_query" which was tied to the dummy column.
Remove Forbidden Models Function: Updated to check against AutoEvalColumn.model.name instead of a non-existent "model_name_for_query".
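
The flagging and removal steps described above can be sketched as follows, keyed on the model-name column. The column names, the flagged and forbidden sets, and the sample rows are all hypothetical; the real functions in `filter_models.py` may differ.

```python
# Hypothetical column keys standing in for AutoEvalColumn.model.name
# and AutoEvalColumn.flagged.name.
MODEL_COLUMN = "Model"
FLAG_COLUMN = "Flagged"

# Made-up example data.
FLAGGED_MODELS = {"org-a/alpha-7b"}
FORBIDDEN_MODELS = {"org-c/gamma-1b"}


def flag_models(rows):
    """Mark each row as flagged when its model name is in the flagged set."""
    for row in rows:
        row[FLAG_COLUMN] = row[MODEL_COLUMN] in FLAGGED_MODELS
    return rows


def remove_forbidden_models(rows):
    """Drop rows whose model name is in the forbidden set."""
    return [row for row in rows if row[MODEL_COLUMN] not in FORBIDDEN_MODELS]


rows = [
    {"Model": "org-a/alpha-7b"},
    {"Model": "org-b/beta-13b"},
    {"Model": "org-c/gamma-1b"},
]
kept = remove_forbidden_models(flag_models(rows))
```

Both lookups assume the column holds the plain repository name; they silently match nothing if the cell contains anything else.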

Changes in `read_evals.py`

EvalResult to_dict Method: Removed the line that added the dummy column's data to the dictionary. Now it only uses relevant and existing columns.

Changes in `collections.py`

Update Collections Function: Updated the sorting and selection logic to use AutoEvalColumn.model.name instead of AutoEvalColumn.dummy.name.

Functionality

I have tested that the search and buttons are working correctly. I'll need your review @clefourrier :3

clefourrier

Open LLM Leaderboard org Apr 23, 2024

Hello!

Looking good!

Some specific comments to address before merging:

src/display/utils.py - Doesn't your new system reorder the columns in the table?
src/leaderboard/filter_models.py - You remove

    # Merges and moes are flagged automatically
    if model_data[AutoEvalColumn.flagged.name]:
        flag_key = "merged"

which means that some models (relying on the tags and dynamic data for flagging) will no longer be flagged I think.

alozowski

Open LLM Leaderboard org Apr 23, 2024

The appends in src/display/utils.py are a long-standing problem for me; every time I touch them I try to make them more readable. You're right, this change reorders the columns in the table, so let's revert it.
My bad with src/leaderboard/filter_models.py 😱

Besides, I see that if we drop the dummy column, we no longer have a plain model name: AutoEvalColumn.model.name contains the full HTML markup for the links, for example:
model check: <a target="_blank" href="https://huggingface.co/ceadar-ie/FinanceConnect-13B" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">ceadar-ie/FinanceConnect-13B</a> <a target="_blank" href="https://huggingface.co/datasets/open-llm-leaderboard/details_ceadar-ie__FinanceConnect-13B" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">📑</a>

So when I removed the dummy column and tried to filter by AutoEvalColumn.model.name, the flags filter broke. It's manageable, let me think.
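
To make the failure concrete, here is a small sketch showing that a lookup by plain repository name misses when the cell holds HTML, along with one possible way to recover the name from the first link. The `plain_model_name` helper is hypothetical and is not what this PR does; the cell is the example above with the `style` attributes omitted for brevity.

```python
from html.parser import HTMLParser


class _FirstLinkText(HTMLParser):
    """Collect the text inside the first <a>...</a> element only."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._done = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "a" and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self._inside = False
            self._done = True

    def handle_data(self, data):
        if self._inside:
            self.text += data


def plain_model_name(cell):
    """Return the first link's text, or the cell unchanged if it has no link."""
    parser = _FirstLinkText()
    parser.feed(cell)
    parser.close()
    return parser.text or cell


cell = (
    '<a target="_blank" href="https://huggingface.co/ceadar-ie/FinanceConnect-13B">'
    "ceadar-ie/FinanceConnect-13B</a> "
    '<a target="_blank" href="https://huggingface.co/datasets/open-llm-leaderboard/'
    'details_ceadar-ie__FinanceConnect-13B">📑</a>'
)

flagged = {"ceadar-ie/FinanceConnect-13B"}
direct_hit = cell in flagged                     # False: the cell is HTML, not a name
parsed_hit = plain_model_name(cell) in flagged   # True once the name is extracted
```

Parsing the markup back out is fragile, which is one argument for keeping a separate column that stores the plain name.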

enhanced naming of dummy column bab5ced1

alozowski

Open LLM Leaderboard org Apr 23, 2024

So the dummy column turned out to be useful, but let's rename it to "fullname", which is a bit clearer. Three thoughts:

I should have called this PR "There and Back Again", but at least we found out that the model name field is not a plain model name.
We need to refactor read_evals.py: it does a lot of data transformation right now, and we need to simplify it.
We need to do something about the model name fields; they are currently a mess.

alozowski changed pull request status to open Apr 23, 2024

alozowski changed pull request title from dummy column removal to dummy column refactoring Apr 23, 2024

alozowski changed pull request status to merged Apr 23, 2024

clefourrier

Open LLM Leaderboard org Apr 23, 2024

PR is now merged/closed. The ephemeral Space has been deleted.
(This is an automated message.)
