dummy column refactoring
Changes in app.py
- Search Model Function: Updated to search within AutoEvalColumn.model.name instead of AutoEvalColumn.dummy.name.
- Select Columns Function: Removed references to AutoEvalColumn.dummy.name and adjusted to only include necessary columns.
- Dataframe Component in Gradio Interface: Updated to ensure that the table no longer includes AutoEvalColumn.dummy.name.
Changes in css_html_js.py
- CSS Rules: Removed CSS rules that hid the last column (which was the dummy column).
Changes in utils.py
- Column Definitions: The main thing: refactored the way columns are defined and initialized. Removed the initialization that directly added the dummy column.
- Dataclass Initialization: Updated to create the AutoEvalColumn dataclass without the dummy column.
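The column-definition change can be pictured with a small sketch. It assumes the dataclass is assembled with `dataclasses.make_dataclass` from an ordered list of entries (the thread mentions the appends, not the exact mechanism), and the `ColumnContent` helper, its fields, and the four columns shown are hypothetical stand-ins for whatever src/display/utils.py really defines. The order of the list is what fixes the order of the table columns, so changing how entries are appended can reorder the table.

```python
from dataclasses import dataclass, fields, make_dataclass


@dataclass(frozen=True)  # frozen, so instances are hashable and accepted as field defaults
class ColumnContent:
    # Hypothetical helper: the field names are assumptions made for this sketch.
    name: str
    type: str
    displayed_by_default: bool
    hidden: bool = False


# Each entry is (attribute name, type, default value).
# The list order becomes the field order, and therefore the column order.
column_specs = [
    ("model_type_symbol", ColumnContent, ColumnContent("T", "str", True)),
    ("model", ColumnContent, ColumnContent("Model", "markdown", True)),
    ("average", ColumnContent, ColumnContent("Average", "number", True)),
    ("flagged", ColumnContent, ColumnContent("Flagged", "bool", False, hidden=True)),
]

# No dummy entry is added to the list before the dataclass is created.
AutoEvalColumn = make_dataclass("AutoEvalColumn", column_specs, frozen=True)

# Display names in table order, read from the field defaults.
column_names = [f.default.name for f in fields(AutoEvalColumn)]
```

With this layout, `AutoEvalColumn.model.name` resolves to the display name of the model column, which is how the rest of the code refers to columns.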
Changes in filter_models.py
- Flag Models Function: Modified to use AutoEvalColumn.model.name for flagging logic instead of using a separate "model_name_for_query" which was tied to the dummy column.
- Remove Forbidden Models Function: Updated to check against AutoEvalColumn.model.name instead of a non-existent "model_name_for_query".
Changes in read_evals.py
- EvalResult to_dict Method: Removed the line that added the dummy column's data to the dictionary. Now it only uses relevant and existing columns.
Changes in collections.py
- Update Collections Function: Updated the sorting and selection logic to use AutoEvalColumn.model.name instead of AutoEvalColumn.dummy.name.
Functionality
I have tested that the search and buttons are working correctly. I'll need your review @clefourrier :3
Hello!
Looking good!
Some specific comments to address before merging:
src/display/utils.py
- Doesn't your new system reorder the columns in the table?
src/leaderboard/filter_models.py
- You remove
      # Merges and moes are flagged automatically
      if model_data[AutoEvalColumn.flagged.name]:
          flag_key = "merged"
  which means that some models (relying on the tags and dynamic data for flagging) will no longer be flagged, I think.
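To make the flagging concern concrete, here is a minimal sketch of the two lookup paths. Everything in it is an assumption for illustration: the plain-string column keys stand in for the AutoEvalColumn attributes, and `FLAGGED_MODELS` with a "merged" entry is a hypothetical manual flag list, not the contents of filter_models.py.

```python
# Hypothetical stand-ins for the AutoEvalColumn.<column>.name values.
MODEL_KEY = "model_name"
FLAGGED_KEY = "Flagged"

# Hypothetical manual flag list: lookup key -> discussion link.
FLAGGED_MODELS = {
    "merged": "https://example.com/discussions/merged",
    "org/contaminated-model": "https://example.com/discussions/1",
}


def flag_models(rows, keep_auto_branch=True):
    """Set the flagged column on each row and return the rows.

    With keep_auto_branch=False the function only looks models up by name,
    which mimics removing the automatic merge/MoE branch.
    """
    for row in rows:
        # Merges and moes are flagged automatically
        if keep_auto_branch and row[FLAGGED_KEY]:
            flag_key = "merged"
        else:
            flag_key = row[MODEL_KEY]
        row[FLAGGED_KEY] = flag_key in FLAGGED_MODELS
    return rows


def sample_rows():
    # The merge arrives pre-flagged (for example from tags or dynamic data),
    # but its name is not in the manual list.
    return [
        {MODEL_KEY: "org/some-merge", FLAGGED_KEY: True},
        {MODEL_KEY: "org/contaminated-model", FLAGGED_KEY: False},
        {MODEL_KEY: "org/clean-model", FLAGGED_KEY: False},
    ]


with_branch = flag_models(sample_rows(), keep_auto_branch=True)
without_branch = flag_models(sample_rows(), keep_auto_branch=False)
```

In this sketch the pre-flagged merge keeps its flag only while the automatic branch is present; a name-only lookup silently clears it.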
- The appends in src/display/utils.py are my long-standing problem; I keep trying to make them more readable. You're right, this idea reorders the columns in the table, so let's revert this change.
- My bad with src/leaderboard/filter_models.py 😱
Besides, I see that without this dummy column we don't have a simple model name: AutoEvalColumn.model.name contains HTML markup, for example this one:
model check: <a target="_blank" href="https://huggingface.co/ceadar-ie/FinanceConnect-13B" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">ceadar-ie/FinanceConnect-13B</a> <a target="_blank" href="https://huggingface.co/datasets/open-llm-leaderboard/details_ceadar-ie__FinanceConnect-13B" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">📑</a>
So when I removed the dummy column and tried to filter by AutoEvalColumn.model.name, the flags filter broke. It's manageable, let me think.
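A short sketch shows why a name lookup against that cell fails, and one possible workaround. This is not what the PR ended up doing: `plain_model_name` and `names_to_match` are hypothetical names introduced for illustration, and pulling the name back out of the markup with a regular expression is only a stand-in for keeping a separate plain-name column.

```python
import re
from html import unescape

STYLE = "color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;"

# The model cell quoted above, rebuilt piece by piece for readability.
cell = (
    f'<a target="_blank" href="https://huggingface.co/ceadar-ie/FinanceConnect-13B" style="{STYLE}">'
    "ceadar-ie/FinanceConnect-13B</a> "
    f'<a target="_blank" href="https://huggingface.co/datasets/open-llm-leaderboard/details_ceadar-ie__FinanceConnect-13B" style="{STYLE}">'
    "📑</a>"
)


def plain_model_name(value):
    """Return the text of the first <a> element, or the stripped value if there is none."""
    match = re.search(r"<a\b[^>]*>(.*?)</a>", value, flags=re.DOTALL)
    return unescape(match.group(1)).strip() if match else value.strip()


# Hypothetical set of plain names a filter would compare against.
names_to_match = {"ceadar-ie/FinanceConnect-13B"}

matches_raw_cell = cell in names_to_match                      # markup never equals the plain name
matches_plain_name = plain_model_name(cell) in names_to_match  # the extracted name does
```

Parsing the display markup is fragile, so a dedicated column that stores the plain name avoids depending on how the link is rendered.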
So the dummy column turned out to be a useful one, but let's rename it to "fullname", which is a little clearer. Three thoughts:
- I should've called this PR "There and Back Again", but at least we found out that the model name is not a simple model name.
- We need to refactor read_evals.py: there is a lot of data transformation in it right now, and we need to simplify it.
- We need to do something about the model name fields; they're currently a mess.
PR is now merged/closed. The ephemeral Space has been deleted.
(This is an automated message.)