Open LLM Leaderboard org
No description provided.
Open LLM Leaderboard org
edited Apr 8

The new search logic, mainly the updated filter_queries() function in app.py file:
– The function filters a pandas df based on model name and license queries provided in a single string, separated by semicolons (;).
– It splits the input string into individual queries, treating those prefixed with "license:" specially for license searches, and the rest for model name searches.
– For model name searches, it accumulates results from a search_table() function call into a list, if any matches are found.
– If no model name matches are found and there are no license queries, it returns an empty df with the same structure as the input.
– Results from model name searches are combined and duplicates removed based on certain columns.
– License queries are processed separately, using a search_license() function, with results similarly accumulated, combined, and deduplicated.
– If no results are found for license queries, it returns an empty df. Otherwise, it returns the combined and deduplicated results.
– The function ultimately returns a df filtered according to the queries, with an emphasis on handling empty queries or no matches by returning either the original df or an empty one.

Additional:
– Applied make style.
– Refactored flag_models () in filter_models.py to avoid equality comparisons to True.

Open LLM Leaderboard org
edited Apr 8

Added a conditional logic to control the initialization process of various df objects based on the SKIP_INIT environment variable.

The idea: when SKIP_INIT is set to True, a minimal initialization routine is executed by setting the full_init parameter of init_space to False. This allows faster setup when a full initialisation is not required, so mainly implemented for debugging.

Usage: SKIP_INIT=True poetry run python app.py

Open LLM Leaderboard org

Returned enable_space_ci import in app.py, might use it later

Open LLM Leaderboard org

pyproject.toml contains authors indication. According to the Poetry documentation (https://python-poetry.org/docs/pyproject/) – "This is a list of authors and should contain at least one author." Feel free to complete it the way it should be.

alozowski changed pull request status to closed
alozowski changed pull request status to open
Open LLM Leaderboard org
edited Apr 9

There is apparently no way to do a code review cleanly, so I'll have to give you line numbers etc.

General comments:

  • please do not again combine a coding style PR with a normal PR. Super cool that you applied and updated our coding style tools, it was needed, but it makes your PR considerably harder to review.
  • could you re-enable the space ci? it will allow us to test your modifs in a live space (= uncommenting line 51 of app.py should do the trick)

Code style comments:

  • Avoid single letter names (_q) unless it's 1) the index of an iterable (for i in range... or for i, item in enumerate(...) 2) in a list comprehension.
  • in the pyproject.toml, let's remove the author field entirely, since the leaderboard is not intended to be used/installed as a package, and is regularly duplicated by the community who might not remove/edit the field. However, you can edit the citation field to add yourself as one of its authors now, since you'll be working on it.

Specific about your editions in app.py

  • line 118 to 122 - full_init=(not skip_init) is not very legible - better to change the name of the env variable that you introduce to "DO_FULL_INIT" the following way. More generally, better use a cast to bool than a string eq which will be less legible.
do_full_init = bool(os.getenv("LEADERBOARD_FULL_INIT", "True"))

leaderboard_df, original_df, plot_df, finished_eval_queue_df, running_eval_queue_df, pending_eval_queue_df = (
    init_space(full_init=do_full_init)
)
  • function filter_queries - you could make this code faster and more understandable by using list comprehensions.
   ...
    all_queries = [q.strip() for q in query.split(";") if q.strip != ""]
    model_queries = [q for q in all_queries if not q.startswith("licence")]
    license_queries_raw = [q  for q in all_queries if q.startwith("license")]
    license_queries = [q.replace("license:", "").strip() for q in license_queries_raw if q.replace("license:", "").strip()  != ""]

    # Handling model name search
    for query in model_queries:
            tmp_df = search_table(df, query)
            if len(tmp_df) > 0:
                tmp_result_df.append(tmp_df)

    # Handling license search
    for query in license_queries:
            tmp_df = search_license(df, query)
            if len(tmp_df) > 0:
                tmp_result_df.append(tmp_df)

...

Side note, it could have been interesting to allow the user to search without having to specify license:, which will also make the code more legible: you look for both the model names and the license in both columns.
We also want to keep the current logic: if nothing is found through the search, we return the full table (but I'm OK with discussing this).
To fit the new naming convention (search_license), it would make sense to rename search_table to search_model)

Open LLM Leaderboard org

Thank you for your comments @clefourrier !

Sorry for uploading code changes and style formating in one commnit, didn't intend to do so :(

Your comments are super helpful, refactored according to them and kept two things:

  1. I'm not sure about removing the "license:" tag, what if we remove that tag and the model name is partly the same as the licence name? Let's keep it like this for a while

  2. The idea to return an empty DataFrame when nothing was found clearly shows a failed search query. If this idea proves to be a bad one, we could consider implementing an indication of a failed search instead of returning an empty DataFrame

Open LLM Leaderboard org

It looks good to me - sorry that you'll have some merge conflicts to manage ^^".

Open LLM Leaderboard org

That's fine, I guess this conflict is an easy one to solve

alozowski changed pull request status to merged
Open LLM Leaderboard org

Solved the conflict and merged, should be fine

Sign up or log in to comment