| #### \[EN\] Guide for Input .jsonl Files | |
| If you have five models to compare, upload five .jsonl files. | |
| * 🔥 All `.jsonl` files must have the same number of rows. | |
| * 🔥 Each file must use a single `model_id` for all of its rows, and that `model_id` must differ from the one used in every other file. | |
| * 🔥 Across files, `generated` and `model_id` should differ, while `instruction` and `task` should be the same. | |
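
The three rules above can be checked before uploading. The following is a minimal sketch, not part of this tool: `load_jsonl` and `check_files` are hypothetical helper names, and it assumes that rows in different files are aligned by position (row 1 of every file answers the same `instruction`).

```python
import json


def load_jsonl(path):
    """Read a .jsonl file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def check_files(files):
    """Return a list of problems found across parsed .jsonl files.

    `files` maps a file name to its list of rows (dicts).
    """
    problems = []
    names = list(files)

    # Rule 1: every file has the same number of rows.
    if len({len(rows) for rows in files.values()}) > 1:
        problems.append("row counts differ between files")

    # Rule 2: exactly one model_id per file, and no two files share it.
    seen = {}
    for name, rows in files.items():
        ids = {row.get("model_id") for row in rows}
        if len(ids) != 1:
            found = sorted(map(str, ids))
            problems.append(f"{name}: expected exactly one model_id, found {found}")
            continue
        (model_id,) = ids
        if model_id in seen:
            problems.append(f"{name}: model_id {model_id!r} already used by {seen[model_id]}")
        seen[model_id] = name

    # Rule 3: instruction and task match row by row (assumes aligned row order).
    reference = files[names[0]] if names else []
    for name in names[1:]:
        for i, (a, b) in enumerate(zip(reference, files[name])):
            for key in ("instruction", "task"):
                if a.get(key) != b.get(key):
                    problems.append(f"{name}: row {i} has a different {key!r}")
    return problems
```

For example, `check_files({p: load_jsonl(p) for p in ["model1.jsonl", "model2.jsonl"]})` returns an empty list when the files are consistent with each other.
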
| **Required `.jsonl` Fields** | |
| * Reserved Fields (Mandatory) | |
| * `model_id`: The name of the model being evaluated. (Recommended to be short) | |
| * `instruction`: The instruction given to the model. This corresponds to the test set prompt (not the evaluation prompt). | |
| * `generated`: The response the model generated for the test set instruction. | |
| * `task`: Used to group and display overall results as a subset. Can be utilized when you want to use different evaluation prompts per row. | |
| * Additional | |
| * Depending on the evaluation prompt you use, you can add other fields. You can freely include them in your `.jsonl` files, as long as their names avoid the reserved keywords mentioned above. | |
| * Example: For the `translation_pair.yaml` and `translation_fortunecookie.yaml` prompts, the `source_lang` and `target_lang` fields are read from the `.jsonl` and used. | |
| For example, when evaluating with the `translation_pair` prompt, each .jsonl file looks like this: | |
| ```python | |
| # model1.jsonl | |
| {"model_id": "모델1", "task": "영한", "instruction": "어디로 가야하오", "generated": "Where should I go", "source_lang": "Korean", "target_lang": "English"} | |
| {"model_id": "모델1", "task": "한영", "instruction": "1+1?", "generated": "1+1?", "source_lang": "English", "target_lang": "Korean"} | |
| # model2.jsonl: same `instruction` as model1.jsonl, but `generated` and `model_id` differ! | |
| {"model_id": "모델2", "task": "영한", "instruction": "어디로 가야하오", "generated": "글쎄다", "source_lang": "Korean", "target_lang": "English"} | |
| {"model_id": "모델2", "task": "한영", "instruction": "1+1?", "generated": "2", "source_lang": "English", "target_lang": "Korean"} | |
| ... | |
| .. | |
| ``` | |
| On the other hand, when evaluating with the `llmbar` prompt, fields such as `source_lang` and `target_lang` are not used, unlike in translation evaluation, so you don't need to add them to your `.jsonl` files. | |
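
To make the difference concrete, here is a minimal sketch of building rows with and without prompt-specific fields. `make_row` and `to_jsonl` are hypothetical helpers written for illustration, not functions provided by this tool. The four reserved field names come from the list above.

```python
import json

# The four reserved (mandatory) fields described above.
RESERVED = frozenset({"model_id", "instruction", "generated", "task"})


def make_row(model_id, instruction, generated, task, extra=None):
    """Build one row: the reserved fields plus optional prompt-specific fields."""
    extra = dict(extra or {})
    clash = RESERVED.intersection(extra)
    if clash:
        # Additional fields must not reuse a reserved keyword.
        raise ValueError(f"extra fields reuse reserved names: {sorted(clash)}")
    row = {
        "model_id": model_id,
        "task": task,
        "instruction": instruction,
        "generated": generated,
    }
    row.update(extra)
    return row


def to_jsonl(rows):
    """Serialize rows as one JSON object per line, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(row, ensure_ascii=False) for row in rows) + "\n"


# A row for a translation prompt carries the two language fields...
translation_row = make_row(
    "model1", "어디로 가야하오", "Where should I go", "translation",
    extra={"source_lang": "Korean", "target_lang": "English"},
)
# ...while a row for a prompt such as `llmbar` needs only the reserved fields.
plain_row = make_row("model1", "1+1?", "2", "qa")
```

Writing `to_jsonl([...])` to a file opened with `encoding="utf-8"` produces one model's `.jsonl` file. Passing `ensure_ascii=False` keeps Korean text readable instead of escaping it as `\u` sequences.
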