# [EN] Upload guide (jsonl)
## Basic Requirements
- Upload one `jsonl` file per model (e.g., five files to compare five LLMs)
- ⚠️ Important: All `jsonl` files must have the same number of rows
- ⚠️ Important: The `model_id` field must be unique within and across all files
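The two ⚠️ requirements above can be checked before uploading. The following is a minimal sketch (the `validate_uploads` helper is hypothetical, not part of the app): it assumes each file carries exactly one `model_id` and that no two files share one.

```python
import json
from pathlib import Path

def validate_uploads(paths):
    """Check the basic upload requirements across a set of jsonl files:
    equal row counts, and a model_id that is unique per file and across files."""
    row_counts = {}
    seen_ids = set()
    for path in paths:
        lines = [ln for ln in Path(path).read_text(encoding="utf-8").splitlines() if ln.strip()]
        rows = [json.loads(ln) for ln in lines]
        row_counts[path] = len(rows)
        ids = {row["model_id"] for row in rows}
        if len(ids) != 1:
            raise ValueError(f"{path}: expected a single model_id per file, got {sorted(ids)}")
        if ids & seen_ids:
            raise ValueError(f"{path}: model_id {sorted(ids)} already used by another file")
        seen_ids |= ids
    if len(set(row_counts.values())) > 1:
        raise ValueError(f"row counts differ across files: {row_counts}")
```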
## Required Fields

### Per-Model Fields
- `model_id`: Unique identifier for the model (recommendation: keep it short)
- `generated`: The LLM's response to the test instruction

Required only for translation (the `translation_pair` prompt needs these; see `streamlit_app_local/user_submit/mt/llama5.jsonl`):
- `source_lang`: Input language (e.g., Korean, KR, kor, ...)
- `target_lang`: Output language (e.g., English, EN, ...)

### Common Fields (must be identical across all files)
- `instruction`: The input prompt or test instruction given to the model
- `task`: Category label used to group results (useful when using different evaluation prompts per task)
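A row-level check for the fields above can be sketched as follows; `check_row` is a hypothetical helper, and the `translation=True` flag stands in for the translation-only case described above.

```python
import json

REQUIRED = {"model_id", "generated", "instruction", "task"}
TRANSLATION_ONLY = {"source_lang", "target_lang"}

def check_row(line, translation=False):
    """Parse one jsonl line and verify the required fields are present.

    With translation=True, also require the translation-only fields."""
    row = json.loads(line)
    missing = REQUIRED - row.keys()
    if translation:
        missing |= TRANSLATION_ONLY - row.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return row
```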
## Example Format

```
# model1.jsonl
{"model_id": "model1", "task": "directions", "instruction": "Where should I go?", "generated": "Over there"}
{"model_id": "model1", "task": "arithmetic", "instruction": "1+1", "generated": "2"}

# model2.jsonl
{"model_id": "model2", "task": "directions", "instruction": "Where should I go?", "generated": "Head north"}
{"model_id": "model2", "task": "arithmetic", "instruction": "1+1", "generated": "3"}

...
```
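Files in this format are plain JSON Lines: one JSON object per line. A minimal sketch for producing one (the `write_results` helper and the shape of `results` are assumptions for illustration, not part of the app):

```python
import json

def write_results(path, model_id, results):
    """Write one model's results as jsonl, one JSON object per line.

    `results` is assumed to be a list of dicts with "task", "instruction",
    and "generated" keys; model_id is stamped onto every row."""
    with open(path, "w", encoding="utf-8") as f:
        for r in results:
            row = {"model_id": model_id, **r}
            # ensure_ascii=False keeps non-ASCII text (e.g., Korean) readable
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
```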
## Use Case Example
To compare different prompting strategies for the same model:
- Use the same `instruction` across files (a unified set of test scenarios); the `generated` responses will differ across files according to each prompting strategy.
- Use descriptive `model_id` values like "prompt1", "prompt2", etc.