VLMEvalKit Evaluation Results Collection
Ranking of LLMs for agentic tasks
Kontext image editing on FLUX[dev]