BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks
Evaluate code samples using specified parameters
Evaluate code samples and get results
Explore and analyze code evaluation data
Display a PDF document
Search and submit code models for evaluation
Check if your GitHub repositories are in The Stack dataset
Search code snippets in StarCoder dataset
Generate code solutions to mathematical and logical problems
Start a web app server
Generate code snippets in Python, Java, and JavaScript
Display interactive Bokeh plot