# Firefighter GridWorld Leaderboard
A reinforcement learning benchmark in a 4x4 grid world where the agent must:

- Pick up a water bucket 💧
- Extinguish a fire 🔥
- Reach the goal 🏁

The environment features deterministic and stochastic versions with discrete actions, rewards, penalties, and sprite-based rendering.
## Leaderboard (300 Episodes)
| Rank | Model | Mean Reward | Std Dev | Success Rate | Notes |
|---|---|---|---|---|---|
| 🥇 1 | MCTS | 27.4 | 3.64 | 1.00 | 50 simulations, random rollout |
| 2 | PPO | 4.0 | 5.83 | ~0.40 | Trained with Stable-Baselines3 |
| 3 | DQN | -30.0 | 23.9 | — | Failed the task consistently |
## 🧪 Evaluation Protocol

- Each agent is evaluated over 300 episodes
- Maximum steps per episode: 60
- The environment starts with the robot in the top-left cell
- Rewards:
  - +10: extinguish the fire and reach the goal
  - -1: step penalty
  - -5: invalid actions or skipping steps
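The reward scheme above can be sketched as a small function. The name `compute_reward` and its boolean arguments are illustrative assumptions, not the repo's actual API; how `FireFighterEnv` detects invalid actions internally is not specified here.

```python
def compute_reward(extinguished_fire: bool, reached_goal: bool,
                   invalid_action: bool) -> float:
    """Per-step reward under the leaderboard's scheme (assumed semantics)."""
    if invalid_action:
        return -5.0   # invalid action or skipped step
    if extinguished_fire and reached_goal:
        return 10.0   # task complete: fire out and goal reached
    return -1.0       # ordinary step penalty
```

Under this scheme, any run that wanders for many steps accumulates a large negative return, which is consistent with DQN's mean reward of -30.0 in the table above.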
## Setup

```bash
pip install -r requirements.txt
```
## Evaluate Your Agent

1. Clone the repo:

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/firefighter-gridworld-leaderboard
   cd firefighter-gridworld-leaderboard
   ```

2. Run the evaluation:

   ```bash
   python evaluation/evaluate_custom_agent.py --path ./my_agent.zip --algo PPO
   ```

3. Submit your `eval_results.json` via Pull Request.
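A minimal sketch of producing an `eval_results.json`. The key names below are assumptions for illustration; the actual schema is defined by `evaluation/evaluate_custom_agent.py`, and the numbers here are just the PPO row from the leaderboard.

```python
import json

# Hypothetical result fields -- check the evaluation script for the real schema.
results = {
    "algo": "PPO",
    "episodes": 300,
    "mean_reward": 4.0,
    "std_reward": 5.83,
    "success_rate": 0.40,
}

with open("eval_results.json", "w") as f:
    json.dump(results, f, indent=2)
```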
## Environment API

The custom environment follows the Gymnasium API:

```python
import gymnasium as gym
from env.firefighter_env import FireFighterEnv

env = FireFighterEnv()
obs, info = env.reset()
for _ in range(60):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
```
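The evaluation protocol (300 episodes, 60 steps each) can be sketched as a generic loop over any Gymnasium-style environment. The helper below is illustrative, not the repo's evaluation script; in particular, inferring success from a terminal reward of +10 is an assumption about how `FireFighterEnv` signals task completion.

```python
import statistics

def evaluate(make_env, policy, episodes=300, max_steps=60):
    """Run `episodes` rollouts; return (mean return, std, success rate)."""
    returns, successes = [], 0
    for _ in range(episodes):
        env = make_env()
        obs, info = env.reset()
        total = 0.0
        for _ in range(max_steps):
            obs, reward, terminated, truncated, info = env.step(policy(obs))
            total += reward
            if terminated or truncated:
                # Assumption: a terminal +10 reward marks a successful episode.
                if terminated and reward == 10:
                    successes += 1
                break
        returns.append(total)
    mean = statistics.mean(returns)
    std = statistics.pstdev(returns)
    return mean, std, successes / episodes
```

With `make_env=FireFighterEnv` and a trained policy, this reproduces the shape of the leaderboard columns (mean reward, std dev, success rate).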
## 📥 Submissions

Include in your Pull Request:

- `eval_results.json`
- A description of your model and training setup
- A GIF of a successful episode (optional)
π¦ Files
env/
β environment codeagents/
β training scripts (PPO, DQN, MCTS)evaluation/
β evaluation and renderingmodels/
β saved agentsassets/
β sprites and animation
## License

MIT License. Contributions welcome!