🚒 Firefighter GridWorld Leaderboard

A reinforcement learning benchmark in a 4x4 grid world where the agent must:

  1. Pick up a water bucket 💧
  2. Extinguish a fire 🔥
  3. Reach the goal 🏁

The environment comes in deterministic and stochastic variants, with discrete actions, per-step rewards and penalties, and sprite-based rendering.


📊 Leaderboard (300 Episodes)

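| Rank | Model | Mean Reward | Std Dev | Success Rate | Notes                          |
|------|-------|-------------|---------|--------------|--------------------------------|
| 🥇 1 | MCTS  | 27.4        | 3.64    | 1.00         | 50 simulations, random rollout |
| 2    | PPO   | 4.0         | 5.83    | ~0.40        | Trained with Stable-Baselines3 |
| 3    | DQN   | –30.0       | 23.9    | ❌           | Failed task consistently       |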

🧪 Evaluation Protocol

  • Each agent is evaluated over 300 episodes

  • Maximum steps per episode: 60

  • Each episode starts with the robot in the top-left cell

  • Rewards:

    • +10: extinguish fire and reach goal
    • –1: step penalty
    • –5: invalid actions or skipping steps
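The protocol above can be sketched as a small evaluation loop. This is a minimal illustration, not the repository's evaluation script: `ToyEnv` is a hypothetical stand-in that only mimics the Gymnasium `reset`/`step` signatures, the success criterion (the episode ended with `terminated` set) is an assumption, and so is the use of the population standard deviation.

```python
import statistics


class ToyEnv:
    """Hypothetical stand-in with Gymnasium-style reset/step signatures.

    It is NOT FireFighterEnv; it only lets the loop below run on its own.
    """

    def __init__(self):
        self._steps = 0

    def reset(self):
        self._steps = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self._steps += 1
        # Toy dynamics: action 0 ends the episode, any other action costs a step.
        terminated = action == 0
        reward = 10.0 if terminated else -1.0
        return self._steps, reward, terminated, False, {}


def evaluate(env, policy, episodes=300, max_steps=60):
    """Run `episodes` episodes capped at `max_steps` and summarize the returns."""
    returns = []
    successes = 0
    for _ in range(episodes):
        obs, info = env.reset()
        total = 0.0
        terminated = False
        for _ in range(max_steps):
            obs, reward, terminated, truncated, info = env.step(policy(obs))
            total += reward
            if terminated or truncated:
                break
        returns.append(total)
        # Assumption: an episode counts as a success if it ended via `terminated`.
        successes += int(terminated)
    return {
        "mean_reward": statistics.mean(returns),
        # Assumption: population standard deviation; the leaderboard's
        # convention is not stated.
        "std_reward": statistics.pstdev(returns),
        "success_rate": successes / episodes,
    }
```

With the real environment, `env` would be a `FireFighterEnv()` instance and `policy` a function that maps an observation to your agent's chosen action.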

🛠 Setup

pip install -r requirements.txt

🚀 Evaluate Your Agent

  1. Clone the repo:
     git clone https://huggingface.co/spaces/YOUR_USERNAME/firefighter-gridworld-leaderboard
     cd firefighter-gridworld-leaderboard
  2. Run evaluation:
     python evaluation/evaluate_custom_agent.py --path ./my_agent.zip --algo PPO
  3. Submit your eval_results.json via a pull request.

🧠 Environment API

The custom environment follows the Gymnasium API:

from env.firefighter_env import FireFighterEnv

env = FireFighterEnv()
obs, info = env.reset()
for _ in range(60):  # 60 is the per-episode step cap used in evaluation
    action = env.action_space.sample()  # random action; replace with your agent's choice
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
env.close()

📥 Submissions

Include in your Pull Request:

  • eval_results.json
  • Description of your model and training setup
  • GIF of a successful episode (optional)
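The exact schema of eval_results.json is not specified here, so the following is only a sketch of how such a file could be written. The helper `write_results` and every field name are hypothetical and simply mirror the leaderboard columns; check evaluation/evaluate_custom_agent.py for the format the repository actually produces.

```python
import json
from pathlib import Path


def write_results(path, model, episodes, mean_reward, std_reward,
                  success_rate, notes=""):
    """Write a results file with hypothetical fields mirroring the leaderboard."""
    # Assumption: these keys are illustrative, not the repository's schema.
    results = {
        "model": model,
        "episodes": episodes,
        "mean_reward": mean_reward,
        "std_reward": std_reward,
        "success_rate": success_rate,
        "notes": notes,
    }
    Path(path).write_text(json.dumps(results, indent=2) + "\n", encoding="utf-8")
    return results
```

Keeping the file as flat key/value pairs makes it easy to diff in a pull request and to aggregate into the leaderboard table.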

📦 Files

  • env/ – environment code
  • agents/ – training scripts (PPO, DQN, MCTS)
  • evaluation/ – evaluation and rendering
  • models/ – saved agents
  • assets/ – sprites and animation

📜 License

MIT License. Contributions welcome!
