AlphaApple: FruitBox Game AI Agent
Model Description
This model is an AI agent that solves the Korean Apple Game (FruitBox) puzzle. The game is played on a 10×17 grid, where the player finds and removes rectangles whose numbers sum to 10. The agent was trained with the PPO (Proximal Policy Optimization) algorithm.
Game Rules
- 10×17 grid; each cell holds a number from 1 to 9
- Select a rectangular region; if its numbers sum to exactly 10, the region is removed
- The score increases by the number of cells removed
- The game ends when no removable region remains
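The rules above can be sketched in a few lines of plain Python. This is a minimal illustration and not the repository's environment: the function names (`rect_sum`, `try_remove`, `has_move`), the use of `0` for an emptied cell, and the choice to count only non-empty cells toward the score are all assumptions made for this example.

```python
from typing import List

Board = List[List[int]]  # assumption: 0 marks a cell that was already removed


def rect_sum(board: Board, r0: int, c0: int, r1: int, c1: int) -> int:
    """Sum of the cells in the inclusive rectangle (r0, c0)-(r1, c1)."""
    return sum(board[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))


def try_remove(board: Board, r0: int, c0: int, r1: int, c1: int) -> int:
    """Clear the rectangle if it sums to exactly 10 and return the points gained."""
    if rect_sum(board, r0, c0, r1, c1) != 10:
        return 0
    gained = 0
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            if board[r][c] != 0:  # assumption: empty cells score nothing
                gained += 1
                board[r][c] = 0
    return gained


def has_move(board: Board) -> bool:
    """True if at least one rectangle on the board still sums to 10."""
    rows, cols = len(board), len(board[0])
    return any(
        rect_sum(board, r0, c0, r1, c1) == 10
        for r0 in range(rows) for r1 in range(r0, rows)
        for c0 in range(cols) for c1 in range(c0, cols)
    )


# Tiny 2x3 board used only for illustration (the real game uses 10x17).
demo = [[1, 9, 3],
        [4, 6, 2]]
score = try_remove(demo, 0, 0, 0, 1)   # top row: 1 + 9 = 10
score += try_remove(demo, 0, 0, 1, 1)  # 0 + 0 + 4 + 6 = 10 after the first removal
game_over = not has_move(demo)         # the remaining 3 and 2 cannot reach 10
```

The brute-force `has_move` is fine for a sketch; a real environment would more likely use prefix sums to test all rectangles quickly.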
Performance
| Agent | Average Score | Improvement vs Random |
|---|---|---|
| Random | 71.9 | - |
| Greedy | 73.3 | +1.9% |
| PPO | 77.0 | +7.1% |
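The improvement figures are consistent with a relative gain over the Random baseline. The snippet below reproduces them from the reported average scores; the helper name `improvement` is introduced here for illustration only.

```python
# Average scores as reported in the table above.
scores = {"Random": 71.9, "Greedy": 73.3, "PPO": 77.0}


def improvement(score: float, baseline: float) -> float:
    """Relative improvement over a baseline, in percent."""
    return 100.0 * (score - baseline) / baseline


greedy_vs_random = round(improvement(scores["Greedy"], scores["Random"]), 1)
ppo_vs_random = round(improvement(scores["PPO"], scores["Random"]), 1)
ppo_vs_greedy = round(improvement(scores["PPO"], scores["Greedy"]), 1)
```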
Usage
Python (PyTorch)
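from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv

# Load the trained model
model = PPO.load("pytorch_model.zip")

# `env` is not defined by the model files: build the custom Gymnasium
# FruitBox environment from the source repository and wrap it in a
# DummyVecEnv before running inference.
obs = env.reset()  # a VecEnv returns only the observation
action, _ = model.predict(obs)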
Web/JavaScript (ONNX)
import * as ort from 'onnxruntime-web';

// Load the ONNX model
const session = await ort.InferenceSession.create('./fruitbox_ppo.onnx');

// board_data: Float32Array of length 170 holding the board values
const { action_logits } = await session.run({
  board_input: new ort.Tensor('float32', board_data, [1, 17, 10, 1])
});

// Pick the action with the highest logit
const action = action_logits.data.indexOf(Math.max(...action_logits.data));
Files
- pytorch_model.zip: Original SB3 PPO model
- fruitbox_ppo.onnx: ONNX version for web deployment (2.95 MB)
- model_info.json: Model metadata and performance metrics
Training Details
- Algorithm: PPO with action masking
- Network: Custom CNN (SmallGridCNN)
- Training steps: 1,000,000
- Environment: Custom Gymnasium environment
- Action space: 8,415 possible rectangles (masked)
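The figure of 8,415 matches the number of axis-aligned rectangles on a 10×17 grid: choosing a row span gives 10·11/2 = 55 options, a column span 17·18/2 = 153, and 55 × 153 = 8,415. The sketch below enumerates them and builds a validity mask with 2D prefix sums. The ordering of the rectangles and the helper names (`enumerate_rectangles`, `action_mask`) are assumptions; the model's actual action indexing may differ.

```python
from typing import List, Tuple

ROWS, COLS = 10, 17
Rect = Tuple[int, int, int, int]  # (r0, c0, r1, c1), inclusive corners


def enumerate_rectangles(rows: int = ROWS, cols: int = COLS) -> List[Rect]:
    """List every axis-aligned rectangle on the grid (ordering is an assumption)."""
    return [
        (r0, c0, r1, c1)
        for r0 in range(rows) for r1 in range(r0, rows)
        for c0 in range(cols) for c1 in range(c0, cols)
    ]


RECTS = enumerate_rectangles()
NUM_ACTIONS = len(RECTS)  # 55 row spans * 153 column spans


def action_mask(board: List[List[int]]) -> List[bool]:
    """True for each rectangle whose cells sum to exactly 10."""
    # prefix[r][c] holds the sum of board[0:r][0:c]
    prefix = [[0] * (COLS + 1) for _ in range(ROWS + 1)]
    for r in range(ROWS):
        for c in range(COLS):
            prefix[r + 1][c + 1] = (
                board[r][c] + prefix[r][c + 1] + prefix[r + 1][c] - prefix[r][c]
            )
    mask = []
    for r0, c0, r1, c1 in RECTS:
        total = (
            prefix[r1 + 1][c1 + 1] - prefix[r0][c1 + 1]
            - prefix[r1 + 1][c0] + prefix[r0][c0]
        )
        mask.append(total == 10)
    return mask


# On a board filled with 5s, only two-cell rectangles sum to 10.
all_fives = [[5] * COLS for _ in range(ROWS)]
valid_count = sum(action_mask(all_fives))
```

With a mask like this, invalid rectangles can be excluded before sampling or taking an argmax over the policy's logits.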
Repository
Source code: https://github.com/your-username/alphaapple
Citation
@misc{alphaapple2024,
title={AlphaApple: AI Agent for FruitBox Puzzle Game},
author={Your Name},
year={2024},
howpublished={\url{https://huggingface.co/AlphaApple}}
}
Evaluation results
- Mean Episode Score on FruitBox Game (self-reported): 77.0
- Improvement vs Random on FruitBox Game (self-reported): 7.1%
- Improvement vs Greedy on FruitBox Game (self-reported): 5.0%