KoalaSeg 🐨🛣️

KOrean lAyered assistive Segmentation

A Universal Segmentation model dedicated to Korean road and pedestrian environments.
A OneFormer teacher based on shi-labs/oneformer_cityscapes_swin_large was distilled into an Edge-ViT 20 M student model, using ground truth (GT) produced by a layered ensemble of the following sources, in this order:

  1. Hand-annotated XML polygons
  2. A Korea-specific model trained on AIHUB road and pedestrian environment data: Surface Mask (5k) + Polygon (500)
  3. Cityscapes masks
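The layered ensemble described above can be sketched as a priority merge of label maps. This is a minimal illustration, not the team's actual pipeline: it assumes that earlier sources override later ones, that every source is already rasterized to a class-id map of the same size, and that unlabeled pixels carry a sentinel value (`UNLABELED = 255` and the function name `layered_ensemble` are hypothetical).

```python
import numpy as np

UNLABELED = 255  # assumed "no label" value; the source does not state which value is used


def layered_ensemble(layers, unlabeled=UNLABELED):
    """Merge label maps given in priority order (highest priority first).

    A pixel takes its class from the first layer that labels it; lower
    layers only fill pixels that are still unlabeled.
    """
    merged = np.full_like(layers[0], unlabeled)
    for layer in layers:
        if layer.shape != merged.shape:
            raise ValueError("all layers must share the same shape")
        fill = (merged == unlabeled) & (layer != unlabeled)
        merged[fill] = layer[fill]
    return merged


# Toy 2x3 example: manual polygons > Korea-specific model > Cityscapes masks.
manual     = np.array([[7, 255, 255], [255, 255, 255]], dtype=np.uint8)
korean     = np.array([[3, 3, 255], [255, 4, 255]], dtype=np.uint8)
cityscapes = np.array([[1, 1, 1], [1, 1, 255]], dtype=np.uint8)

gt = layered_ensemble([manual, korean, cityscapes])
print(gt)
```

In the toy example, the manual label wins wherever it exists, the Korea-specific prediction fills the next gaps, and Cityscapes only supplies what is left; a pixel no source labels stays `UNLABELED`.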

Model Details

  • Developed by: Team RoadSight
  • Base model: shi-labs/oneformer_cityscapes_swin_large
  • Model type: Edge-ViT 20 M + OneFormer head (semantic task)
  • Framework: 🤗 Transformers & PyTorch

Training Data

AIHUB sidewalk and pedestrian environment data (https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=189):

  • Bounding Box: 350,000 images (box annotations for 29 obstacle types)
  • Polygon: 100,000 images (polygon annotations for 29 obstacle types) → 500 used
  • Surface Masking: 50,000 images (road-surface condition masks) → 5,000 used
  • Depth Prediction: 170,000 images (stereo depth)

A total of 18,369 images (AIHUB 5.5k + self-captured 9k + Street View 3.7k) were labeled with the layered ensemble.
The GT was then generated after Morph Open/Close + MedianBlur (17 px).
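The clean-up step named above (morphological open/close followed by a 17 px median blur) can be illustrated as follows. The step names suggest OpenCV's `cv2.morphologyEx` and `cv2.medianBlur`; to stay self-contained, this sketch re-implements the same operations with NumPy only, so it is a stand-in and much slower than OpenCV. The 3 px morphology kernel, the edge-replicated borders, and the choice to work on one binary mask per class are assumptions; only the 17 px median size comes from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(img, k):
    """Return every k x k neighbourhood of img, using edge-replicated borders."""
    padded = np.pad(img, k // 2, mode="edge")
    return sliding_window_view(padded, (k, k))


def erode(mask, k):
    # A pixel survives only if its whole neighbourhood is set.
    return _windows(mask, k).min(axis=(-2, -1))


def dilate(mask, k):
    # A pixel is set if any pixel in its neighbourhood is set.
    return _windows(mask, k).max(axis=(-2, -1))


def smooth_mask(mask, morph_k=3, median_k=17):
    """Open (drop specks), close (fill pinholes), then median-filter a binary mask."""
    opened = dilate(erode(mask, morph_k), morph_k)
    closed = erode(dilate(opened, morph_k), morph_k)
    blurred = np.median(_windows(closed, median_k), axis=(-2, -1))
    return blurred.astype(mask.dtype)


# Toy binary mask: a solid region with a pinhole, plus an isolated speck.
mask = np.zeros((32, 32), dtype=np.uint8)
mask[6:26, 6:26] = 1   # solid 20x20 region
mask[15, 15] = 0       # pinhole inside the region
mask[1, 1] = 1         # isolated speck

clean = smooth_mask(mask)
print(int(clean[1, 1]), int(clean[15, 15]))  # speck removed, pinhole filled
```

Opening removes the speck, closing fills the pinhole, and the large median filter rounds off corners and ragged edges. For a multi-class label map, one option is to run this per class and recombine the results.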


Speeds & Sizes (512×512, batch=1)

Each cell shows latency per image → throughput.

| Device | Baseline Cityscapes | Ensemble (3-layer) | Custom (K-Road) | KoalaSeg |
|---|---|---|---|---|
| A100 | 3.58 s → 0.28 FPS | 3.74 s → 0.27 FPS | 0.15 s → 6.67 FPS | 0.14 s → 7.25 FPS |
| T4 | 5.61 s → 0.18 FPS | 6.01 s → 0.17 FPS | 0.39 s → 2.60 FPS | 0.31 s → 3.27 FPS |
| CPU (i9-12900K) | 124 s → 0.008 FPS | 150 s → 0.007 FPS | 26.6 s → 0.038 FPS | 18.4 s → 0.054 FPS |
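The figures above pair a per-image latency with a throughput, which at batch size 1 is simply the reciprocal of the latency. The sketch below shows that conversion and one plausible way to measure such a latency; it is not the benchmark script behind the table. `latency_to_fps`, `benchmark`, and the `sync` parameter are hypothetical names, and the warm-up and run counts are arbitrary. Where a listed FPS differs slightly from the reciprocal of the listed latency, the latency was presumably rounded for display.

```python
import time


def latency_to_fps(latency_s):
    """Throughput at batch size 1 is the reciprocal of per-image latency."""
    return 1.0 / latency_s


def benchmark(fn, warmup=3, runs=10, sync=None):
    """Return the mean seconds per call of fn().

    sync is an optional callable (for example torch.cuda.synchronize) invoked
    before each clock read so that queued GPU work is included in the timing.
    """
    for _ in range(warmup):   # warm-up calls are not timed
        fn()
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    if sync:
        sync()
    return (time.perf_counter() - start) / runs


print(round(latency_to_fps(3.58), 2))  # A100 baseline row: 0.28

# Stand-in workload; replace the lambda with a real forward pass to time a model.
mean_s = benchmark(lambda: sum(range(10_000)))
print(f"{mean_s * 1e3:.3f} ms per call, {latency_to_fps(mean_s):.1f} FPS")
```

On a GPU, passing `sync=torch.cuda.synchronize` matters because CUDA kernels run asynchronously; without it the timer can stop before the work has finished.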

Quick Start

from transformers import AutoProcessor, AutoModelForUniversalSegmentation
import torch, requests, matplotlib.pyplot as plt, numpy as np
from PIL import Image
from io import BytesIO

# 0. Load model & processor -----------------------------------
model_id = "gj5520/KoalaSeg"
device = "cuda" if torch.cuda.is_available() else "cpu"   # fall back to CPU when no GPU is present
proc  = AutoProcessor.from_pretrained(model_id)
model = AutoModelForUniversalSegmentation.from_pretrained(model_id).to(device).eval()

# 1. Download image -------------------------------------------
url  = "https://pds.joongang.co.kr/news/component/htmlphoto_mmdata/202205/21/1200738c-61c0-4a51-83c4-331f53d4dcdc.jpg"
resp = requests.get(url, timeout=30)
resp.raise_for_status()                                   # stop early on an HTTP error
img  = Image.open(BytesIO(resp.content)).convert("RGB")

# 2. Pre-process & inference ----------------------------------
inputs = proc(images=img, task_inputs=["semantic"], return_tensors="pt").to(device)
with torch.no_grad():
    out = model(**inputs)

# 3-A. Get class-id map ---------------------------------------
idmap = proc.post_process_semantic_segmentation(
    out, target_sizes=[img.size[::-1]]
)[0].cpu().numpy()

# 3-B. Convert idmap → RGB mask + overlay ---------------------
cmap      = plt.get_cmap("tab20", max(20, len(np.unique(idmap))))
mask_rgb  = np.zeros((*idmap.shape, 3), dtype=np.uint8)
for idx, cid in enumerate(np.unique(idmap)):
    if cid == 0:                  # keep background black
        continue
    mask_rgb[idmap == cid] = (np.array(cmap(idx)[:3]) * 255).astype(np.uint8)

mask_img = Image.fromarray(mask_rgb)
overlay  = Image.blend(img, mask_img, alpha=0.6)   # alpha=0.6 emphasizes the mask

# 4. Show overlay ---------------------------------------------
plt.figure(figsize=(8, 8))
plt.imshow(overlay)
plt.axis("off")
plt.show()

Intended Uses

  • Road segmentation for visually impaired users
  • Support for Korean HD maps and road maintenance
  • Benchmarking on Korea-specific datasets for academic and research purposes

Out-of-Scope

  • Non-road domains such as medical, satellite, and indoor imagery
  • Sensitive tasks such as personal identification and surveillance

Limitations & Risks

  • Korean roads only: performance degrades in other conditions such as roads outside Korea, extremely low light, and heavy rain
  • Detection of partially occluded people is unstable → use only as an assistive aid

Citation

@misc{KoalaSeg2025,
  title  = {KoalaSeg: Layered Distillation for Korean Road Universal Segmentation},
  author = {RoadSight Team},
  year   = {2025},
  url    = {https://huggingface.co/gj5520/KoalaSeg}
}

Safetensors · Model size: 219M params · Tensor type: I64, F32