tamunna's Collections

vlm_grounding

Tahsir PhD work


  • Note: AG-KD (Abnormality Grounding with Knowledge Descriptions) is a compact 0.23B-parameter vision-language model for abnormality grounding in medical images. Despite its small size, it performs comparably to state-of-the-art 7B medical VLMs. Our approach integrates structured knowledge descriptions into the prompts, which improves the model’s ability to localize abnormalities in medical images.
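
The idea of placing a structured knowledge description in the prompt can be illustrated with a minimal sketch. This is not the AG-KD prompt format, which the note does not specify. The class `KnowledgeDescription`, its field names, the function `build_grounding_prompt`, and the example text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeDescription:
    """Hypothetical structured description of an abnormality.

    The field names are assumptions made for this sketch, not AG-KD's schema.
    """
    name: str
    appearance: str
    typical_location: str


def build_grounding_prompt(desc: KnowledgeDescription) -> str:
    """Compose a text prompt that pairs the target abnormality with its description."""
    return (
        f"Locate the {desc.name} in the image. "
        f"Appearance: {desc.appearance}. "
        f"Typical location: {desc.typical_location}."
    )


# Illustrative placeholder text, not taken from the AG-KD data.
example = KnowledgeDescription(
    name="pleural effusion",
    appearance="homogeneous opacity that blunts the costophrenic angle",
    typical_location="lower lung zones along the pleural space",
)

prompt = build_grounding_prompt(example)
print(prompt)
```

In a grounding pipeline, a prompt like this would be passed to the model together with the image, and the model would be expected to return a bounding box for the named abnormality. Keeping the description in named fields, rather than free text, makes it easy to vary or ablate individual attributes.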