Shared Task Specification: “Small Models, Big Impact”: Building Compact Sinhala & Tamil LLMs (≤ 8 B Parameters)

  1. Task Overview & Objectives
     - Goal: Foster development of compact, high-quality LLMs for Sinhala and Tamil by continual pre-training or fine-tuning open-source models with ≤ 8 billion parameters.
     - Impact: Empower local NLP research and applications (chatbots, translation, sentiment analysis, educational tools) while lowering computational and storage barriers.
     - Who Should Participate:
       - Students & Academic Teams: Showcase research on model adaptation, data augmentation, and multilingual/multitask training.
       - Industry & Startups: Demonstrate practical performance in real-world pipelines; optimise inference speed and resource usage.

  2. Allowed Base Models
     - Participants must choose one of the following (or any other fully open-source LLM with ≤ 8 B parameters).
     - Note: Proprietary or closed-license models (e.g., GPT-3 series, Claude) are not allowed.

  3. Data Resources and Evaluation
     - Training Data (public):
       - Sinhala: OSCAR-Sinhala, Wikipedia dumps, Common Crawl subsets.
       - Tamil: OSCAR-Tamil, Tamil Wikipedia, CC100-Tamil.
     - Evaluation: Your LLM will be evaluated using intrinsic and extrinsic measures as follows:
       - Intrinsic evaluation using the perplexity score.
       - Extrinsic evaluation using the appropriate MMLU metric. You can use the given MMLU dataset and compare results in zero-shot, few-shot, and fine-tuned settings.
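As a rough illustration of the two evaluation measures, perplexity is the exponential of the average negative log-likelihood per token, and MMLU-style benchmarks are commonly scored by multiple-choice accuracy. The sketch below assumes you already have per-token natural-log probabilities and predicted answer letters from your model. The function names and sample values are hypothetical, and treating accuracy as "the appropriate MMLU metric" is an assumption; the organisers' evaluation scripts may differ in tokenisation or scoring details.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp of the mean negative log-likelihood over all tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


def multiple_choice_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of questions where the predicted option matches the gold option."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical values, for illustration only.
example_logprobs = [math.log(0.25)] * 4   # every token assigned probability 0.25
example_ppl = perplexity(example_logprobs)

example_acc = multiple_choice_accuracy(["A", "C", "B", "D"], ["A", "C", "D", "D"])
```

Running the same two functions on zero-shot, few-shot, and fine-tuned outputs gives directly comparable numbers across the three settings.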

  4. Submission Requirements
     - Model: HuggingFace-format upload.
     - Scripts and Notebooks: Should be uploaded to a GitHub or HuggingFace repository.
     - Technical Report (2-5 pages):
       - Training details: data sources, training mechanism, epochs, batch size, learning rates.
       - Resource usage: GPU time, list of hardware resources.
       - Model evaluation.
       - Analysis of strengths/limitations.

  5. Rules & Fairness
     - Parameter Limit: Strict upper bound of 8 B parameters (model + adapter weights combined).
     - Data Usage: Only public/open-license data; no private data and no data scraped from behind a login.
     - Reproducibility: All code, data-prep scripts, and logs must be publicly accessible by the submission deadline.
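Because the parameter limit counts base-model and adapter weights together, it can help to check the combined total before submitting. The sketch below is one way to do that, assuming a LoRA-style adapter in which each adapted weight matrix of shape `d_out x d_in` gains two low-rank factors with `rank * (d_in + d_out)` parameters in total. All names, shapes, and counts here are hypothetical and do not describe any particular allowed model.

```python
import math
from typing import Dict, Tuple

PARAM_LIMIT = 8_000_000_000  # 8 B upper bound stated in the rules


def count_params(shapes: Dict[str, Tuple[int, ...]]) -> int:
    """Sum the element counts of all tensors, given their shapes."""
    return sum(math.prod(shape) for shape in shapes.values())


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def within_limit(base_params: int, adapter_params: int, limit: int = PARAM_LIMIT) -> bool:
    """True if base plus adapter weights stay at or under the limit."""
    return base_params + adapter_params <= limit


# Hypothetical example: a 7.9 B base model with rank-16 adapters on
# two 4096 x 4096 projections in each of 32 layers.
base = 7_900_000_000
adapter = 32 * 2 * lora_params(d_in=4096, d_out=4096, rank=16)
ok = within_limit(base, adapter)
```

With a loaded PyTorch model, the base count is typically obtained with `sum(p.numel() for p in model.parameters())` rather than from a dictionary of shapes.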

  6. How to Register & Contact
     - Registration Form: https://forms.gle/edzfpopVvKkkF6cH8
     - Contact: [email protected]
     - Phone: 076 981 1289
