---
license: mit
language: hi
tags:
- text-generation
- causal-lm
- custom-model
pipeline_tag: text-generation
---

# Hindi Causal Language Model (convaiinnovations/hindi-foundational-model-base)

This repository contains a custom-trained Hindi Causal Language Model designed for Hindi text generation.

## Model Description

- **Architecture:** Custom Transformer (12 layers, hidden=768, 16 heads, ffn=3072, act=swiglu, norm=rmsnorm) based on the `HindiCausalLM` class, with Hindi-specific optimizations:
  - Multi-resolution attention to capture both character-level and word-level patterns
  - Morphology-aware feed-forward layers
  - Script-mix processing for Hindi-English code-mixing
- **Language:** Hindi (hi)
- **Training Data:** 2.7 million high-quality Hindi text samples from:
  - IITB Parallel Corpus (1.2M sentences)
  - Samanantar (750K samples)
  - OSCAR Hindi (450K sentences)
  - CC-100 Hindi (300K sentences)
  - Hindi Wikipedia (150K articles)
  - Hindi news articles (100K pieces)
  - XNLI Hindi (50K premise-hypothesis pairs)
  - IndicGLUE (30K samples)
  - Hindi literature (5K passages)
- **Tokenizer:** SentencePiece, trained on Hindi text with a vocabulary size of 16,000
- **Training Details:** 2 epochs, hidden_size=768, num_layers=12, block_size=512, batch_size=64, learning_rate=5e-5, SwiGLU activation, RoPE positional encoding, and RMS normalization
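
The `tokenizer.model` file shipped with this repository is a standard SentencePiece model, so it can be inspected on its own with the `sentencepiece` library, independently of the custom `SentencePieceTokenizerWrapper` used in the scripts below. A minimal sketch (it assumes `tokenizer.model` has already been downloaded to the working directory, as in the download snippet in the next section):

```python
import sentencepiece as spm

# Load the SentencePiece model shipped with this repository
sp = spm.SentencePieceProcessor(model_file="tokenizer.model")

print("vocab size:", sp.get_piece_size())  # expected to report 16,000

# Round-trip a short Hindi sentence through the tokenizer
ids = sp.encode("भारत की संस्कृति", out_type=int)  # "The culture of India"
print("token ids:", ids)
print("decoded  :", sp.decode(ids))
```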

## How to Use

**⚠️ Important:** This model uses custom Python classes (`HindiCausalLM`, `HindiCausalLMConfig`, `SentencePieceTokenizerWrapper`) which are **not** part of the standard Hugging Face `transformers` library. The custom Python files are included in this repository.

### Download Required Files

```python
import os
from huggingface_hub import hf_hub_download

# Configuration
repo_id = "convaiinnovations/hindi-foundational-model-base"
model_dir = "."  # Use current directory for downloaded files

# Download model files
print(f"Downloading files for {repo_id}...")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json", local_dir=model_dir)
tokenizer_path = hf_hub_download(repo_id=repo_id, filename="tokenizer.model", local_dir=model_dir)

# Download custom module files (these are crucial!)
hindi_model_path = hf_hub_download(repo_id=repo_id, filename="hindi_language_model.py", local_dir=model_dir)
hindi_embeddings_path = hf_hub_download(repo_id=repo_id, filename="hindi_embeddings.py", local_dir=model_dir)

# Try safetensors first, then fall back to the .bin weights
try:
    weights_path = hf_hub_download(repo_id=repo_id, filename="model.safetensors", local_dir=model_dir)
    using_safetensors = True
except Exception:
    weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin", local_dir=model_dir)
    using_safetensors = False

print("All necessary files downloaded.")
```
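
If you would rather mirror the whole repository than list files one by one, `huggingface_hub.snapshot_download` fetches everything in a single call; a minimal sketch, equivalent in effect to the per-file downloads above:

```python
from huggingface_hub import snapshot_download

# Fetch every file in the repo (config, tokenizer, custom .py modules, weights)
local_path = snapshot_download(
    repo_id="convaiinnovations/hindi-foundational-model-base",
    local_dir=".",
)
print(f"Repository files available under: {local_path}")
```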

### Debug and Inference Script

```python
import os
import json
import torch
import argparse  # Keep argparse for potential future use
import numpy as np
import time
import traceback  # For detailed exception info

# Try importing safetensors
try:
    import safetensors.torch
    SAFE_TENSORS_AVAILABLE = True
except ImportError:
    SAFE_TENSORS_AVAILABLE = False

print("[INFO] --- Debug Inference Script Started ---")
if SAFE_TENSORS_AVAILABLE: print("[INFO] safetensors library found.")
else: print("[WARNING] safetensors library not found.")

# --- Attempt to import custom modules ---
print("[DEBUG] Attempting to import custom modules...")
try:
    from hindi_language_model import HindiCausalLM, HindiCausalLMConfig
    from hindi_embeddings import SentencePieceTokenizerWrapper
    print("[INFO] Successfully imported custom modules.")
except ImportError as e:
    print(f"[ERROR] Failed to import custom modules: {e}"); traceback.print_exc()
# --- End Custom Module Import ---


# --- Main Generation Function Definition ---
def run_generation(
    model_path: str,
    prompt: str,
    max_len: int,
    temp: float,
    top_k: int,
    seed: int,
    device_str: str
):
    """Loads model and generates text, printing debug info."""
    print("\n[INFO] --- Starting Generation ---")
    print(f"[DEBUG] Args: path='{model_path}', max_len={max_len}, temp={temp}, top_k={top_k}, seed={seed}, device='{device_str}'")

    # --- Setup ---
    t_start_setup = time.time()
    try:
        torch.manual_seed(seed); np.random.seed(seed); device = torch.device(device_str)
        if device.type == 'cuda': torch.cuda.manual_seed_all(seed)
        print(f"[INFO] Using device: {device}")
        print(f"[DEBUG] Setup took {time.time()-t_start_setup:.4f}s")
    except Exception as e: print(f"[ERROR] Device/Seed setup failed: {e}"); traceback.print_exc(); return None

    # --- Load Tokenizer ---
    print("\n[INFO] --- Loading Tokenizer ---")
    t_start_load = time.time(); tokenizer = None
    try:
        tokenizer_model_file = os.path.join(model_path, "tokenizer.model")
        print(f"[DEBUG] Looking for tokenizer at: {tokenizer_model_file}")
        assert os.path.exists(tokenizer_model_file), "tokenizer.model not found!"
        tokenizer = SentencePieceTokenizerWrapper(tokenizer_model_file)  # Use imported class
        print(f"[INFO] Tokenizer loaded. Vocab: {getattr(tokenizer, 'vocab_size', 'N/A')}")
        # Get BOS/EOS (handle if missing)
        bos_id = getattr(tokenizer, 'bos_token_id', 1)  # Default 1
        eos_id = getattr(tokenizer, 'eos_token_id', 2)  # Default 2
        print(f"[INFO] BOS ID: {bos_id}, EOS ID: {eos_id}")
    except Exception as e: print(f"[ERROR] Tokenizer loading failed: {e}"); traceback.print_exc(); return None

    # --- Load Config ---
    print("\n[INFO] --- Loading Config ---")
    lm_config = None
    try:
        config_file = os.path.join(model_path, "config.json")
        print(f"[DEBUG] Looking for config at: {config_file}")
        assert os.path.exists(config_file), "config.json not found!"
        with open(config_file, 'r', encoding='utf-8') as f: config_dict = json.load(f)
        print("[DEBUG] Config JSON loaded.")
        # Check/fix vocab size
        tok_vocab = getattr(tokenizer, 'vocab_size', None)
        if tok_vocab and 'vocab_size' in config_dict and config_dict['vocab_size'] != tok_vocab:
            print(f"[WARN] Config/Tokenizer vocab mismatch. Using tokenizer size: {tok_vocab}"); config_dict['vocab_size'] = tok_vocab
        # Instantiate config
        if hasattr(HindiCausalLMConfig, 'from_dict'): lm_config = HindiCausalLMConfig.from_dict(config_dict)
        else: lm_config = HindiCausalLMConfig(**config_dict)
        print("[INFO] Model config loaded.")
    except Exception as e: print(f"[ERROR] Config loading failed: {e}"); traceback.print_exc(); return None

    # --- Load Model ---
    print("\n[INFO] --- Loading Model ---")
    model = None
    try:
        print(f"[DEBUG] Instantiating {HindiCausalLM.__name__}...")
        model = HindiCausalLM(lm_config); print("[INFO] Model structure created.")
        s_path = os.path.join(model_path, "model.safetensors"); b_path = os.path.join(model_path, "pytorch_model.bin")
        print(f"[DEBUG] Checking weights: {s_path} (exists: {os.path.exists(s_path)}), {b_path} (exists: {os.path.exists(b_path)})")
        if SAFE_TENSORS_AVAILABLE and os.path.exists(s_path): weights_file = s_path
        elif os.path.exists(b_path): weights_file = b_path
        else: raise FileNotFoundError("Model weights (.safetensors or .bin) not found!")
        print(f"[INFO] Loading weights from: {weights_file}")
        if weights_file.endswith(".safetensors"): state_dict = safetensors.torch.load_file(weights_file, device="cpu")
        else: state_dict = torch.load(weights_file, map_location="cpu")
        print(f"[DEBUG] State dict loaded to CPU. Keys: {len(state_dict)}")
        try: load_res = model.load_state_dict(state_dict, strict=True)
        except RuntimeError as e_load:
            print(f"[WARN] Strict load failed: {e_load}. Trying non-strict."); load_res = model.load_state_dict(state_dict, strict=False)
        missing = getattr(load_res, "missing_keys", []); unexpected = getattr(load_res, "unexpected_keys", [])
        print(f"[INFO] State dict loaded. Missing: {len(missing)}. Unexpected: {len(unexpected)}")
        if missing: print(f"[WARN] Missing keys: {missing[:5]}...")
        if unexpected: print(f"[WARN] Unexpected keys: {unexpected[:5]}...")
        del state_dict; model.to(device); model.eval()
        print("[INFO] Model loaded to device and set to eval mode.")
        print(f"[DEBUG] Tokenizer+Config+Model loading took {time.time()-t_start_load:.2f}s")
    except Exception as e: print(f"[ERROR] Model loading failed: {e}"); traceback.print_exc(); return None

    # --- Generation ---
    print("\n[INFO] --- Starting Text Generation ---")
    t_start_gen = time.time()
    print(f"[INFO] Prompt: \"{prompt}\"")
    try:
        print("[DEBUG] Encoding prompt...")
        # Use __call__ or sp_model.EncodeAsIds, whichever the wrapper provides
        if hasattr(tokenizer, '__call__'):
            print("[DEBUG] Trying tokenizer(prompt)...")
            encoded_result = tokenizer(prompt, return_tensors=None)
            if isinstance(encoded_result, dict) and 'input_ids' in encoded_result:
                input_ids = encoded_result['input_ids']
            else:
                print(f"[DEBUG] __call__ result type {type(encoded_result)} unexpected. Trying sp_model.EncodeAsIds...")
                if hasattr(tokenizer, 'sp_model') and hasattr(tokenizer.sp_model, 'EncodeAsIds'): input_ids = tokenizer.sp_model.EncodeAsIds(prompt)
                else: raise AttributeError("Cannot find suitable encoding method (__call__ or sp_model.EncodeAsIds)")
        elif hasattr(tokenizer, 'sp_model') and hasattr(tokenizer.sp_model, 'EncodeAsIds'):
            print("[DEBUG] Trying tokenizer.sp_model.EncodeAsIds...")
            input_ids = tokenizer.sp_model.EncodeAsIds(prompt)
        else: raise AttributeError("Cannot find suitable encoding method")
        print(f"[DEBUG] Prompt token IDs: {input_ids}")

        if bos_id is not None: print(f"[DEBUG] Prepending BOS {bos_id}"); input_ids = [bos_id] + input_ids
        input_tensor = torch.tensor([input_ids], dtype=torch.long, device=device); print(f"[DEBUG] Initial input tensor shape: {input_tensor.shape}")
        generated_ids = input_tensor

        print("[DEBUG] Starting generation loop...")
        with torch.no_grad():
            for i in range(max_len - len(input_ids)):
                step = i + 1; print(f"\n[DEBUG] --- Step {step}/{max_len - len(input_ids)} | Current len: {generated_ids.shape[1]} ---")
                t_fwd = time.time()

                # --- Forward call and logit extraction ---
                outputs = model(input_ids=generated_ids)  # model call
                if isinstance(outputs, dict) and 'logits' in outputs:
                    logits = outputs['logits']  # Access via key if output is a dict
                    print(f"[DEBUG] Fwd pass {time.time()-t_fwd:.4f}s. Accessed dict['logits'].")
                elif hasattr(outputs, 'logits'):
                    logits = outputs.logits  # Access via attribute if output is an object
                    print(f"[DEBUG] Fwd pass {time.time()-t_fwd:.4f}s. Accessed outputs.logits.")
                else:
                    print(f"[ERROR] Model output type is {type(outputs)} and does not contain 'logits'.")
                    raise TypeError("Model output format error.")

                next_token_logits = logits[:, -1, :]; print(f"[DEBUG] Next logits shape: {next_token_logits.shape}")

                # --- Sampling ---
                if temp > 0: scaled_logits = next_token_logits / temp
                else: scaled_logits = next_token_logits  # Greedy
                if top_k > 0: kth_vals, _ = torch.topk(scaled_logits, k=top_k, dim=-1); scaled_logits[scaled_logits < kth_vals[:, -1].unsqueeze(-1)] = -float("Inf")
                probs = torch.softmax(scaled_logits, dim=-1); next_token_id = torch.multinomial(probs, num_samples=1); print(f"[DEBUG] Sampled ID: {next_token_id.item()}")
                generated_ids = torch.cat([generated_ids, next_token_id], dim=1)
                if next_token_id.item() == eos_id: print(f"[INFO] EOS token {eos_id} generated."); break
            else: print(f"[INFO] Reached max length {max_len}.")

        # --- Decode ---
        print("\n[DEBUG] --- Post-processing ---")
        output_ids = generated_ids[0].cpu().tolist(); print(f"[DEBUG] Raw output IDs: {output_ids}")
        processed_ids = output_ids
        if bos_id and processed_ids and processed_ids[0] == bos_id: print("[DEBUG] Removing BOS"); processed_ids = processed_ids[1:]
        if eos_id and processed_ids and processed_ids[-1] == eos_id: print("[DEBUG] Removing EOS"); processed_ids = processed_ids[:-1]
        print(f"[DEBUG] Processed IDs: {processed_ids}")
        print("[INFO] Decoding...")
        # Use sp_model.DecodeIds or decode, whichever the wrapper provides
        if hasattr(tokenizer, 'sp_model') and hasattr(tokenizer.sp_model, 'DecodeIds'):
            print("[DEBUG] Decoding using tokenizer.sp_model.DecodeIds..."); generated_text = tokenizer.sp_model.DecodeIds(processed_ids)
        elif hasattr(tokenizer, 'decode'):
            print("[DEBUG] Decoding using tokenizer.decode..."); generated_text = tokenizer.decode(processed_ids)
        else: raise AttributeError("Cannot find suitable decoding method")
        print(f"[DEBUG] Decoded text: '{generated_text}'")
        print(f"[INFO] Generation successful ({time.time() - t_start_gen:.2f}s).")
        return generated_text

    except Exception as e: print(f"[ERROR] Generation loop error: {e}"); traceback.print_exc(); return None
# --- End Generation Function Definition ---


# --- Main Execution Block ---
if __name__ == "__main__":
    # --- Parameters ---
    model_dir = "."  # Use current directory if files are downloaded here
    prompt = "गंगा नदी"  # "The Ganges river"
    max_len = 80
    temp = 2
    top_k = 45
    seed = 42
    device = "cuda" if torch.cuda.is_available() else "cpu"

    print("\n[INFO] --- Simple Hindi Text Generation Script ---")
    print(f"[INFO] Model Dir: {model_dir}")
    print(f"[INFO] Prompt: \"{prompt}\"")
    print(f"[INFO] Max Length: {max_len}")
    print(f"[INFO] Temperature: {temp}")
    print(f"[INFO] Top-K: {top_k}")
    print(f"[INFO] Seed: {seed}")
    print(f"[INFO] Device: {device}")
    print("-" * 30)

    # --- Validate Path ---
    if not os.path.isdir(model_dir): print(f"[ERROR] Model directory not found: {model_dir}"); exit(1)

    # --- Run Generation ---
    generated_output = run_generation(
        model_path=model_dir, prompt=prompt, max_len=max_len,
        temp=temp, top_k=top_k, seed=seed, device_str=device
    )

    # --- Print Result ---
    print("\n" + "=" * 20 + " Final Generation Result " + "=" * 20)
    if generated_output is not None:
        print(f"Prompt: {prompt}")
        print("-" * (40 + len(" Final Generation Result ")))
        print("Generated Text:")
        print(generated_output)
    else:
        print("\n[FAILURE] Text generation failed. Check print statements above.")
    print("=" * (40 + len(" Final Generation Result ")))
```
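
The sampling step inside the loop above (temperature scaling, then top-k filtering, then a multinomial draw) can be exercised in isolation on a dummy logits tensor; a minimal self-contained sketch of the same arithmetic, with toy values chosen only for illustration:

```python
import torch

torch.manual_seed(0)

# Dummy next-token logits: batch of 1, toy vocabulary of 10
next_token_logits = torch.randn(1, 10)
temp, top_k = 0.7, 5

# Temperature scaling (temp > 0); temp == 0 would mean greedy argmax instead
scaled_logits = next_token_logits / temp

# Top-k filtering: everything below the k-th largest logit is masked out
kth_vals, _ = torch.topk(scaled_logits, k=top_k, dim=-1)
scaled_logits[scaled_logits < kth_vals[:, -1].unsqueeze(-1)] = -float("Inf")

# Softmax over the surviving logits, then draw one token id
probs = torch.softmax(scaled_logits, dim=-1)
next_token_id = torch.multinomial(probs, num_samples=1)
print("sampled id:", next_token_id.item())
```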

## Example Outputs

### Basic Example

```python
prompt = "हिंदी भाषा"  # "The Hindi language"
# Output: "हिंदी भाषा भारत की सबसे महत्वपूर्ण भाषाओं में से एक है। यह भारत के उत्तर भारत के राज्यों में मुख्य भाषा के रूप में बोली जाती है..."
# (Rough English gloss: "Hindi is one of the most important languages of India. It is spoken as the main language in the states of northern India...")
```

### Creative Writing Example

```python
prompt = "एक बार की बात है"  # "Once upon a time"
# Output: "एक बार की बात है, जब मैं छोटा था, तब मेरे दादाजी मुझे एक कहानी सुनाया करते थे। वह कहानी एक ऐसे राजा की थी जो अपने राज्य में..."
# (Rough English gloss: "Once upon a time, when I was little, my grandfather used to tell me a story. It was the story of a king who, in his kingdom...")
```

## Limitations and Biases

- The model may reflect biases present in its training data, including potential cultural, gender, or regional biases found in source materials.
- Performance is limited by its architecture size (12 layers, hidden=768) and training dataset size.
- May generate repetitive, nonsensical, or factually incorrect text.
- Uses weighted pooling with sensitivity to Hindi's SOV structure, but may struggle with complex semantic relationships in longer texts.
- May have particular difficulties with:
  - Cultural concepts lacking direct English translations
  - Idiomatic expressions specific to Hindi
  - Formal/informal speech distinctions
  - Hindi-specific morphological complexities

## License

This model is licensed under the MIT License.

Please use this model responsibly.