Model Card for Anoshor/prism-tuned
This model is a fine-tuned version of the Llama 3.2 1B language model, specialized for question answering over PDF documents. Fine-tuning used Low-Rank Adaptation (LoRA) to adapt the base model to this domain efficiently, without updating all of the original model parameters. The LoRA adapters have since been merged into the base model weights, so the model operates as a standalone unit and requires no external adapter configuration at inference time.
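For reference, this kind of adapter merge is typically performed with the `peft` library. The following is a minimal sketch under that assumption; the adapter path and output directory are placeholders, not the actual training artifacts:

```python
# Minimal sketch of merging LoRA adapters into the base model (assumes peft and
# transformers are installed; the adapter path below is a placeholder).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

# Load the trained LoRA adapters on top of the base model.
model = PeftModel.from_pretrained(base, "path/to/lora-adapters")

# Fold the adapter weights into the base weights so no adapter config is needed later.
merged = model.merge_and_unload()
merged.save_pretrained("prism-tuned-merged")
tokenizer.save_pretrained("prism-tuned-merged")
```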
Model Description
Purpose and Motivation
The primary motivation behind this model is to facilitate efficient and accurate extraction of information from PDF documents through natural language queries. PDFs often contain valuable information but can be challenging to query directly due to their format and structure. This model addresses that challenge by allowing users to ask questions in plain English and receive answers based on the content of the provided PDFs.
Model Architecture
- Base Model: Llama 3.2 1B
- Fine-Tuning Method: Low-Rank Adaptation (LoRA)
- Merged Model: The LoRA adapters have been merged into the base model weights, resulting in a single, consolidated model for ease of deployment.
- Developed by: Anoshor B. Paul
- Model type: Chat Model
- Language(s) (NLP): English
- Finetuned from model: https://huggingface.co/meta-llama/Llama-3.2-1B
Uses
Intended Use
- Primary Use Cases:
  - Question answering based on the content of PDF documents.
  - Assisting in research by extracting key information from large documents.
  - Supporting educational purposes by answering questions from textbooks or lecture notes.
- User Interface Integration: Can be integrated into applications with a user-friendly interface, such as Gradio apps, to facilitate easy interaction (see the sketch after this list).
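As an illustration only, a minimal Gradio wrapper might look like the sketch below. It assumes the `generate_response(question, context)` helper defined later in this card and uses `pypdf` for text extraction; neither is necessarily what the released application uses:

```python
# Minimal sketch of a Gradio front end, assuming a generate_response(question, context)
# helper (defined later in this card) and pypdf for PDF text extraction.
import gradio as gr
from pypdf import PdfReader

def answer_from_pdf(pdf_path, question):
    # Extract plain text from every page of the uploaded PDF.
    reader = PdfReader(pdf_path)
    context = "\n".join(page.extract_text() or "" for page in reader.pages)
    return generate_response(question, context)

demo = gr.Interface(
    fn=answer_from_pdf,
    inputs=[gr.File(label="PDF document", type="filepath"), gr.Textbox(label="Question")],
    outputs=gr.Textbox(label="Answer"),
)
demo.launch()
```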
Direct Use
[More Information Needed]
Downstream Use
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Anoshor/prism-tuned"

tokenizer = AutoTokenizer.from_pretrained(model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

def generate_response(question, context):
    # Build the prompt; the question and the expected answer are separated by ':' delimiters.
    prompt = f"Context:\n{context}\n\n: {question}\n:"
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=input_ids,
            max_new_tokens=200,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )
    generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # Take the text after the final ':' delimiter as the answer.
    response = generated_text.split(":")[-1].strip()
    return response
Example
context = "Extracted text from your PDF document."
question = "What are the main conclusions of the document?"
answer = generate_response(question, context)
print(answer)
Training Data
https://huggingface.co/datasets/HuggingFaceH4/no_robots
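The referenced dataset can be loaded with the Hugging Face `datasets` library, for example:

```python
# Load the referenced dataset; any additional PDF-derived QA pairs described
# below are not part of this public dataset.
from datasets import load_dataset

ds = load_dataset("HuggingFaceH4/no_robots")
print(ds)              # splits and sizes
print(ds["train"][0])  # one example record
```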
[More Information Needed]
Training Procedure
Training Data
The model was fine-tuned on a custom dataset comprising question-answer pairs extracted from a diverse collection of PDF documents. The PDFs covered various domains such as technical manuals, research papers, legal documents, and educational materials. The dataset was created by processing these PDFs to extract text and then generating relevant questions and answers based on the content.
Data Collection: Text extraction from publicly available PDFs using OCR and text parsing techniques.

Data Processing:
- Text Splitting: Utilized a recursive character text splitter to divide the text into manageable chunks.
- Embedding Generation: Created embeddings using the sentence-transformers/all-MiniLM-L6-v2 model for similarity search.
- Question Generation: Curated meaningful questions that users might ask based on the content.
- Answer Generation: Provided accurate answers extracted directly from the text.

Fine-Tuning Process
- Parameter-Efficient Fine-Tuning: Employed LoRA to adjust a subset of model parameters, reducing computational resources and training time.
- Training Configuration:
  - Optimizer: AdamW with appropriate weight decay.
  - Learning Rate: Set to a value suitable for fine-tuning large language models without overfitting.
  - Batch Size: Adjusted based on hardware capabilities to maximize GPU utilization.
  - Epochs: Trained over several epochs to ensure convergence without overfitting.
- Hardware Used: Trained on a high-performance GPU (e.g., NVIDIA A100) to handle the computational demands of fine-tuning a 1B-parameter model.

Model Capabilities
- Contextual Understanding: Capable of understanding and processing complex queries related to the content of PDF documents.
- Information Retrieval: Efficiently retrieves relevant information from PDFs by integrating a Retrieval-Augmented Generation (RAG) pipeline (a minimal retrieval sketch follows below).
- Natural Language Interaction: Accepts queries in natural language and provides coherent, contextually appropriate responses.
- Prompt Format: Uses delimiter tags in the prompt to separate user queries from model responses (see the inference code above).
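As a rough sketch of the text-splitting and embedding steps described above; the chunk sizes, and the use of LangChain's splitter together with `sentence-transformers` directly, are assumptions rather than the exact pipeline used:

```python
# Illustrative sketch of chunking PDF text and retrieving relevant chunks by
# embedding similarity. Chunk sizes and libraries are assumptions, not the exact setup.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer, util

pdf_text = "Extracted text from your PDF document."  # placeholder input

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(pdf_text)

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
chunk_embeddings = embedder.encode(chunks, convert_to_tensor=True)

# At query time, retrieve the most similar chunks to use as context for generation.
query_embedding = embedder.encode("What are the main conclusions?", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, chunk_embeddings, top_k=3)[0]
context = "\n\n".join(chunks[hit["corpus_id"]] for hit in hits)
```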