Model Card for Anoshor/prism-tuned

This model is a fine-tuned version of the Llama 3.2 1B language model, specialized for question answering over PDF documents. The fine-tuning process used Low-Rank Adaptation (LoRA) to adapt the base model to the domain efficiently, without updating all of the original model parameters. Because the LoRA adapters were subsequently merged into the base model weights, the model operates as a standalone unit and requires no external adapter configuration at inference time.
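For reference, a merge of this kind is typically produced with the peft library. The sketch below is illustrative only: the adapter path and output directory are hypothetical placeholders, not artifacts of this repository.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, attach the trained LoRA adapters, and fold them
# into the base weights so no adapter config is needed at inference.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B", torch_dtype=torch.float32)
merged = PeftModel.from_pretrained(base, "path/to/lora-adapters")  # hypothetical adapter path
merged = merged.merge_and_unload()
merged.save_pretrained("prism-tuned-merged")  # hypothetical output directory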

Model Description

Purpose and Motivation

The primary motivation behind this model is to facilitate efficient and accurate extraction of information from PDF documents through natural language queries. PDFs often contain valuable information but can be challenging to query directly because of their format and structure. This model addresses that challenge by allowing users to ask questions in plain English and receive answers grounded in the content of the provided PDFs.

Model Architecture

Base Model: Llama 3.2 1B
Fine-Tuning Method: Low-Rank Adaptation (LoRA)
Merged Model: The LoRA adapters have been merged into the base model weights, resulting in a single, consolidated model for ease of deployment.

Uses

Intended Use

Primary Use Cases:
- Question answering based on the content of PDF documents.
- Assisting in research by extracting key information from large documents.
- Supporting educational purposes by answering questions from textbooks or lecture notes.

User Interface Integration: Can be integrated into applications with a user-friendly interface, such as Gradio apps, to facilitate easy interaction (a minimal sketch follows below).
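As an illustration, a minimal Gradio wrapper might look like the following sketch. It assumes the generate_response helper defined in the quick-start section further down; the labels and title are illustrative, not part of this repository.

import gradio as gr

# Minimal Gradio UI sketch; generate_response is the helper defined in the
# "How to Get Started" section below. Labels and title are illustrative.
demo = gr.Interface(
    fn=generate_response,
    inputs=[gr.Textbox(label="Question"), gr.Textbox(label="PDF text", lines=10)],
    outputs=gr.Textbox(label="Answer"),
    title="PDF Question Answering",
)
demo.launch()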

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Anoshor/prism-tuned"

tokenizer = AutoTokenizer.from_pretrained(model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

def generate_response(question, context):
    # "Question:"/"Answer:" are assumed stand-ins for the model's prompt tags,
    # which did not survive rendering of this card.
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=input_ids,
            max_new_tokens=200,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )
    generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # Keep only the text generated after the final prompt tag.
    response = generated_text.split("Answer:")[-1].strip()
    return response

Example

context = "Extracted text from your PDF document."
question = "What are the main conclusions of the document?"
answer = generate_response(question, context)
print(answer)
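In practice, the context string would come from a real document. A minimal extraction sketch using the pypdf library (the file name is a placeholder):

from pypdf import PdfReader

# Extract raw text from a local PDF to use as the context string.
reader = PdfReader("document.pdf")  # placeholder file name
pdf_text = "\n".join(page.extract_text() or "" for page in reader.pages)
answer = generate_response("What are the main conclusions of the document?", pdf_text)
print(answer)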

Training Data

https://huggingface.co/datasets/HuggingFaceH4/no_robots
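The linked dataset can be loaded directly with the Hugging Face datasets library; a minimal sketch:

from datasets import load_dataset

ds = load_dataset("HuggingFaceH4/no_robots")
print(ds)  # inspect the available splits and sizes before indexing a specific split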


Training Procedure

Training Data

The model was fine-tuned on a custom dataset comprising question-answer pairs extracted from a diverse collection of PDF documents. The PDFs covered various domains such as technical manuals, research papers, legal documents, and educational materials. The dataset was created by processing these PDFs to extract text and then generating relevant questions and answers based on the content.

Data Collection: Text extraction from publicly available PDFs using OCR and text parsing techniques.

Data Processing:
- Text Splitting: Utilized a recursive character text splitter to divide the text into manageable chunks.
- Embedding Generation: Created embeddings using the sentence-transformers/all-MiniLM-L6-v2 model for similarity search.
- Question Generation: Curated meaningful questions that users might ask based on the content.
- Answer Generation: Provided accurate answers extracted directly from the text.

Fine-Tuning Process

Parameter-Efficient Fine-Tuning: Employed LoRA to adjust a subset of model parameters, reducing computational resources and training time.

Training Configuration:
- Optimizer: AdamW with appropriate weight decay.
- Learning Rate: Set to a value suitable for fine-tuning large language models without overfitting.
- Batch Size: Adjusted based on hardware capabilities to maximize GPU utilization.
- Epochs: Trained over several epochs to ensure convergence without overfitting.
- Hardware Used: Trained on a high-performance GPU (e.g., NVIDIA A100) to handle the computational demands of fine-tuning a 1B-parameter model.

Model Capabilities
- Contextual Understanding: Understands and processes complex queries about the content of PDF documents.
- Information Retrieval: Efficiently retrieves relevant information from PDFs via a Retrieval-Augmented Generation (RAG) pipeline (a retrieval sketch follows below).
- Natural Language Interaction: Accepts queries in natural language and provides coherent, contextually appropriate responses.
- Prompt Format: Uses question and answer tags to delineate user queries and model responses.
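A rough sketch of the chunk-embed-retrieve flow described above, using LangChain's recursive splitter and the sentence-transformers model named in the card. The chunk size, overlap, and top-k value are illustrative assumptions.

from langchain_text_splitters import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer, util

# Split extracted PDF text into overlapping chunks (sizes are assumptions).
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(pdf_text)  # pdf_text as extracted earlier

# Embed the chunks and the question with the MiniLM model named above.
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
chunk_embeddings = embedder.encode(chunks, convert_to_tensor=True)
question = "What are the main conclusions of the document?"
question_embedding = embedder.encode(question, convert_to_tensor=True)

# Rank chunks by cosine similarity and pass the best ones as context.
scores = util.cos_sim(question_embedding, chunk_embeddings)[0]
top_indices = scores.topk(min(3, len(chunks))).indices
context = "\n\n".join(chunks[i] for i in top_indices)
answer = generate_response(question, context)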
