Model Card for Llama-3.2-3B-Instruct-Bitcoin-Analyst
This repository contains a specialized version of meta-llama/Llama-3.2-3B-Instruct, fine-tuned to act as a Bitcoin and cryptocurrency market analyst. The model is the result of a multi-stage "continuation training" process, in which an already specialized model was further refined on a targeted dataset to improve its domain knowledge and instruction-following.
Model Details
Model Description
This model is a Causal Language Model (CLM) based on the Llama 3.2 3B Instruct architecture. It was developed through a sequential fine-tuning process designed to build upon existing domain knowledge and further improve its performance on financial and technical topics related to cryptocurrency.
The training procedure involved three key stages:
- Initial Specialization (Adapter Merge): The process began by merging a pre-existing LoRA adapter, tahamajs/llama-3.2-3b-instruct-bitcoin-analyst_best, into the base meta-llama/Llama-3.2-3B-Instruct model. This provided a foundation of specialized knowledge.
- Continuation Fine-Tuning (New LoRA): A new LoRA adapter was then trained on top of the merged model, using the tahamajs/bitcoin-llm-finetuning-dataset to deepen the model's expertise.
- Final Product: The final artifact is the result of this second training stage. This model card assumes the final, fully merged model is shared, containing the cumulative knowledge from all stages.
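The stages above could be reproduced roughly as follows. This is a minimal sketch, not the author's training script: the local directory `merged_stage1` and both function names are hypothetical, the 4-bit settings are assumed to be typical QLoRA defaults, and the LoRA values are copied from the hyperparameter table in this card. Heavy imports are kept inside the functions so the module can be imported on a machine without a GPU.

```python
BASE_ID = "meta-llama/Llama-3.2-3B-Instruct"
STAGE1_ADAPTER_ID = "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst_best"
MERGED_DIR = "merged_stage1"  # hypothetical local path, not from the card

# LoRA settings for the second stage, taken from this card's table.
STAGE2_LORA = {
    "r": 32,
    "lora_alpha": 64,
    "lora_dropout": 0.1,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}


def merge_stage1_adapter(out_dir: str = MERGED_DIR) -> None:
    """Stage 1: fold the existing LoRA adapter into the base weights."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.bfloat16)
    merged = PeftModel.from_pretrained(base, STAGE1_ADAPTER_ID).merge_and_unload()
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(BASE_ID).save_pretrained(out_dir)


def build_stage2_model(merged_dir: str = MERGED_DIR):
    """Stage 2: load the merged model in 4-bit and attach a fresh LoRA adapter."""
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Assumed QLoRA-style quantization; the card does not list these values.
    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        merged_dir, quantization_config=quant, device_map="auto"
    )
    model = prepare_model_for_kbit_training(model)
    config = LoraConfig(task_type="CAUSAL_LM", **STAGE2_LORA)
    return get_peft_model(model, config)
```

Calling `merge_stage1_adapter()` once and then `build_stage2_model()` would yield a PEFT model ready to hand to a trainer; after training, a second `merge_and_unload()` would produce the fully merged artifact this card describes.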
- Developed by: tahamajs
- Model type: Causal Language Model (Instruction-Tuned)
- Language(s) (NLP): English
- License: Llama 3.2 Community License Agreement
- Finetuned from model: meta-llama/Llama-3.2-3B-Instruct
Model Sources
- Repository: tahamajs/llama-3.2-3b-instruct-bitcoin-analyst
Uses
Direct Use
This model is intended for direct use as an instruction-following chatbot for topics related to Bitcoin and cryptocurrency. It can be used for question answering, analysis, and explanation of complex financial and technical concepts. For best results, prompts should be formatted using the Llama 3 chat template.
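To make the chat format concrete, here is a simplified sketch of the header and end-of-turn layout used by Llama 3 models. The function name and the system prompt are hypothetical, and the tokenizer's bundled template remains authoritative (it may add further system-block content), so in practice `tokenizer.apply_chat_template` should be preferred over hand-built strings.

```python
def format_llama3_prompt(messages):
    """Render chat messages in the Llama 3 header/eot layout (simplified)."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model continues with its answer.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = format_llama3_prompt([
    # Hypothetical system prompt, not taken from the training data.
    {"role": "system", "content": "You are a Bitcoin market analyst."},
    {"role": "user", "content": "What is a halving?"},
])
print(prompt)
```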
Downstream Use
This model can serve as a strong base for further fine-tuning on more specific financial tasks, such as sentiment analysis of crypto news, generating market summaries, or building a domain-specific RAG system.
Out-of-Scope Use
This model is not a financial advisor and should not be used for making real-world investment decisions. Its knowledge is limited to its training data and may not be fully up-to-date. It is not designed for general-purpose conversation outside of its specialized domain and may perform poorly on such tasks.
Bias, Risks, and Limitations
This model inherits the limitations of the base Llama 3.2 model and the biases present in its training data (which includes cryptocurrency-related discourse). In the financial domain, there is a significant risk of generating overly confident, optimistic, or pessimistic statements that could be misinterpreted as financial advice. The model may "hallucinate" facts or data points.
Recommendations
Users should critically evaluate all outputs from this model, especially when they pertain to financial metrics, historical data, or price predictions. We recommend clearly stating to any end-users that the text is generated by an AI and is not a substitute for professional financial advice.
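One simple way to follow this recommendation in an application is to attach a notice to every generated answer before it is shown. The sketch below is only an illustration; the function name and the exact wording of the notice are assumptions, not part of this model's release.

```python
DISCLAIMER = (
    "This text was generated by an AI model and is not a substitute "
    "for professional financial advice."
)


def with_disclaimer(model_output: str, disclaimer: str = DISCLAIMER) -> str:
    """Append the notice once, even if the function is applied repeatedly."""
    text = model_output.rstrip()
    if text.endswith(disclaimer):
        return text
    return f"{text}\n\n{disclaimer}"


print(with_disclaimer("The difficulty adjusts every 2016 blocks."))
```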
How to Get Started with the Model
Use the code below to load the fully merged model and generate text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the ID of this repository
model_id = "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Use the Llama 3 chat template for instruction-following
messages = [
    {"role": "user", "content": "What is the role of the 'difficulty adjustment' in Bitcoin's protocol and how does it maintain a consistent block time?"},
]

# Apply the chat template and tokenize
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a response
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens (everything after the prompt)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
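The generation call above samples with `temperature=0.7` and `top_p=0.9`. As a rough illustration of what those two settings do to the next-token distribution, the following pure-Python sketch applies temperature scaling and then nucleus (top-p) filtering to a toy list of logits. It is a simplified stand-in for the library's internal logic, and the function name and logits are made up for the example.

```python
import math


def top_p_distribution(logits, temperature=0.7, top_p=0.9):
    """Return {token_index: probability} after temperature and top-p filtering."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalize so the surviving tokens sum to 1 before sampling.
    return {i: probs[i] / mass for i in kept}


dist = top_p_distribution([2.0, 1.0, 0.5, -1.0])
print(dist)
```

With these toy logits only the two most likely tokens survive the filter, which is why lower `top_p` values tend to produce more conservative text.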
Training Details
Training Data
The second stage of fine-tuning was performed on the tahamajs/bitcoin-llm-finetuning-dataset. This dataset contains instruction-response pairs related to Bitcoin, market analysis, and blockchain technology.
Training Procedure
Preprocessing
The training data was formatted into the Llama 3 chat template using a format_chat function. A custom RobustCompletionCollator was used to mask the prompt and user-input tokens from the loss calculation, ensuring the model was only trained to predict the assistant's responses.
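The masking idea can be sketched as follows. This is not the RobustCompletionCollator itself, whose implementation is not shown in this card; it is a hypothetical helper that assumes the assistant token spans are already known and uses the label value -100, which PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss


def mask_prompt_labels(input_ids, assistant_spans):
    """Copy input_ids into labels, keeping only tokens inside assistant spans.

    assistant_spans is a list of (start, end) index pairs, end exclusive.
    """
    labels = [IGNORE_INDEX] * len(input_ids)
    for start, end in assistant_spans:
        labels[start:end] = input_ids[start:end]
    return labels


# Toy sequence: 5 prompt tokens followed by 3 assistant tokens.
input_ids = [11, 12, 13, 14, 15, 21, 22, 23]
labels = mask_prompt_labels(input_ids, [(5, 8)])
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 23]
```

Only the last three positions contribute to the loss, so gradient updates come from the assistant's answer rather than from reproducing the prompt.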
Training Hyperparameters
The continuation training was performed using the QLoRA method for memory efficiency.
- Training regime: bf16 mixed precision
Hyperparameter | Value |
---|---|
lora_r | 32 |
lora_alpha | 64 |
lora_dropout | 0.1 |
target_modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
learning_rate | 1e-4 |
lr_scheduler_type | cosine |
num_train_epochs | 1 |
optimizer | paged_adamw_32bit |
batch_size (per device) | 1 |
gradient_accumulation | 8 |
total_batch_size | 8 |
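A few derived quantities follow directly from these values. The sketch below computes the LoRA scaling factor, the effective batch size, and an estimate of the number of trainable adapter parameters. The layer dimensions are assumptions taken from the public Llama 3.2 3B configuration (hidden size 3072, key/value projection width 1024, MLP width 8192, 28 layers), and the single-device setup is inferred from total_batch_size = 8; neither is stated in this card.

```python
# Values from the hyperparameter table.
LORA_R, LORA_ALPHA = 32, 64
PER_DEVICE_BATCH, GRAD_ACCUM = 1, 8
NUM_DEVICES = 1  # assumption: implied by total_batch_size = 8

# Assumed Llama 3.2 3B dimensions (from the public model config).
HIDDEN, KV_DIM, INTERMEDIATE, NUM_LAYERS = 3072, 1024, 8192, 28

# (in_features, out_features) of each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(shapes, r, num_layers):
    """Each adapted matrix adds A (r x in) and B (out x r) parameters."""
    per_layer = sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
    return per_layer * num_layers


lora_scaling = LORA_ALPHA / LORA_R
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
trainable = lora_param_count(TARGET_SHAPES, LORA_R, NUM_LAYERS)

print(f"LoRA scaling (alpha / r): {lora_scaling}")
print(f"Effective batch size:     {effective_batch}")
print(f"Trainable LoRA params:    {trainable:,}")
```

Under these assumptions the adapter has roughly 48.6 million trainable parameters, a small fraction of the roughly 3 billion parameters in the frozen base model.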
Evaluation
Quantitative evaluation has not been performed on this model version.
Technical Specifications
Model Architecture and Objective
This is a decoder-only transformer based on the Llama 3.2 architecture. It was fine-tuned using a Causal Language Modeling objective, where the model learns to predict the next token in a sequence.
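The objective can be illustrated with a small pure-Python sketch: the logits at position t are scored against the token at position t + 1, and positions whose label is -100 are skipped. This is a toy version for intuition only, with a hypothetical function name; real training computes the same quantity on tensors inside the model's forward pass.

```python
import math

IGNORE_INDEX = -100


def next_token_loss(logits, labels):
    """Mean cross-entropy where position t predicts labels[t + 1]."""
    total, count = 0.0, 0
    # Shift by one: drop the last logits row and the first label.
    for step_logits, target in zip(logits[:-1], labels[1:]):
        if target == IGNORE_INDEX:
            continue
        peak = max(step_logits)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in step_logits))
        total += log_norm - step_logits[target]
        count += 1
    return total / count if count else 0.0


# Vocabulary of 3 tokens, sequence of 3 positions, uniform predictions.
logits = [[0.0, 0.0, 0.0]] * 3
labels = [IGNORE_INDEX, 2, 1]
loss = next_token_loss(logits, labels)
print(round(loss, 4))  # 1.0986, i.e. ln(3)
```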
Compute Infrastructure
Software
- PyTorch
- Transformers
- PEFT
- TRL
- BitsAndBytes for QLoRA
Model Card Authors
tahamajs