Model Card for Llama-3.2-3B-Instruct-Bitcoin-Analyst

This repository contains a specialized version of meta-llama/Llama-3.2-3B-Instruct, fine-tuned to function as a Bitcoin and cryptocurrency market analyst. The model is the result of a multi-stage "continuation training" process, in which an already specialized model was further refined on a targeted dataset to strengthen its domain knowledge and instruction-following capabilities.

Model Details

Model Description

This model is a Causal Language Model (CLM) based on the Llama 3.2 3B Instruct architecture. It was developed through a sequential fine-tuning process designed to build upon existing domain knowledge and further improve its performance on financial and technical topics related to cryptocurrency.

The training procedure involved three key stages:

  1. Initial Specialization (Adapter Merge): The process began by merging a pre-existing, high-performing LoRA adapter, tahamajs/llama-3.2-3b-instruct-bitcoin-analyst_best, into the base meta-llama/Llama-3.2-3B-Instruct model (a sketch of this merge step follows the list). This provided a strong foundation of specialized knowledge.
  2. Continuation Fine-Tuning (New LoRA): A new LoRA adapter was then trained on top of this already-merged model. This continuation training used the tahamajs/bitcoin-llm-finetuning-dataset to deepen the model's expertise.
  3. Final Product: The final artifact is the result of this second training stage. This model card describes the final, fully merged model, which contains the cumulative knowledge from all stages.
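
Stage 1 can be reproduced with the peft library. The snippet below is a minimal sketch of that merge, assuming the published adapter loads cleanly onto the base checkpoint; the actual training script is not included in this repository.

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model in bf16
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.bfloat16,
)

# Attach the stage-1 adapter and fold its weights into the base weights
merged = PeftModel.from_pretrained(
    base,
    "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst_best",
).merge_and_unload()

merged.save_pretrained("llama-3.2-3b-bitcoin-analyst-merged")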
  • Developed by: tahamajs
  • Model type: Causal Language Model (Instruction-Tuned)
  • Language(s) (NLP): English
  • License: Llama 3.2 Community License Agreement
  • Finetuned from model: meta-llama/Llama-3.2-3B-Instruct

Model Sources

  • Repository: tahamajs/llama-3.2-3b-instruct-bitcoin-analyst

Uses

Direct Use

This model is intended for direct use as an instruction-following chatbot for topics related to Bitcoin and cryptocurrency. It can be used for question answering, analysis, and explanation of complex financial and technical concepts. For best results, prompts should be formatted using the Llama 3 chat template.

Downstream Use

This model can serve as a strong base for further fine-tuning on more specific financial tasks, such as sentiment analysis of crypto news, generating market summaries, or building a domain-specific RAG system.

Out-of-Scope Use

This model is not a financial advisor and should not be used for making real-world investment decisions. Its knowledge is limited to its training data and may not be fully up-to-date. It is not designed for general-purpose conversation outside of its specialized domain and may perform poorly on such tasks.

Bias, Risks, and Limitations

This model inherits the limitations of the base Llama 3.2 model and the biases present in its training data (which includes cryptocurrency-related discourse). In the financial domain, there is a significant risk of generating overly confident, optimistic, or pessimistic statements that could be misinterpreted as financial advice. The model may "hallucinate" facts or data points.

Recommendations

Users should critically evaluate all outputs from this model, especially those concerning financial metrics, historical data, or price predictions. We recommend clearly stating to end users that the text is generated by an AI and is not a substitute for professional financial advice.

How to Get Started with the Model

Use the code below to load the fully merged model and generate text.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the ID of this repository
model_id = "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Use the Llama 3 chat template for instruction-following
messages = [
    {"role": "user", "content": "What is the role of the 'difficulty adjustment' in Bitcoin's protocol and how does it maintain a consistent block time?"},
]

# Apply the chat template and tokenize
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Generate a response
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode and print the output
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))

Training Details

Training Data

The second stage of fine-tuning was performed on the tahamajs/bitcoin-llm-finetuning-dataset. This dataset contains instruction-response pairs related to Bitcoin, market analysis, and blockchain technology.

Training Procedure

Preprocessing

The training data was formatted into the Llama 3 chat template using a format_chat function. A custom RobustCompletionCollator was used to mask the prompt and user-input tokens from the loss calculation, ensuring the model was only trained to predict the assistant's responses.
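
The RobustCompletionCollator itself is not published. As an illustration of the same masking idea, trl's DataCollatorForCompletionOnlyLM assigns the ignore label -100 to every token before the assistant's response, so that only response tokens contribute to the loss. A minimal sketch, assuming the Llama 3 assistant header as the response delimiter:

from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# Tokens up to and including the assistant header receive label -100 and are
# excluded from the cross-entropy loss; only the response is learned.
collator = DataCollatorForCompletionOnlyLM(
    response_template="<|start_header_id|>assistant<|end_header_id|>",
    tokenizer=tokenizer,
)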

Training Hyperparameters

The continuation training was performed using the QLoRA method for memory efficiency.

  • Training regime: bf16 mixed precision
  • lora_r: 32
  • lora_alpha: 64
  • lora_dropout: 0.1
  • target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • learning_rate: 1e-4
  • lr_scheduler_type: cosine
  • num_train_epochs: 1
  • optimizer: paged_adamw_32bit
  • batch_size (per device): 1
  • gradient_accumulation: 8
  • total_batch_size: 8
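
For reference, a peft/bitsandbytes configuration matching these values might look like the sketch below. This assumes the standard QLoRA recipe (4-bit NF4 quantization of the base model with bf16 compute), not the exact training script.

import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the base model, with bf16 compute (QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA settings from the table above
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)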

Evaluation

Quantitative evaluation has not been performed on this model version.

Technical Specifications

Model Architecture and Objective

This is a decoder-only transformer based on the Llama 3.2 architecture. It was fine-tuned using a Causal Language Modeling objective, where the model learns to predict the next token in a sequence.
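
As a concrete illustration of this objective: when the input ids are passed as labels, transformers shifts them internally so that each position is scored on predicting the token that follows it.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

batch = tokenizer("Bitcoin's difficulty adjusts every 2016 blocks.", return_tensors="pt")

# With labels == input_ids, the model returns the next-token cross-entropy loss
outputs = model(**batch, labels=batch["input_ids"])
print(outputs.loss)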

Model Card Authors

tahamajs

Model Card Contact

[More Information Needed]
