---
license: llama3
language:
- en
library_name: transformers
tags:
- llama-3
- llama-3.2
- bitcoin
- finance
- instruction-following
- fine-tuning
- merged
- lora
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- tahamajs/bitcoin-llm-finetuning-dataset
pipeline_tag: text-generation
---

# Model Card for Llama-3.2-3B-Instruct-Bitcoin-Analyst

This repository contains a specialized version of `meta-llama/Llama-3.2-3B-Instruct`, fine-tuned to function as a **Bitcoin and cryptocurrency market analyst**. The model is the result of a multi-stage "continuation training" process, in which an already specialized model was further refined on a targeted dataset to strengthen its domain knowledge and instruction-following capabilities.

## Model Details

### Model Description

This model is a Causal Language Model (CLM) based on the Llama 3.2 3B Instruct architecture. It was developed through a sequential fine-tuning process designed to build upon existing domain knowledge and further improve its performance on financial and technical topics related to cryptocurrency.

The training procedure involved three key stages (a code sketch of the merge step follows the list):
1.  **Initial Specialization (Adapter Merge):** The process began by merging a pre-existing, high-performing LoRA adapter, `tahamajs/llama-3.2-3b-instruct-bitcoin-analyst_best`, into the base `meta-llama/Llama-3.2-3B-Instruct` model. This provided a strong foundation of specialized knowledge.
2.  **Continuation Fine-Tuning (New LoRA):** A new LoRA adapter was then trained on top of this already-merged model. This continuation training used the `tahamajs/bitcoin-llm-finetuning-dataset` to deepen the model's expertise.
3.  **Final Product:** The final artifact is the result of this second training stage. This model card assumes that the fully merged model, containing the cumulative knowledge from all stages, is what is shared in this repository.
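
Stage 1 (the adapter merge) can be reproduced with standard `peft` APIs. The snippet below is a minimal sketch under those assumptions, not the exact training script:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst_best"

# Load the base model and apply the pre-existing LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
merged_model = PeftModel.from_pretrained(base_model, adapter_id).merge_and_unload()

# `merged_model` now carries the stage-1 weights; the stage-2 LoRA adapter is
# trained on top of it (see the hyperparameter sketch under Training Details).
```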

- **Developed by:** tahamajs
- **Model type:** Causal Language Model (Instruction-Tuned)
- **Language(s) (NLP):** English
- **License:** Llama 3 Community License Agreement
- **Finetuned from model:** `meta-llama/Llama-3.2-3B-Instruct`

### Model Sources

- **Repository:** `tahamajs/llama-3.2-3b-instruct-bitcoin-analyst`

## Uses

### Direct Use

This model is intended for direct use as an instruction-following chatbot for topics related to Bitcoin and cryptocurrency. It can be used for question answering, analysis, and explanation of complex financial and technical concepts. For best results, prompts should be formatted using the Llama 3 chat template.

### Downstream Use

This model can serve as a strong base for further fine-tuning on more specific financial tasks, such as sentiment analysis of crypto news, generating market summaries, or building a domain-specific RAG system.

### Out-of-Scope Use

This model is **not a financial advisor** and should not be used for making real-world investment decisions. Its knowledge is limited to its training data and may not be fully up-to-date. It is not designed for general-purpose conversation outside of its specialized domain and may perform poorly on such tasks.

## Bias, Risks, and Limitations

This model inherits the limitations of the base Llama 3.2 model and the biases present in its training data (which includes cryptocurrency-related discourse). In the financial domain, there is a significant risk of generating overly confident, optimistic, or pessimistic statements that could be misinterpreted as financial advice. The model may "hallucinate" facts or data points.

### Recommendations

Users should critically evaluate all outputs from this model, especially when they pertain to financial metrics, historical data, or price predictions. We recommend clearly stating to any end-users that the text is generated by an AI and is not a substitute for professional financial advice.

## How to Get Started with the Model

Use the code below to load the fully merged model and generate text.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the ID of this repository
model_id = "tahamajs/llama-3.2-3b-instruct-bitcoin-analyst"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Use the Llama 3 chat template for instruction-following
messages = [
    {"role": "user", "content": "What is the role of the 'difficulty adjustment' in Bitcoin's protocol and how does it maintain a consistent block time?"},
]

# Apply the chat template and tokenize
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Generate a response
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode and print the output
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```

## Training Details

### Training Data

The second stage of fine-tuning was performed on the [tahamajs/bitcoin-llm-finetuning-dataset](https://huggingface.co/datasets/tahamajs/bitcoin-llm-finetuning-dataset). This dataset contains instruction-response pairs related to Bitcoin, market analysis, and blockchain technology.
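
The dataset can be pulled directly from the Hub with the `datasets` library. A minimal sketch; the split and column names are assumptions and may differ:

```python
from datasets import load_dataset

# Load the continuation fine-tuning data ("train" split assumed)
dataset = load_dataset("tahamajs/bitcoin-llm-finetuning-dataset", split="train")
print(dataset[0])  # inspect one instruction-response pair
```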

### Training Procedure

#### Preprocessing

The training data was formatted into the Llama 3 chat template using a `format_chat` function. A custom `RobustCompletionCollator` was used to mask the prompt and user-input tokens from the loss calculation, ensuring the model was only trained to predict the assistant's responses.
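
The `format_chat` function and `RobustCompletionCollator` are not published in this card; the sketch below only illustrates the underlying idea (chat-template formatting plus setting prompt labels to `-100` so they are ignored by the cross-entropy loss). The column names are assumptions:

```python
def format_chat(example, tokenizer):
    # Wrap one instruction-response pair in the Llama 3 chat template
    messages = [
        {"role": "user", "content": example["instruction"]},    # column name assumed
        {"role": "assistant", "content": example["output"]},    # column name assumed
    ]
    return tokenizer.apply_chat_template(messages, tokenize=False)

def mask_prompt_labels(input_ids, prompt_length):
    # Completion-only training: tokens belonging to the prompt get label -100,
    # so the loss is computed only over the assistant's response tokens.
    labels = input_ids.clone()
    labels[:prompt_length] = -100
    return labels
```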

#### Training Hyperparameters

The continuation training was performed using the QLoRA method for memory efficiency.

  - **Training regime:** bf16 mixed precision

| Hyperparameter | Value |
| :--- | :--- |
| `lora_r` | 32 |
| `lora_alpha` | 64 |
| `lora_dropout` | 0.1 |
| `target_modules` | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| `learning_rate` | 1e-4 |
| `lr_scheduler_type` | cosine |
| `num_train_epochs` | 1 |
| `optimizer` | paged_adamw_32bit |
| `batch_size (per device)`| 1 |
| `gradient_accumulation` | 8 |
| `total_batch_size` | 8 |
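
These settings map onto a standard QLoRA setup roughly as follows. This is a hedged reconstruction from the table above, not the original training script; the 4-bit quantization settings and output path are assumptions:

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit quantization for QLoRA (exact quantization settings assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./bitcoin-analyst-continuation",  # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8
    num_train_epochs=1,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_32bit",
    bf16=True,
)
```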

## Evaluation

Quantitative evaluation has not been performed on this model version.

## Technical Specifications

### Model Architecture and Objective

This is a decoder-only transformer based on the Llama 3.2 architecture. It was fine-tuned using a Causal Language Modeling objective, where the model learns to predict the next token in a sequence.
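
In other words, training minimizes the standard next-token cross-entropy (with prompt tokens masked out, as described under Preprocessing):

$$\mathcal{L}(\theta) = -\sum_{t} \log p_\theta\left(x_t \mid x_{<t}\right)$$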

### Compute Infrastructure

#### Software

  - [PyTorch](https://pytorch.org/)
  - [Transformers](https://github.com/huggingface/transformers)
  - [PEFT](https://github.com/huggingface/peft)
  - [TRL](https://github.com/huggingface/trl)
  - [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) for QLoRA

## Model Card Authors

tahamajs

## Model Card Contact

[More Information Needed]