---
datasets:
  - ryandt/essay_thesis_conversations
language:
  - en
library_name: transformers
license: mit
---

# Model Card

This is a quick experiment in building a model that engages in more pedagogically grounded rhetoric when helping students brainstorm essay theses.

It was developed in about a day as a proof of concept.

## Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- **Developed by:** Ryan Tannenbaum and For.Education
- **Model type:** Llama 3.1 8B (causal language model)
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** Llama 3.1 8B

## Model Sources

- **Demo:** Coming soon

## Uses

The model uses the following formatting:

```
### USER: <What the user says>

### ASSISTANT: <The bot response>

...

### TERMINATE
```

The model is trained to emit `### TERMINATE` when it judges the conversation to be complete, signaling that the session should end.
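The format above can be wrapped in a small helper. This is a minimal sketch, not part of the released code: the function names (`build_prompt`, `is_finished`) are hypothetical, and only the `### USER:`, `### ASSISTANT:`, and `### TERMINATE` markers come from this card.

```python
TERMINATE_MARKER = "### TERMINATE"


def build_prompt(turns):
    """Render (role, text) turns into the documented prompt format."""
    lines = [f"### {role.upper()}: {text}" for role, text in turns]
    # Cue the model to produce the next assistant turn.
    lines.append("### ASSISTANT:")
    return "\n\n".join(lines)


def is_finished(generation):
    """True once the model has emitted the terminate marker."""
    return TERMINATE_MARKER in generation


prompt = build_prompt([("user", "Help me brainstorm a thesis about remote work.")])
print(prompt)
```

In a chat loop, you would append each generated assistant turn to `turns` and stop the session as soon as `is_finished` returns true.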

## Bias, Risks, and Limitations

This model was fine-tuned on a very small dataset to tackle a very specific use case. It is a proof of concept and should not be relied on in production.

## Training Data

[ryandt/essay_thesis_conversations](https://huggingface.co/datasets/ryandt/essay_thesis_conversations)

## Training Procedure

Trained locally with AutoTrain on a single NVIDIA RTX 4090:

### Training Hyperparameters

- **Epochs:** 5
- **Learning rate:** 2e-5
- **Train batch size:** 2
- **Mixed precision:** fp16
- **Quantization:** int8
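For reference, a hedged sketch of what an equivalent `autotrain` CLI invocation might look like. This is an assumption, not the command actually used: flag names vary across autotrain-advanced versions and the base-model ID is a placeholder, so check `autotrain llm --help` before running.

```bash
autotrain llm --train \
  --model meta-llama/Meta-Llama-3.1-8B \
  --data-path ryandt/essay_thesis_conversations \
  --lr 2e-5 \
  --epochs 5 \
  --batch-size 2 \
  --mixed-precision fp16 \
  --quantization int8
```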

## Model Card Authors

Ryan Tannenbaum (ryandt)

## Model Card Contact

Ryan Tannenbaum