---
license: mit
base_model: gpt2-medium
tags:
- generated_from_trainer
model-index:
- name: gpt2-medium-finetuned-contract-gen
  results: []
pipeline_tag: text-generation
---
# gpt2-medium-finetuned-contract-gen
## Overview
`gpt2-medium-finetuned-contract-gen` is a model specialized in generating Solidity contract code. It is derived from the [gpt2-medium](https://huggingface.co/gpt2-medium) model hosted on Hugging Face and was fine-tuned on a set of Solidity contracts and patterns, which makes it suited to drafting or suggesting contract structures.
## Model Description
This model is designed specifically for generating Solidity contracts. As a fine-tuned derivative of `gpt2-medium`, it keeps the parent model's architecture and tokenizer while being adapted to Solidity-centric text.
### Performance
The model reached a loss of `0.3127` on the evaluation set, the final validation loss in the training results (step 11000, epoch 2.28).
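To put the loss in more familiar terms, the short sketch below converts it to perplexity. It assumes the reported value is the mean per-token cross-entropy in nats, which is what the Trainer normally reports for causal language models; the card itself does not state this.

```python
import math

# Final validation loss reported in this card
eval_loss = 0.3127

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss)
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.3f}")  # prints: perplexity ~ 1.367
```

A perplexity this close to 1 is plausible for source code, which is highly repetitive, but it says nothing about whether the generated contracts compile or are safe.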
## Intended Uses & Limitations
### Intended Uses:
1. Assist developers by auto-generating contract code snippets based on prompts.
2. Help in understanding and drafting complex contract structures.
### Limitations:
1. The generated code must be reviewed for security and functional correctness.
2. The clarity of the generated code largely depends on the specificity of the prompt.
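As a first filter before any manual review, a few cheap structural checks can catch obviously broken output, such as a contract cut off by the generation length limit. The sketch below is a hypothetical helper (`quick_sanity_check` is not part of this model or of any library) and only a heuristic: it ignores braces inside strings and comments, and it is in no way a substitute for compiling the code or for a security audit.

```python
def quick_sanity_check(source):
    """Return a list of warnings for a generated Solidity snippet.

    Heuristic only: checks for a version pragma and balanced curly braces.
    """
    warnings = []

    # Solidity source files normally start with a version pragma
    if "pragma solidity" not in source:
        warnings.append("missing 'pragma solidity' version directive")

    # Track brace nesting; a nonzero final depth suggests truncated output
    depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                break  # a closing brace appeared before its opening brace
    if depth != 0:
        warnings.append("unbalanced curly braces (output may be truncated)")

    return warnings


# Example: a snippet that was cut off mid-contract
sample = "contract MyToken {\n    uint256 public totalSupply;\n"
for warning in quick_sanity_check(sample):
    print("warning:", warning)
```

Snippets that pass these checks still need to be compiled and reviewed, as noted in the limitations above.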
## Training Details
### Dataset
The model was fine-tuned on an undisclosed dataset composed of a range of Solidity contracts.
### Training Hyperparameters:
- Learning Rate: `5e-05`
- Train Batch Size: `4`
- Evaluation Batch Size: `4`
- Seed: `42`
- Optimizer: Adam (`betas=(0.9,0.999)`, `epsilon=1e-08`)
- Learning Rate Scheduler: Cosine with restarts
- Warmup Steps: `241`
- Epochs: `4`
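The learning rate, warmup, and scheduler settings above interact, so a small sketch of the resulting learning-rate curve may help. It assumes the schedule behaves like the Transformers `cosine_with_restarts` scheduler with a single cycle: linear warmup to the peak rate, then cosine decay to zero. The total step count is not given in this card; the value used here is an estimate from the training results (step 11000 at epoch 2.28, so roughly 4,825 steps per epoch over 4 epochs) and should be treated as hypothetical.

```python
import math


def lr_at_step(step, total_steps, base_lr=5e-05, warmup_steps=241, num_cycles=1):
    """Learning rate at a given optimizer step (linear warmup + cosine with hard restarts)."""
    # Linear warmup from 0 to base_lr over the first warmup_steps steps
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)

    # Fraction of the post-warmup training that has elapsed
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0

    # Cosine decay, restarted num_cycles times over the remaining steps
    cycle_position = (num_cycles * progress) % 1.0
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


# Assumption: total steps estimated from the results table, not stated in the card
TOTAL_STEPS = 19_300

for step in (0, 241, 5000, 11000, 19_299):
    print(f"step {step:>6}: lr = {lr_at_step(step, TOTAL_STEPS):.2e}")
```

With a single cycle the "restarts" never trigger, so the curve is a plain cosine decay after warmup; a larger `num_cycles` would reset the rate to its peak at the start of each cycle.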
### Training Results:
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.4744 | 0.21 | 1000 | 0.4736 |
| 0.467 | 0.41 | 2000 | 0.4146 |
| 0.4089 | 0.62 | 3000 | 0.3852 |
| 0.4018 | 0.83 | 4000 | 0.3688 |
| 0.3475 | 1.04 | 5000 | 0.3523 |
| 0.2751 | 1.24 | 6000 | 0.3434 |
| 0.2966 | 1.45 | 7000 | 0.3334 |
| 0.292 | 1.66 | 8000 | 0.3230 |
| 0.2899 | 1.87 | 9000 | 0.3200 |
| 0.2508 | 2.07 | 10000 | 0.3164 |
| 0.28 | 2.28 | 11000 | 0.3127 |
### Dependencies:
- Transformers: `4.31.0`
- PyTorch: `2.0.1+cu118`
- Datasets: `2.14.2`
- Tokenizers: `0.13.3`
## How to Use
To generate Solidity contract code with this model, load the tokenizer and model, then sample a completion from a prompt:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("ckandemir/gpt2-medium-finetuned-contract-gen")
model = AutoModelForCausalLM.from_pretrained("ckandemir/gpt2-medium-finetuned-contract-gen")

# Tokenize the code prompt; this returns both input_ids and attention_mask
input_text = "contract MyToken"
inputs = tokenizer(input_text, return_tensors="pt")

# Sample one completion of at most 400 tokens (prompt included)
sample_output = model.generate(
    **inputs,
    do_sample=True, max_length=400, num_return_sequences=1, temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
)

# Decode and print the generated text
generated_text = tokenizer.decode(sample_output[0], skip_special_tokens=True)
print(generated_text)
```