---
license: mit
base_model: gpt2-medium
tags:
- generated_from_trainer
model-index:
- name: gpt2-medium-finetuned-contract-gen
  results: []
pipeline_tag: text-generation
---

# gpt2-medium-finetuned-contract-gen

## Overview

`gpt2-medium-finetuned-contract-gen` is a model specialized in generating Solidity contract code. It is a fine-tune of OpenAI's [gpt2-medium](https://huggingface.co/gpt2-medium) model, as hosted on Hugging Face, trained on a collection of Solidity contracts and patterns so that it can help draft or suggest contract structures.

## Model Description

This model is designed specifically for generating Solidity contracts. As a fine-tune of `gpt2-medium`, it shares the parent model's architecture and remains a general causal language model, but its fine-tuning focused on Solidity source code and related text.

### Performance

The model reached a loss of `0.3127` on the evaluation set. This matches the validation loss at step 11000 (epoch 2.28), the last row of the Training Results table.
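For a rough sense of scale, a cross-entropy loss can be converted to perplexity by exponentiating it. The sketch below assumes the reported value is the mean per-token cross-entropy in nats, which is what the Hugging Face `Trainer` logs for causal language models by default; the card itself does not state this.

```python
import math

# Assumption: 0.3127 is the mean per-token cross-entropy in nats.
eval_loss = 0.3127

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.3f}")  # prints: perplexity = 1.367
```

Perplexity values are only comparable between models that share a tokenizer and an evaluation set.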

## Intended Uses & Limitations

### Intended Uses

1. Assist developers by auto-generating contract code snippets based on prompts.
2. Help in understanding and drafting complex contract structures.

### Limitations

1. The generated code must be reviewed for security and functional correctness.
2. The clarity of the generated code largely depends on the specificity of the prompt.
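As a first, cheap filter before manual review, generated text can be checked for obvious truncation. The sketch below uses a hypothetical helper (`looks_complete` is not part of this model or of `transformers`) that only checks that the text contains a `contract` keyword and that its curly braces balance. It ignores braces inside strings and comments, and it says nothing about security or correctness.

```python
def looks_complete(source: str) -> bool:
    """Heuristic: True if braces balance and a contract declaration exists."""
    depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # closing brace with no matching opener
                return False
    return depth == 0 and "contract " in source


complete = "contract MyToken { uint256 public totalSupply; }"
truncated = "contract MyToken { function mint() public {"

print(looks_complete(complete))   # True
print(looks_complete(truncated))  # False
```

Output that passes such a check still needs to be compiled (for example with `solc`) and reviewed for security.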

## Training Details

### Dataset

The model was fine-tuned on an undisclosed dataset consisting of a range of Solidity contracts.

### Training Hyperparameters

- Learning Rate: `5e-05`
- Train Batch Size: `4`
- Evaluation Batch Size: `4`
- Seed: `42`
- Optimizer: Adam (`betas=(0.9, 0.999)`, `epsilon=1e-08`)
- Learning Rate Scheduler: Cosine with restarts
- Warmup Steps: `241`
- Epochs: `4`
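The values above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the author's training script: `output_dir` is a hypothetical path, treating the batch sizes as per-device values is an assumption based on how `Trainer`-generated cards usually report them, and the optimizer choice is left at the `Trainer` default.

```python
# Keyword arguments for transformers.TrainingArguments (names as of v4.31).
training_kwargs = {
    "output_dir": "gpt2-medium-finetuned-contract-gen",  # hypothetical path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,  # assumption: card values are per device
    "per_device_eval_batch_size": 4,   # same assumption
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine_with_restarts",
    "warmup_steps": 241,
    "num_train_epochs": 4,
}


def build_training_arguments():
    """Instantiate TrainingArguments; requires transformers and torch."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(**training_kwargs)


# Print the mapping without needing transformers installed.
for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

Calling `build_training_arguments()` yields an object that can be passed to `Trainer(args=...)` together with a model and datasets, which this card does not provide.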

### Training Results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.4744        | 0.21  | 1000  | 0.4736          |
| 0.467         | 0.41  | 2000  | 0.4146          |
| 0.4089        | 0.62  | 3000  | 0.3852          |
| 0.4018        | 0.83  | 4000  | 0.3688          |
| 0.3475        | 1.04  | 5000  | 0.3523          |
| 0.2751        | 1.24  | 6000  | 0.3434          |
| 0.2966        | 1.45  | 7000  | 0.3334          |
| 0.292         | 1.66  | 8000  | 0.3230          |
| 0.2899        | 1.87  | 9000  | 0.3200          |
| 0.2508        | 2.07  | 10000 | 0.3164          |
| 0.28          | 2.28  | 11000 | 0.3127          |
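
The Step and Epoch columns allow a rough estimate of the training-set size, which the card does not state. The sketch below divides steps by epochs for the last logged row and multiplies by the train batch size of `4`. It assumes a single device and no gradient accumulation, and the two-decimal epoch values limit the precision of the result.

```python
# Last logged row of the table above: (epoch, step).
epoch, step = 2.28, 11000
train_batch_size = 4  # from the hyperparameter list

# Assumption: one device, no gradient accumulation.
steps_per_epoch = step / epoch
approx_examples = steps_per_epoch * train_batch_size

print(f"~{steps_per_epoch:.0f} steps per epoch")
print(f"~{approx_examples:.0f} training examples")
```

Under those assumptions this comes to roughly 4,800 steps per epoch, or on the order of 19,000 training examples.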

### Dependencies

- Transformers: `4.31.0`
- PyTorch: `2.0.1+cu118`
- Datasets: `2.14.2`
- Tokenizers: `0.13.3`

## How to Use

To generate Solidity contract code with this model, load it with `transformers` and sample a completion from a prompt:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
model_id = "ckandemir/gpt2-medium-finetuned-contract-gen"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Input your code prompt
input_text = "contract MyToken"
inputs = tokenizer(input_text, return_tensors="pt")

# Sample one completion; max_length counts prompt tokens plus generated tokens
sample_output = model.generate(
    **inputs,  # input_ids and attention_mask
    do_sample=True,
    max_length=400,
    num_return_sequences=1,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers typically define no pad token
)

# Decode and print the generated text
generated_text = tokenizer.decode(sample_output[0], skip_special_tokens=True)
print(generated_text)
```