---
datasets:
- mvasiliniuc/iva-swift-codeint-clean-train
- mvasiliniuc/iva-swift-codeint-clean-valid
language:
- code
tags:
- gpt2
- code
- swift
- mobile
- generation
---
iva-codeint-swift-small is a GPT-2 model (small version, 239.4M parameters) trained from scratch on the text-to-code task for the Swift language, as used in native mobile (iOS) development.

## Usage

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="mvasiliniuc/iva-codeint-swift-small")
outputs = pipe("func triggerNSNotification")
```

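The pipeline forwards generation keyword arguments to the underlying `generate` call, so decoding can be tuned per request. A minimal sketch, with illustrative parameter values only:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="mvasiliniuc/iva-codeint-swift-small")

# Illustrative decoding settings: cap the completion length and sample
# with a low temperature to keep generated Swift code conservative.
outputs = pipe(
    "func triggerNSNotification",
    max_new_tokens=64,
    do_sample=True,
    temperature=0.2,
    num_return_sequences=1,
)
print(outputs[0]["generated_text"])
```
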
### Inference

```python
import pprint

import requests

API_URL = "https://api-inference.huggingface.co/models/mvasiliniuc/iva-codeint-swift-small"
headers = {"Authorization": "Bearer <key>"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": """
/*
A function that gets the current device operating system.
*/
"""
})
pprint.pprint(output, compact=True)
```

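The hosted text-generation endpoint also accepts a `parameters` object alongside `inputs` for decoding options. A sketch under that assumption, with illustrative prompt and values:

```python
import pprint

import requests

API_URL = "https://api-inference.huggingface.co/models/mvasiliniuc/iva-codeint-swift-small"
headers = {"Authorization": "Bearer <key>"}

# Illustrative payload: decoding options go in "parameters".
output = requests.post(
    API_URL,
    headers=headers,
    json={
        "inputs": "func triggerNSNotification",
        "parameters": {"max_new_tokens": 64, "temperature": 0.2},
    },
).json()
pprint.pprint(output, compact=True)
```
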
## Training

| Config | Value |
|------|------------------|
| seq length | 1024 |
| weight decay | 0.1 |
| learning rate | 0.0005 |
| max eval steps | -1 |
| shuffle buffer | 10000 |
| max train steps | 150000 |
| mixed precision | fp16 |
| num warmup steps | 2000 |
| train batch size | 5 |
| valid batch size | 5 |
| lr scheduler type | cosine |
| save checkpoint steps | 15000 |
| gradient checkpointing | false |
| gradient accumulation steps | 1 |

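As an illustration of how the learning rate, weight decay, warmup, and cosine schedule from the table map onto code, here is a minimal sketch. The AdamW optimizer and the `gpt2` stand-in config are assumptions; the actual training script is not part of this card.

```python
import torch
from transformers import AutoModelForCausalLM, get_cosine_schedule_with_warmup

# Hypothetical setup mirroring the table above (architecture is a stand-in).
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Optimizer choice (AdamW) is an assumption; lr and weight decay are from the table.
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=5e-4,           # learning rate
    weight_decay=0.1,  # weight decay
)

# Cosine schedule with warmup, matching the table's scheduler settings.
lr_scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=2_000,      # num warmup steps
    num_training_steps=150_000,  # max train steps
)
```
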
## Resources

Resources used for research:
* [Training a causal language model from scratch](https://huggingface.co/learn/nlp-course/chapter7/6)
* [CodeParrot, a GPT-2 model (1.5B parameters) trained to generate Python code](https://huggingface.co/codeparrot/codeparrot)