thermal666 committed · verified
Commit e1150df · 1 Parent(s): dcae4e9

Update README.md

Files changed (1):
  1. README.md +5 -15
README.md CHANGED
@@ -20,10 +20,8 @@ Nature Language Model (NatureLM) is a sequence-based science foundation model de
 
  # Model sources
  ## Repository:
- We provide four repositories for 1B and 8x7B models, including both base versions and instruction-finetuned versions.
+ We provide two repositories for 8x7B models, including both base versions and instruction-finetuned versions.
 
- - https://huggingface.co/microsoft/NatureLM-1B
- - https://huggingface.co/microsoft/NatureLM-1B-Inst
  - https://huggingface.co/microsoft/NatureLM-8x7B
  - https://huggingface.co/microsoft/NatureLM-8x7B-Inst
 
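The repositories above are standard Hugging Face model repos. For orientation, here is a minimal loading sketch for the instruction-finetuned checkpoint; it assumes the repo exposes the usual `transformers` causal-LM interface, and the prompt is purely illustrative (neither is confirmed by this commit).

```python
# Hypothetical usage sketch: assumes microsoft/NatureLM-8x7B-Inst loads through the
# standard transformers AutoModel interface, which this commit does not confirm.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/NatureLM-8x7B-Inst"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8x7B weights are large; reduced precision assumed here
    device_map="auto",           # shard the checkpoint across available GPUs
)

# Illustrative instruction; the actual prompt template may differ.
prompt = "Generate a SMILES string for a small molecule that inhibits EGFR."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```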
@@ -51,7 +49,6 @@ The use of NatureLM must align with ethical research practices. It is not intend
 
 
 
-
  ## Risks and limitations
  NatureLM may not always generate compounds or proteins precisely aligned with user instructions. Users are advised to apply their own adaptive filters before proceeding. Users are responsible for verification of model outputs and decision-making.
  NatureLM was designed and tested using the English language. Performance in other languages may vary and should be assessed by someone who is both an expert in the expected outputs and a native speaker of that language.
@@ -68,17 +65,10 @@ Preprocessing
  The training procedure involves two stages: Stage 1 focuses on training newly introduced tokens while freezing existing model parameters. Stage 2 involves joint optimization of both new and existing parameters to enhance overall performance.
 
  ## Training hyperparameters
- - Learning Rate:
-   - 1B model: 1×10<sup>−4</sup>
-   - 8x7B model: 2×10<sup>−4</sup>
- - Batch Size (Sentences):
-   - 1B model: 4096
-   - 8x7B model: 1536
- - Context Length (Tokens):
-   - All models: 8192
- - GPU Number (H100):
-   - 1B model: 64
-   - 8x7B model: 256
+ - Learning Rate: 2×10<sup>−4</sup>
+ - Batch Size (Sentences): 8x7B model: 1536
+ - Context Length (Tokens): 8192
+ - GPU Number (H100): 8x7B model: 256
 
  ## Speeds, sizes, times
 
 
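The training-procedure description retained in the last hunk outlines a two-stage schedule: stage 1 trains only the newly introduced tokens while the pre-existing weights stay frozen, and stage 2 optimizes everything jointly. The sketch below shows one way stage 1 can be set up in PyTorch/`transformers`; the checkpoint and the added tokens are placeholders, since neither the base model nor the token list appears in this commit.

```python
# Illustrative stage-1 sketch only: freeze pre-existing parameters and let just the
# newly added token embeddings receive gradients. Checkpoint and tokens are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "gpt2"  # small stand-in; the actual base model is not stated in this commit
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Add hypothetical domain tokens and grow the embedding matrix accordingly.
old_vocab = model.get_input_embeddings().weight.shape[0]
tokenizer.add_tokens(["<mol>", "</mol>", "<protein>", "</protein>"])
model.resize_token_embeddings(len(tokenizer))

# Stage 1: freeze every pre-existing parameter ...
for p in model.parameters():
    p.requires_grad = False

# ... then re-enable gradients on the embedding matrix, masking them so that only
# the rows belonging to the newly introduced tokens are actually updated.
emb = model.get_input_embeddings().weight
emb.requires_grad = True

def _mask_old_rows(grad):
    mask = torch.zeros_like(grad)
    mask[old_vocab:] = 1.0
    return grad * mask

emb.register_hook(_mask_old_rows)

# Stage 2 (after stage-1 training converges): unfreeze all parameters for joint optimization.
# for p in model.parameters():
#     p.requires_grad = True
```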