Update README.md
README.md CHANGED
@@ -20,10 +20,8 @@ Nature Language Model (NatureLM) is a sequence-based science foundation model de
 
 # Model sources
 ## Repository:
-We provide
+We provide two repositories for the 8x7B model: the base version and the instruction-finetuned version.
 
-- https://huggingface.co/microsoft/NatureLM-1B
-- https://huggingface.co/microsoft/NatureLM-1B-Inst
 - https://huggingface.co/microsoft/NatureLM-8x7B
 - https://huggingface.co/microsoft/NatureLM-8x7B-Inst
 
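For orientation, the sketch below shows one way to load the instruction-finetuned checkpoint linked above, assuming the repositories expose a standard `transformers` causal-LM interface; the prompt, dtype, and generation settings are illustrative assumptions, not taken from this README.

```python
# Minimal loading sketch (assumes the standard transformers causal-LM layout).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/NatureLM-8x7B-Inst"  # instruction-finetuned repo from the list above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # spread the 8x7B weights across available GPUs
)

# Illustrative prompt; the model's actual instruction format may differ.
prompt = "Instruction: Propose a SMILES string for an aspirin-like molecule.\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```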
@@ -51,7 +49,6 @@ The use of NatureLM must align with ethical research practices. It is not intend
 
 
 
-
 ## Risks and limitations
 NatureLM may not always generate compounds or proteins precisely aligned with user instructions. Users are advised to apply their own adaptive filters before proceeding. Users are responsible for verification of model outputs and decision-making.
 NatureLM was designed and tested using the English language. Performance in other languages may vary and should be assessed by someone who is both an expert in the expected outputs and a native speaker of that language.
@@ -68,17 +65,10 @@ Preprocessing
 The training procedure involves two stages: Stage 1 focuses on training newly introduced tokens while freezing existing model parameters. Stage 2 involves joint optimization of both new and existing parameters to enhance overall performance.
 
 ## Training hyperparameters
-- Learning Rate:
-
-
-- Batch Size (Sentences):
-- 1B model: 4096
-- 8x7B model: 1536
-- Context Length (Tokens):
-- All models: 8192
-- GPU Number (H100):
-- 1B model: 64
-- 8x7B model: 256
+- Learning Rate: 2×10<sup>−4</sup>
+- Batch Size (Sentences): 8x7B model: 1536
+- Context Length (Tokens): 8192
+- GPU Number (H100): 8x7B model: 256
 
 ## Speeds, sizes, times
 
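The two-stage procedure in this hunk (Stage 1: train only newly introduced tokens with everything else frozen; Stage 2: joint optimization) can be prototyped as below. This is a minimal sketch under stated assumptions: the base checkpoint id and the added tokens are placeholders, and masking gradients of pre-existing embedding rows is just one common way to realize Stage 1, not necessarily the authors' implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "your-base-llm"  # placeholder; the diff does not name the base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Hypothetical domain tokens; the actual new-token inventory is not given here.
num_new = tokenizer.add_tokens(["<mol>", "</mol>", "<protein>", "</protein>"])
assert num_new > 0
model.resize_token_embeddings(len(tokenizer))

# Stage 1: freeze every pre-existing parameter...
for p in model.parameters():
    p.requires_grad = False

# ...then re-enable the input embeddings and zero the gradient on old rows,
# so only the newly appended token embeddings receive updates.
# (An untied LM head would need the same row mask.)
emb = model.get_input_embeddings()
emb.weight.requires_grad = True
mask = torch.zeros(emb.weight.size(0), 1)
mask[-num_new:] = 1.0
emb.weight.register_hook(lambda g: g * mask.to(g.device))

# Stage 2: re-enable requires_grad on all parameters and continue joint
# training at the listed hyperparameters (lr 2e-4, 8192-token context, etc.).
```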