Update README.md

README.md (CHANGED)
Nature Language Model (NatureLM) is a sequence-based science foundation model.

- Model type: Sequence-based science foundation model
- Language(s): English
- License: MIT License
- Finetuned from model: one version of the model is finetuned from Mixtral-8x7B-v0.1

# Model sources

## Repository:

We provide two repositories for the 8x7B models, including both base versions and instruction-finetuned versions.

- https://huggingface.co/microsoft/NatureLM-8x7B
- https://huggingface.co/microsoft/NatureLM-8x7B-Inst
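As a usage sketch (assuming the standard Hugging Face `transformers` auto classes work for these checkpoints; the helper name below is illustrative, not from the NatureLM docs), the repositories above could be loaded like this:

```python
# Illustrative only: assumes `transformers` AutoTokenizer/AutoModelForCausalLM
# support these checkpoints. Repo ids are the two listed above.
REPO_IDS = [
    "microsoft/NatureLM-8x7B",       # base model
    "microsoft/NatureLM-8x7B-Inst",  # instruction-finetuned
]

def load_naturelm(repo_id: str):
    """Return (tokenizer, model) for one of the NatureLM repositories."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # heavy import kept local
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return tokenizer, model

# To use (note: a very large download):
# tokenizer, model = load_naturelm(REPO_IDS[1])
```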
NatureLM is currently not ready for use in clinical applications without rigorous evaluation.

NatureLM is not a general-purpose language model and is not designed or optimized to perform general tasks like text summarization or Q&A.

### Use by Non-Experts

NatureLM outputs scientific entities (e.g., molecules, proteins, materials) and requires expert interpretation, validation, and analysis. It is not intended for use by non-experts or individuals without the necessary domain knowledge to evaluate and verify its outputs. Outputs, such as small molecule inhibitors for target proteins, require rigorous validation to ensure safety and efficacy. Misuse by non-experts may lead to the design of inactive or suboptimal compounds, resulting in wasted resources and potentially delaying critical research or development efforts.

### CBRN Applications (Chemical, Biological, Radiological, and Nuclear)

NatureLM is not intended for the design, development, or optimization of agents or materials for harmful purposes, including but not limited to weapons of mass destruction, bioterrorism, or other malicious uses.

### Unethical or Harmful Applications

The use of NatureLM must align with ethical research practices. It is not intended for tasks that could cause harm to individuals, communities, or the environment.

## Risks and limitations

NatureLM may not always generate compounds or proteins precisely aligned with user instructions. Users are advised to apply their own adaptive filters before proceeding. Users are responsible for verifying model outputs and for decisions based on them.
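As one example of such a filter (entirely hypothetical and not part of NatureLM; a real pipeline would use a chemistry toolkit such as RDKit to parse and sanitize molecules), generated SMILES strings could be cheaply screened for obvious malformation before expert review:

```python
# Hypothetical pre-screening filter: a coarse sanity check on generated SMILES.
# This is NOT chemical validation; it only rejects obviously malformed strings.

SMILES_CHARS = set("ABCFHIKNOPSZabcdeghilnoprstu0123456789()[]=#+-@/\\%.")

def plausible_smiles(s: str) -> bool:
    """Cheap screen: non-empty, SMILES-like characters, balanced () and []."""
    if not s or any(c not in SMILES_CHARS for c in s):
        return False
    for open_, close in ("()", "[]"):
        depth = 0
        for c in s:
            depth += (c == open_) - (c == close)
            if depth < 0:          # closing bracket before its opener
                return False
        if depth != 0:             # unclosed bracket
            return False
    return True

candidates = ["CC(=O)Oc1ccccc1C(=O)O", "C1CC1(", "not-a-molecule!"]
kept = [s for s in candidates if plausible_smiles(s)]  # only the first survives
```

Strings that pass such a screen still require the expert validation described above.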
The training procedure involves two stages: Stage 1 focuses on training newly introduced tokens while freezing existing model parameters. Stage 2 involves joint optimization of both new and existing parameters to enhance overall performance.

## Training hyperparameters

- Learning Rate: 2×10<sup>−4</sup>
- Batch Size (Sentences): 1536 (8x7B model)
- Context Length (Tokens): 8192
- GPU Number (H100): 256 (8x7B model)
## Speeds, sizes, times