---
license: cc-by-nc-4.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
base_model:
- mlabonne/NeuralDaredevil-7B
- BioMistral/BioMistral-7B
- mistralai/Mathstral-7B-v0.1
- FPHam/Writing_Partner_Mistral_7B
library_name: transformers
pipeline_tag: text-generation
---

# EduMixtral-4x7B

EduMixtral-4x7B is an experimental model that combines several education-focused language models for downstream research on human/AI student-teacher applications. It is intended to cover general knowledge, medicine, math, and writing assistance.

## 🤏 Models Merged

EduMixtral-4x7B is a Mixture of Experts (MoE) built from the following models using [Mergekit](https://github.com/arcee-ai/mergekit):

* [mlabonne/NeuralDaredevil-7B](https://huggingface.co/mlabonne/NeuralDaredevil-7B) (base model)
* [BioMistral/BioMistral-7B](https://huggingface.co/BioMistral/BioMistral-7B)
* [mistralai/Mathstral-7B-v0.1](https://huggingface.co/mistralai/Mathstral-7B-v0.1)
* [FPHam/Writing_Partner_Mistral_7B](https://huggingface.co/FPHam/Writing_Partner_Mistral_7B)

## 🧩 Configuration

In mergekit's MoE merge, `gate_mode: hidden` initializes each expert's router gate from hidden-state representations of its `positive_prompts`, so prompts resembling those phrases are routed to the corresponding expert.

```yaml
base_model: mlabonne/NeuralDaredevil-7B
gate_mode: hidden
experts:
  - source_model: mlabonne/NeuralDaredevil-7B
    positive_prompts:
      - "hello"
      - "help"
      - "question"
      - "explain"
      - "information"
  - source_model: BioMistral/BioMistral-7B
    positive_prompts:
      - "medical"
      - "health"
      - "biomedical"
      - "clinical"
      - "anatomy"
  - source_model: mistralai/Mathstral-7B-v0.1
    positive_prompts:
      - "math"
      - "calculation"
      - "equation"
      - "geometry"
      - "algebra"
  - source_model: FPHam/Writing_Partner_Mistral_7B
    positive_prompts:
      - "writing"
      - "creative process"
      - "story structure"
      - "character development"
      - "plot"
```

## 💻 Usage

Loading the model in 8-bit or 4-bit quantization is recommended to reduce memory usage. The example below uses 8-bit; a 4-bit sketch follows the example output.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Load the tokenizer and the model in 8-bit quantization
tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="cuda",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)

# Prepare the input text
input_text = "Math problem: Xiaoli reads a 240-page story book. She reads (1/8) of the whole book on the first day and (1/5) of the whole book on the second day. How many pages did she read in total in two days?"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate the output with specified parameters
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    num_return_sequences=1
)

# Decode and print the generated text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Output:**

> Solution:
>
> To find the total number of pages Xiaoli read in two days, we need to add the number of pages she read on the first day and the second day.
>
> On the first day, Xiaoli read 1/8 of the book. Since the book has 240 pages, the number of pages she read on the first day is:
>
> \[ \frac{1}{8} \times 240 = 30 \text{ pages} \]
>
> On the second day, Xiaoli read 1/5 of the book. The number of pages she read on the second day is:
>
> \[ \frac{1}{5} \times 240 = 48 \text{ pages} \]
>
> To find the total number of pages she read in two days, we add the pages she read on the first day and the second day:
>
> \[ 30 \text{ pages} + 48 \text{ pages} = 78 \text{ pages} \]
>
> Therefore, Xiaoli read a total of 78 pages in two days.
>
> Final answer: Xiaoli read 78 pages in total
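
**4-bit Loading:**

For tighter memory budgets, the model can also be loaded in 4-bit, as recommended above. The snippet below is a minimal sketch of one possible configuration (NF4 quantization with bfloat16 compute); it assumes `bitsandbytes` is installed and has not been benchmarked against the 8-bit setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Minimal 4-bit loading sketch (assumed configuration, not benchmarked):
# NF4 quantization with bfloat16 compute, weights spread automatically across devices
tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
)
```

Generation then works exactly as in the 8-bit example above, at lower memory cost and with some potential loss in output quality.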