---
base_model:
- ValiantLabs/Qwen3-4B-ShiningValiant3
- ValiantLabs/Qwen3-4B-Esper3
- Qwen/Qwen3-4B
library_name: transformers
tags:
- mergekit
- merge
- qwen
- qwen-3
- qwen-3-4b
- 4b
- reasoning
- code
- code-reasoning
- code-instruct
- python
- javascript
- dev-ops
- jenkins
- terraform
- scripting
- powershell
- azure
- aws
- gcp
- cloud
- science
- science-reasoning
- physics
- biology
- chemistry
- earth-science
- astronomy
- machine-learning
- artificial-intelligence
- compsci
- computer-science
- information-theory
- ML-Ops
- math
- cuda
- deep-learning
- transformers
- agentic
- LLM
- neuromorphic
- self-improvement
- complex-systems
- cognition
- linguistics
- philosophy
- logic
- epistemology
- simulation
- game-theory
- knowledge-management
- creativity
- problem-solving
- architect
- engineer
- developer
- creative
- analytical
- expert
- rationality
- conversational
- chat
- instruct
datasets:
- sequelbox/Celestia3-DeepSeek-R1-0528
- sequelbox/Mitakihara-DeepSeek-R1-0528
- sequelbox/Titanium2.1-DeepSeek-R1
- sequelbox/Tachibana2-DeepSeek-R1
- sequelbox/Raiden-DeepSeek-R1

---
# PlumEsper

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit), combining the specialist skills of Esper 3 4B with the general reasoning skills of Shining Valiant 3 4B.
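
For a quick test, the merged model can be loaded with `transformers` like any other Qwen3-based checkpoint. This is a minimal sketch; the repo id below is assumed from the model name and may differ from the published one.

```python
# Minimal chat example for the merged model.
# NOTE: "sequelbox/PlumEsper" is an assumed repo id; substitute the actual one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "sequelbox/PlumEsper"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write a Python function that merges two sorted lists."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```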

## Merge Details
### Merge Method

This model was merged using the [DELLA](https://arxiv.org/abs/2406.11617) merge method, with [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) as the base model.
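
For intuition, here is a toy sketch of DELLA's core drop-and-rescale step, not mergekit's actual implementation (the full method also elects signs TIES-style before fusing). Each fine-tuned model's parameter deltas from the base are stochastically dropped, with higher-magnitude deltas kept more often; survivors are rescaled so the merge is unbiased in expectation, then the weighted deltas are added back onto the base weights.

```python
# Toy sketch of DELLA-style drop-and-rescale (illustrative only).
import torch

def della_merge(base, finetuned_list, weights, density=0.5):
    merged = base.clone()
    for ft, w in zip(finetuned_list, weights):
        delta = ft - base
        # Rank deltas by magnitude; keep probabilities are spread around
        # `density`, so larger deltas are more likely to survive.
        ranks = delta.abs().flatten().argsort().argsort().float()
        ranks = ranks / (delta.numel() - 1)
        keep_prob = (density + (ranks - 0.5) * 0.2).clamp(0.01, 0.99).view_as(delta)
        mask = torch.bernoulli(keep_prob)
        # Rescale survivors by 1/keep_prob: unbiased in expectation.
        merged += w * (delta * mask / keep_prob)
    return merged

# Example on random tensors standing in for one weight matrix:
base = torch.randn(4, 4)
fts = [base + 0.1 * torch.randn(4, 4) for _ in range(2)]
print(della_merge(base, fts, weights=[0.3, 0.3]))
```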

### Models Merged

The following models were included in the merge:
* [ValiantLabs/Qwen3-4B-ShiningValiant3](https://huggingface.co/ValiantLabs/Qwen3-4B-ShiningValiant3)
* [ValiantLabs/Qwen3-4B-Esper3](https://huggingface.co/ValiantLabs/Qwen3-4B-Esper3)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: della
dtype: bfloat16
parameters:
  normalize: true
models:
  - model: ValiantLabs/Qwen3-4B-Esper3
    parameters:
      density: 0.5
      weight: 0.3
  - model: ValiantLabs/Qwen3-4B-ShiningValiant3
    parameters:
      density: 0.5
      weight: 0.3
base_model: Qwen/Qwen3-4B
```
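
To reproduce the merge, save the configuration above as e.g. `della.yaml` and run mergekit's CLI (`mergekit-yaml della.yaml ./PlumEsper`), or drive it from Python. The sketch below follows the entry points shown in mergekit's README; verify against the installed version, as the API may have changed.

```python
# Reproduce the merge via mergekit's Python API (assumed from mergekit's
# README; check the current documentation before relying on this).
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("della.yaml", encoding="utf-8") as fp:  # the config above
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./PlumEsper",
    options=MergeOptions(cuda=torch.cuda.is_available(), copy_tokenizer=True),
)
```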