minor
README.md
CHANGED
@@ -10,8 +10,6 @@ Qwen2-7B-ReLU is a variant of Qwen2-7B that replaces the SiLU/Swish activation f
 - Replaces SiLU/Swish activation function with dReLU
 - Maintains comparable or even better performance with the original Qwen2-7B
 - Significantly increases activation sparsity, enabling further optimization and compression
-I'll add this implementation detail to the README under a new "Technical Details" section, as this is an important architectural change that researchers should be aware of:
-
 
 ## Technical Details
 