minor
README.md
CHANGED
@@ -7,7 +7,7 @@ Qwen2-7B-ReLU is a variant of Qwen2-7B that replaces the SiLU/Swish activation f
 
 ## Key Features
 
-- Replaces SiLU/Swish activation function with
+- Replaces SiLU/Swish activation function with dReLU
 - Maintains comparable or even better performance with the original Qwen2-7B
 - Significantly increases activation sparsity, enabling further optimization and compression
 I'll add this implementation detail to the README under a new "Technical Details" section, as this is an important architectural change that researchers should be aware of:
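Note on the changed line: the hunk names dReLU but does not define it. The sketch below is a minimal illustration under one assumption, namely that dReLU means applying ReLU to both the gate and the up projection of a gated MLP, in place of SiLU on the gate alone. The helper names (`matvec`, `gated_hidden`, `sparsity`) and the toy weights are hypothetical and do not come from the Qwen2-7B-ReLU code; the point is only to show why such a change would increase activation sparsity.

```python
import math

def relu(v):
    # Element-wise ReLU: negative pre-activations become exactly zero.
    return [max(0.0, a) for a in v]

def silu(v):
    # Element-wise SiLU/Swish: x * sigmoid(x); nonzero for any nonzero input.
    return [a / (1.0 + math.exp(-a)) for a in v]

def identity(v):
    return list(v)

def matvec(w, x):
    # Multiply a weight matrix (list of rows) by an input vector.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def gated_hidden(x, w_gate, w_up, act_gate, act_up):
    # Hidden activations of a gated MLP: act_gate(W_gate x) * act_up(W_up x).
    gate = act_gate(matvec(w_gate, x))
    up = act_up(matvec(w_up, x))
    return [g * u for g, u in zip(gate, up)]

def sparsity(v):
    # Fraction of hidden units that are exactly zero.
    return sum(1 for a in v if a == 0.0) / len(v)

# Toy input and weights (hypothetical values, chosen for illustration only).
x = [1.0, -2.0]
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
w_up = [[1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]]

# SiLU on the gate only (SwiGLU-style) versus ReLU on both branches
# (the assumed dReLU form).
dense_hidden = gated_hidden(x, w_gate, w_up, silu, identity)
sparse_hidden = gated_hidden(x, w_gate, w_up, relu, relu)

dense_sparsity = sparsity(dense_hidden)
sparse_sparsity = sparsity(sparse_hidden)
```

In this toy case the SiLU-gated hidden vector has no exact zeros, while the ReLU-on-both variant zeroes every unit whose gate or up pre-activation is negative. Exact zeros are what a sparsity-aware runtime could skip when computing the down projection.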