yixinsong committed
Commit 5bf22c9 · 1 Parent(s): 9260b43
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -7,7 +7,7 @@ Qwen2-7B-ReLU is a variant of Qwen2-7B that replaces the SiLU/Swish activation f
 
 ## Key Features
 
-- Replaces SiLU/Swish activation function with ReLU
+- Replaces SiLU/Swish activation function with dReLU
 - Maintains comparable or even better performance with the original Qwen2-7B
 - Significantly increases activation sparsity, enabling further optimization and compression
 I'll add this implementation detail to the README under a new "Technical Details" section, as this is an important architectural change that researchers should be aware of:
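
The diff above only renames the activation from ReLU to dReLU and does not define it. As a rough illustration of why such a change would raise activation sparsity, the sketch below assumes that dReLU means applying ReLU to both the gate and the up branch of a gated MLP before multiplying them; that reading is an assumption, not something this commit states. The function names (`swiglu_gate`, `drelu_gate`, `sparsity`) and the random Gaussian inputs are hypothetical and stand in for real pre-activations.

```python
import math
import random


def silu(x: float) -> float:
    """SiLU/Swish: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """ReLU: max(0, x)."""
    return max(0.0, x)


def swiglu_gate(gate, up):
    """SiLU-gated product: SiLU(gate) * up, elementwise."""
    return [silu(g) * u for g, u in zip(gate, up)]


def drelu_gate(gate, up):
    """Assumed dReLU product: ReLU(gate) * ReLU(up), elementwise."""
    return [relu(g) * relu(u) for g, u in zip(gate, up)]


def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


# Hypothetical pre-activations: independent standard-normal samples.
rng = random.Random(0)
gate = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
up = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

swiglu_sparsity = sparsity(swiglu_gate(gate, up))
drelu_sparsity = sparsity(drelu_gate(gate, up))
print(f"SiLU-gated sparsity: {swiglu_sparsity:.3f}")
print(f"dReLU sparsity (assumed form): {drelu_sparsity:.3f}")
```

Under these assumptions the SiLU-gated product is almost never exactly zero, while the assumed dReLU product is zero whenever either branch is non-positive, which for independent symmetric inputs is roughly three quarters of the entries. Real model activations are not Gaussian, so the actual sparsity of Qwen2-7B-ReLU may differ.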