baidu/ERNIE-4.5-VL-424B-A47B-PT

sunzhongkai588 committed 23 days ago

Commit 711fbdc (verified) · 1 parent: 1e5b28d

Update README.md (#1)

- Update README.md (2b3c45757d4c7686d115fda501bddd5383e5e116)

Co-authored-by: Suen.ZK <[email protected]>

Files changed (1)

README.md +3 -54

@@ -32,6 +32,9 @@ library_name: transformers
 # ERNIE-4.5-VL-424B-A47B
 ## ERNIE 4.5 Highlights
 The advanced capabilities of the ERNIE 4.5 models, particularly the MoE-based A47B and A3B series, are underpinned by several key technical innovations:
@@ -61,60 +64,6 @@ ERNIE-4.5-VL-424B-A47B is a multimodal MoE Chat model based on ERNIE-4.5-VL-424B
 ## Quickstart
-### FastDeploy Inference
-Quickly deploy services using FastDeploy as shown below. For more detailed usage, refer to the [FastDeploy GitHub Repository](https://github.com/PaddlePaddle/FastDeploy).
-**Note**: 80GB x 8 GPU resources are required. The `--quantization` parameter supports specifying `wint4` or `wint8` for deployment with 4-bit or 8-bit quantization, respectively.
-```bash
-python -m fastdeploy.entrypoints.openai.api_server \
-       --model baidu/ERNIE-4.5-VL-424B-A47B-Paddle \
-       --port 8180 \
-       --metrics-port 8181 \
-       --engine-worker-queue-port 8182 \
-       --tensor-parallel-size 8 \
-       --quantization wint4 \
-       --max-model-len 32768 \
-       --enable-mm \
-       --reasoning-parser ernie-45-vl \
-       --max-num-seqs 32
-```
-The ERNIE-4.5-VL model supports enabling or disabling thinking mode through request parameters.
-#### Enable Thinking Mode
-```bash
-curl -X POST "http://0.0.0.0:8180/v1/chat/completions" \
--H "Content-Type: application/json" \
--d '{
-  "messages": [
-    {"role": "user", "content": [
-      {"type": "image_url", "image_url": {"url": "https://paddlenlp.bj.bcebos.com/datasets/paddlemix/demo_images/example2.jpg"}},
-      {"type": "text", "text": "Descript this image"}
-    ]}
-  ],
-  "metadata": {"enable_thinking": true}
-}'
-```
-#### Disable Thinking Mode
-```bash
-curl -X POST "http://0.0.0.0:8180/v1/chat/completions" \
--H "Content-Type: application/json" \
--d '{
-  "messages": [
-    {"role": "user", "content": [
-      {"type": "image_url", "image_url": {"url": "https://paddlenlp.bj.bcebos.com/datasets/paddlemix/demo_images/example2.jpg"}},
-      {"type": "text", "text": "Descript this image"}
-    ]}
-  ],
-  "metadata": {"enable_thinking": false}
-}'
-```
 ### vLLM inference
 We are working with the community to fully support ERNIE4.5 models, stay tuned.

 # ERNIE-4.5-VL-424B-A47B
+> [!NOTE]
+> Note: "**-Paddle**" models use [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) weights, while "**-PT**" models use Transformer-style PyTorch weights.
 ## ERNIE 4.5 Highlights
 The advanced capabilities of the ERNIE 4.5 models, particularly the MoE-based A47B and A3B series, are underpinned by several key technical innovations:
 ## Quickstart
 ### vLLM inference
 We are working with the community to fully support ERNIE4.5 models, stay tuned.
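
The two curl requests removed by this commit differ only in the `metadata.enable_thinking` flag. For readers who still want to reproduce them against a FastDeploy server, the following is a minimal Python sketch of the same request body. The endpoint and image URL are copied from the removed examples. The helper names `build_payload` and `send` are hypothetical, and the assumption that the server replies with a JSON body is not confirmed by this diff.

```python
import json
import urllib.request

# Values copied from the curl examples removed in this commit.
ENDPOINT = "http://0.0.0.0:8180/v1/chat/completions"
IMAGE_URL = "https://paddlenlp.bj.bcebos.com/datasets/paddlemix/demo_images/example2.jpg"


def build_payload(text, image_url, enable_thinking):
    """Build the request body used by the removed curl examples.

    Hypothetical helper: only the JSON layout comes from the README.
    """
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": text},
                ],
            }
        ],
        # The only field that differs between the two removed examples.
        "metadata": {"enable_thinking": enable_thinking},
    }


def send(payload, endpoint=ENDPOINT):
    """POST the payload as JSON. Needs a running server; assumes a JSON reply."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the thinking-mode request body without contacting any server.
    body = build_payload("Describe this image", IMAGE_URL, True)
    print(json.dumps(body, indent=2))
```

Passing `False` as the third argument gives the body of the "Disable Thinking Mode" request. Calling `send(body)` only works once the server from the removed `fastdeploy.entrypoints.openai.api_server` command is listening on port 8180.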