# MedEmbed-large-v0.1 ONNX Model

This repository contains an ONNX export of MedEmbed-large-v0.1, a model originally distributed as a SentenceTransformer model.

## Model Description

The original MedEmbed-large-v0.1 model is a sentence embedding model specialized for medical text. This ONNX version preserves that functionality and targets deployment in production environments.

## ONNX Conversion

The model was converted to ONNX with PyTorch's `torch.onnx.export`, using ONNX opset version 14.

## Model Inputs and Outputs

- **Inputs**:
  - `input_ids`: Tensor of shape `[batch_size, sequence_length]`
  - `attention_mask`: Tensor of shape `[batch_size, sequence_length]`
- **Output**:
  - `sentence_embedding`: Tensor of shape `[batch_size, embedding_dimension]`

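
To make the input shapes concrete, here is a minimal NumPy sketch of how variable-length token ID lists become the rectangular `input_ids` and `attention_mask` arrays described above. The helper `pad_batch` is hypothetical, and the padding ID of `0`, the right-side padding, and the `int64` dtype are assumptions for illustration rather than facts stated by this repository. The token IDs are made up.

```python
import numpy as np


def pad_batch(token_id_lists, pad_token_id=0):
    """Pad variable-length token ID lists into rectangular int64 arrays.

    pad_token_id=0 and right-side padding are assumptions for this sketch.
    """
    batch_size = len(token_id_lists)
    sequence_length = max(len(ids) for ids in token_id_lists)

    # Start from all-padding inputs and an all-zero mask.
    input_ids = np.full((batch_size, sequence_length), pad_token_id, dtype=np.int64)
    attention_mask = np.zeros((batch_size, sequence_length), dtype=np.int64)

    for row, ids in enumerate(token_id_lists):
        input_ids[row, : len(ids)] = ids      # real tokens go on the left
        attention_mask[row, : len(ids)] = 1   # 1 = real token, 0 = padding

    return input_ids, attention_mask


# Made-up token IDs for two inputs of different lengths.
input_ids, attention_mask = pad_batch([[101, 7592, 102], [101, 2023, 2003, 1037, 102]])
print(input_ids.shape, attention_mask.shape)  # (2, 5) (2, 5)
```

In practice the tokenizer performs this step when called with `padding=True`, so a helper like this is only useful for understanding or checking the shapes.
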
## Usage with Hugging Face

```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("YOUR_MODEL_PATH")

# Load ONNX model
onnx_path = "YOUR_MODEL_PATH/MedEmbed-large-v0.1.onnx"
session = ort.InferenceSession(onnx_path)

# Tokenize input text (NumPy arrays, as expected by ONNX Runtime)
text = "Your medical text here"
inputs = tokenizer(text, return_tensors="np", padding=True, truncation=True)

# Run inference with ONNX model
onnx_inputs = {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
}
# First output is `sentence_embedding`: [batch_size, embedding_dimension]
embeddings = session.run(None, onnx_inputs)[0]
```
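
Once embeddings are available, a common next step is to compare them by cosine similarity. The sketch below is one way to do that with NumPy. The helpers `l2_normalize` and `cosine_similarity_matrix` are hypothetical names, and the vectors are small stand-ins rather than real model output. This README does not state whether the exported model already normalizes its output, so the sketch normalizes explicitly, which is harmless if it does.

```python
import numpy as np


def l2_normalize(embeddings, eps=1e-12):
    """Scale each row to unit L2 norm; eps guards against division by zero."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)


def cosine_similarity_matrix(a, b):
    """Return the [len(a), len(b)] matrix of cosine similarities between rows."""
    return l2_normalize(a) @ l2_normalize(b).T


# Stand-in vectors; real embeddings would come from session.run(...)[0].
queries = np.array([[1.0, 0.0, 0.0]], dtype=np.float32)
documents = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]], dtype=np.float32)

scores = cosine_similarity_matrix(queries, documents)
best = int(np.argmax(scores[0]))  # index of the most similar document
print(scores.shape, best)  # (1, 2) 0
```

For retrieval, `queries` and `documents` would each be a `[batch_size, embedding_dimension]` array produced by the ONNX session, and each row of `scores` ranks the documents for one query.
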

## Usage with OpenSearch

This model can be used with OpenSearch's neural search capabilities. Refer to the OpenSearch documentation for details on loading and using ONNX models for text embedding.