dtaivpp committed · Commit 99aaace · verified · 1 Parent(s): 71641b0

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +149 -1
README.md CHANGED
@@ -1,3 +1,38 @@
  # MedEmbed-large-v0.1 ONNX Model

  This repository contains an ONNX version of the MedEmbed-large-v0.1 model, which was originally a SentenceTransformer model.
@@ -6,6 +41,8 @@ This repository contains an ONNX version of the MedEmbed-large-v0.1 model, which

  The original MedEmbed-large-v0.1 model is a sentence embedding model specialized for medical text. This ONNX version maintains the same functionality but is optimized for deployment in production environments.

  ## ONNX Conversion

  The model was converted to ONNX format using PyTorch's `torch.onnx.export` functionality with ONNX opset version 14.
@@ -46,4 +83,115 @@ embeddings = session.run(None, onnx_inputs)[0]

  ## Usage with OpenSearch

- This model can be used with OpenSearch's neural search capabilities. Please refer to OpenSearch documentation for details on how to load and use ONNX models for text embedding.

+ ---
+ language: en
+ license: mit
+ tags:
+ - medical
+ - sentence-transformers
+ - text-embedding
+ - sentence-similarity
+ - onnx
+ - semantic-search
+ - opensearch
+ - healthcare
+ - medical-embeddings
+ datasets:
+ - abhinand/MedEmbed-corpus
+ metrics:
+ - cosine-similarity
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ model-index:
+ - name: MedEmbed-large-v0.1-onnx
+   results:
+   - task:
+       type: Sentence Similarity
+       name: Semantic Retrieval
+     dataset:
+       type: abhinand/MedEmbed-corpus
+       name: MedEmbed corpus
+     metrics:
+     - type: cosine-similarity
+       value: N/A # Replace with actual value if available
+ base_model: abhinand/MedEmbed-Large-v0.1
+ inference: true
+ ---
+
  # MedEmbed-large-v0.1 ONNX Model

  This repository contains an ONNX version of the MedEmbed-large-v0.1 model, which was originally a SentenceTransformer model.


  The original MedEmbed-large-v0.1 model is a sentence embedding model specialized for medical text. This ONNX version maintains the same functionality but is optimized for deployment in production environments.

+ This model is a derivative of [abhinand/MedEmbed-Large-v0.1](https://huggingface.co/abhinand/MedEmbed-Large-v0.1), which itself is a fine-tune of [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5).
+
  ## ONNX Conversion

  The model was converted to ONNX format using PyTorch's `torch.onnx.export` functionality with ONNX opset version 14.


  ## Usage with OpenSearch

+ This model can be integrated with OpenSearch for neural search. Here's how to set it up:
+
+ ### 1. Upload the model to OpenSearch
+
+ ```bash
+ # Create a zip file containing your model files
+ zip -r medembedlarge.zip MedEmbed-large-v0.1.onnx config.json tokenizer_config.json tokenizer.json vocab.txt special_tokens_map.json
+
+ # Register the model through the ML Commons REST API. OpenSearch downloads the
+ # zip from "url", so host it somewhere the cluster can reach.
+ # <model-zip-url> and <sha256-of-zip> are placeholders you must fill in.
+ curl -XPOST "https://your-opensearch-endpoint/_plugins/_ml/models/_register" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "name": "medembedlarge",
+     "version": "1.0.0",
+     "model_format": "ONNX",
+     "function_name": "TEXT_EMBEDDING",
+     "model_content_hash_value": "<sha256-of-zip>",
+     "model_config": {
+       "model_type": "bert",
+       "embedding_dimension": 1024,
+       "framework_type": "sentence_transformers"
+     },
+     "url": "<model-zip-url>"
+   }' -u "admin:admin"
+
+ # The response contains a task_id; query the task to obtain the generated model_id
+ curl -XGET "https://your-opensearch-endpoint/_plugins/_ml/tasks/<task_id>" -u "admin:admin"
+ ```
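
If the archive is registered by URL through ML Commons, the request needs a SHA-256 checksum of the zip. The following is a minimal Python sketch of computing that checksum and assembling a request body. The helper names `sha256_of_file` and `build_register_body`, the example URL, and the dimension of 1024 are illustrative assumptions, not part of this repository; check the field names against the documentation for your OpenSearch version.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_register_body(zip_path: Path, url: str, dimension: int) -> dict:
    """Assemble a model registration body (hypothetical helper, assumed field names)."""
    return {
        "name": "medembedlarge",
        "version": "1.0.0",
        "model_format": "ONNX",
        "function_name": "TEXT_EMBEDDING",
        "model_content_hash_value": sha256_of_file(zip_path),
        "model_config": {
            "model_type": "bert",
            "embedding_dimension": dimension,
            "framework_type": "sentence_transformers",
        },
        "url": url,
    }


if __name__ == "__main__":
    # A stand-in file so the sketch runs without the real archive.
    with tempfile.TemporaryDirectory() as tmp:
        demo_zip = Path(tmp) / "medembedlarge.zip"
        demo_zip.write_bytes(b"placeholder archive bytes")
        # 1024 is an assumed embedding size; adjust it to match your export.
        body = build_register_body(demo_zip, "https://example.com/medembedlarge.zip", 1024)
        print(json.dumps(body, indent=2))
```

Reading the file in chunks keeps memory use flat even for a large ONNX archive.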
+
+ ### 2. Deploy the model
+
+ ```bash
+ # <model_id> is the ID OpenSearch generated when the model was registered
+ curl -XPOST "https://your-opensearch-endpoint/_plugins/_ml/models/<model_id>/_deploy" \
+   -H "Content-Type: application/json" -u "admin:admin"
+ ```
+
+ ### 3. Create a neural search pipeline
+
+ ```bash
+ # Ingest pipeline that embeds "text_field" into "text_embedding" at index time.
+ # <model_id> is the ID OpenSearch generated when the model was registered.
+ curl -XPUT "https://your-opensearch-endpoint/_ingest/pipeline/medembedlarge-pipeline" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "description": "Neural search pipeline for medical text",
+     "processors": [
+       {
+         "text_embedding": {
+           "model_id": "<model_id>",
+           "field_map": {
+             "text_field": "text_embedding"
+           }
+         }
+       }
+     ]
+   }' -u "admin:admin"
+ ```
+
+ ### 4. Create an index with embedding field
+
+ ```bash
+ # "index.knn" enables k-NN search on the index; "default_pipeline" runs the
+ # ingest pipeline on every indexed document. "dimension" must match the
+ # model's embedding size (1024 for the large model).
+ curl -XPUT "https://your-opensearch-endpoint/medical-documents" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "settings": {
+       "index.knn": true,
+       "default_pipeline": "medembedlarge-pipeline"
+     },
+     "mappings": {
+       "properties": {
+         "text_field": {
+           "type": "text"
+         },
+         "text_embedding": {
+           "type": "knn_vector",
+           "dimension": 1024,
+           "method": {
+             "name": "hnsw",
+             "space_type": "cosinesimil",
+             "engine": "nmslib"
+           }
+         }
+       }
+     }
+   }' -u "admin:admin"
+ ```
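
The pipeline's `field_map` target and the index's `knn_vector` field have to use the same name, and the mapping's `dimension` has to match the model. One way to keep them aligned is to generate both request bodies from shared constants. The sketch below assumes the `_ingest/pipeline` processor format and the `index.knn` / `default_pipeline` settings; the function names and the dimension of 1024 are hypothetical and should be adapted to your setup.

```python
import json

# Shared constants so the pipeline and the index mapping cannot drift apart.
SOURCE_FIELD = "text_field"
VECTOR_FIELD = "text_embedding"
PIPELINE_NAME = "medembedlarge-pipeline"


def build_pipeline_body(model_id: str) -> dict:
    """Body for an ingest pipeline that embeds SOURCE_FIELD into VECTOR_FIELD."""
    return {
        "description": "Neural search pipeline for medical text",
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    "field_map": {SOURCE_FIELD: VECTOR_FIELD},
                }
            }
        ],
    }


def build_index_body(dimension: int) -> dict:
    """Body for an index whose vector field matches the pipeline output."""
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    return {
        "settings": {"index.knn": True, "default_pipeline": PIPELINE_NAME},
        "mappings": {
            "properties": {
                SOURCE_FIELD: {"type": "text"},
                VECTOR_FIELD: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "space_type": "cosinesimil",
                        "engine": "nmslib",
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # "<model_id>" is a placeholder for the ID returned at registration time.
    print(json.dumps(build_pipeline_body("<model_id>"), indent=2))
    # 1024 is an assumed embedding size; use the size your model produces.
    print(json.dumps(build_index_body(1024), indent=2))
```

The printed JSON can be passed to `curl -d` or sent with any HTTP client.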
+
+ ### 5. Index documents with the neural search pipeline
+
+ ```bash
+ curl -XPOST "https://your-opensearch-endpoint/medical-documents/_doc" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "text_field": "Patient presented with symptoms of hypertension and diabetes."
+   }' -u "admin:admin"
+ ```
+
+ ### 6. Perform a neural search query
+
+ ```bash
+ # <model_id> is the ID OpenSearch generated when the model was registered
+ curl -XPOST "https://your-opensearch-endpoint/medical-documents/_search" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "query": {
+       "neural": {
+         "text_embedding": {
+           "query_text": "hypertension treatment options",
+           "model_id": "<model_id>",
+           "k": 10
+         }
+       }
+     }
+   }' -u "admin:admin"
+ ```
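
The index above uses the `cosinesimil` space, so results are ordered by cosine similarity between the query embedding and each document embedding. As a rough illustration of that ordering, here is a brute-force sketch in plain Python over toy three-dimensional vectors. The function names and the vectors are made up for the example; OpenSearch itself uses an approximate HNSW search, and the `_score` it reports may not equal the raw cosine value.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], docs: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy embeddings; real ones come from the model and are much longer.
    toy_docs = {
        "doc-a": [1.0, 0.0, 0.0],
        "doc-b": [0.6, 0.8, 0.0],
        "doc-c": [0.0, 0.0, 1.0],
    }
    for doc_id, score in top_k([1.0, 0.1, 0.0], toy_docs, k=2):
        print(f"{doc_id}: {score:.3f}")
```

A brute-force scan like this is only practical for small collections, which is why the index uses HNSW instead.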
+
+ Note: Replace "https://your-opensearch-endpoint" with your actual OpenSearch endpoint, and adjust authentication credentials as needed for your environment.