Quintu committed
Commit 881cfe6 · verified · 1 Parent(s): 5197297

Update README.md

Files changed (1)
  1. README.md +61 -46
README.md CHANGED
@@ -1,46 +1,61 @@
- # Quintu/bge-m3-legal_retrieval
-
- This repository contains the **Quintu/bge-m3-legal_retrieval**, a fine-tuned version of the **bge-m3** model optimized for legal document retrieval tasks. The model is specifically designed to handle the nuances of legal language, enabling accurate and efficient retrieval of relevant legal documents based on semantic similarity.
-
- ## Model Details
-
- - **Base Model**: [bge-m3](https://huggingface.co/BAAI/bge-m3-base)
- - **Task**: Legal Document Retrieval
- - **Fine-tuning Dataset**: Legal documents and associated queries from real-world legal scenarios.
- - **Framework**: [Sentence-Transformers](https://www.sbert.net/)
-
- ## Key Features
-
- - **Legal Language Understanding**: Optimized to understand legal terms, context, and phrases.
- - **Semantic Search**: Retrieves documents based on meaning, not just keywords.
- - **High Precision Retrieval**: Tailored for legal professionals and researchers.
-
- ## How to Use
-
- ### Load the Model
-
- You can easily load and use the model with the `SentenceTransformer` library:
-
- ```python
- from sentence_transformers import SentenceTransformer
-
- # Load the fine-tuned model
- model_tuned = SentenceTransformer("Quintu/bge-m3-legal_retrieval")
-
- # Example usage: Encode queries and documents
- queries = ["What are the key legal precedents for intellectual property disputes?"]
- documents = [
-     "This document discusses key precedents in intellectual property law.",
-     "This document covers legal principles in criminal law."
- ]
-
- # Encode the queries and documents
- query_embeddings = model_tuned.encode(queries)
- document_embeddings = model_tuned.encode(documents)
-
- # Compute similarity (example with cosine similarity)
- from sklearn.metrics.pairwise import cosine_similarity
- similarity_scores = cosine_similarity(query_embeddings, document_embeddings)
-
- # Output similarity scores
- print(similarity_scores)
+ ---
+ language:
+ - en
+ library_name: sentence-transformers
+ tags:
+ - legal
+ - document-retrieval
+ - semantic-search
+ - fine-tuned
+ license: apache-2.0
+ datasets:
+ - custom-legal-dataset
+ model_name: Quintu/bge-m3-legal_retrieval
+ pipeline_tag: feature-extraction
+ ---
+ # Quintu/bge-m3-legal_retrieval
+
+ This repository contains the **Quintu/bge-m3-legal_retrieval**, a fine-tuned version of the **bge-m3** model optimized for legal document retrieval tasks. The model is specifically designed to handle the nuances of legal language, enabling accurate and efficient retrieval of relevant legal documents based on semantic similarity.
+
+ ## Model Details
+
+ - **Base Model**: [bge-m3](https://huggingface.co/BAAI/bge-m3-base)
+ - **Task**: Legal Document Retrieval
+ - **Fine-tuning Dataset**: Legal documents and associated queries from real-world legal scenarios.
+ - **Framework**: [Sentence-Transformers](https://www.sbert.net/)
+
+ ## Key Features
+
+ - **Legal Language Understanding**: Optimized to understand legal terms, context, and phrases.
+ - **Semantic Search**: Retrieves documents based on meaning, not just keywords.
+ - **High Precision Retrieval**: Tailored for legal professionals and researchers.
+
+ ## How to Use
+
+ ### Load the Model
+
+ You can easily load and use the model with the `SentenceTransformer` library:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Load the fine-tuned model
+ model_tuned = SentenceTransformer("Quintu/bge-m3-legal_retrieval")
+
+ # Example usage: Encode queries and documents
+ queries = ["What are the key legal precedents for intellectual property disputes?"]
+ documents = [
+     "This document discusses key precedents in intellectual property law.",
+     "This document covers legal principles in criminal law."
+ ]
+
+ # Encode the queries and documents
+ query_embeddings = model_tuned.encode(queries)
+ document_embeddings = model_tuned.encode(documents)
+
+ # Compute similarity (example with cosine similarity)
+ from sklearn.metrics.pairwise import cosine_similarity
+ similarity_scores = cosine_similarity(query_embeddings, document_embeddings)
+
+ # Output similarity scores
+ print(similarity_scores)
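
Note on the usage snippet: the README stops at printing the raw similarity matrix, while a retrieval workflow usually goes on to rank documents by score. The following is a minimal sketch of that ranking step, not part of the commit. The helper names `cosine` and `rank_documents` are hypothetical, and the short vectors are placeholders standing in for the output of `model_tuned.encode(...)` so that the example runs without downloading the model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs, top_k=None):
    """Return (document_index, score) pairs, highest similarity first."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Placeholder embeddings (assumption): real ones would come from
# model_tuned.encode(queries) and model_tuned.encode(documents).
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [
    [0.9, 0.1, 0.8],  # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
]

ranking = rank_documents(query_vec, doc_vecs)
for idx, score in ranking:
    print(f"document {idx}: {score:.3f}")
```

With real embeddings, each row of `similarity_scores` from the README snippet could be sorted the same way to obtain a per-query ranking.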