Update README.md
README.md
CHANGED
@@ -1,5 +1,14 @@
 ---
-
+license: mit
+datasets:
+- rungalileo/ragbench
+language:
+- en
+metrics:
+- f1
+base_model:
+- answerdotai/ModernBERT-base
+pipeline_tag: text-classification
 ---
 # chiliground-base-modernbert-v1
 
@@ -9,8 +18,6 @@ A sentence classification model for extracting relevant spans from documents bas
 - Base model: answerdotai/ModernBERT-base
 - Hidden dimension: 768
 - Number of labels: 2
-- Best validation F1: 0.7038
-- Saved on: 2025-03-29 19:17:14
 
 ## Usage
 
@@ -29,19 +36,34 @@ extractor = ModelSpanExtractor(
 # Create documents
 documents = [
     Document(
-        content="
-
-
+        content="""
+        Climate change is a significant and lasting change in the statistical distribution of weather patterns.
+        Global warming is the observed increase in the average temperature of the Earth's atmosphere and oceans.
+        Greenhouse gases include water vapor, carbon dioxide, methane, nitrous oxide, and ozone.
+        Human activities since the beginning of the Industrial Revolution have increased greenhouse gas levels.
+        """,
+        metadata={"source": "example_doc_1", "id": "climate_1"},
+    ),
+    Document(
+        content="""
+        Renewable energy comes from sources that are naturally replenished on a human timescale.
+        Solar power is the conversion of energy from sunlight into electricity.
+        Wind power is the use of wind to provide mechanical power or electricity.
+        Hydropower is electricity generated from the energy of falling water.
+        """,
+        metadata={"source": "example_doc_2", "id": "energy_1"},
+    ),
 ]
 
+
 # Extract relevant spans
-question = "What
+question = "What causes climate change?"
 results = extractor.extract_spans(question, documents)
 
 # Print the results
 for doc_content, spans in results.items():
     for span in spans:
-        print(
+        print(span)
 ```
 
 ## Training Data
@@ -52,4 +74,4 @@ This model was trained on a QA dataset to classify sentences as relevant or not
 
 - The model works at the sentence level and may miss relevant spans that cross sentence boundaries
 - Performance depends on the quality and relevance of the training data
-- The model is designed for English text only
+- The model is designed for English text only
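
The sentence-level limitation listed in the README can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: `split_sentences`, `extract_relevant`, and `keyword_overlap` are hypothetical helpers, and the keyword check is only a stand-in for the real classifier, not the model's actual logic or the `ModelSpanExtractor` API.

```python
import re
from typing import Callable, List

# Hypothetical stop-word list, used only by the toy relevance check below.
STOPWORDS = {"what", "is", "the", "a", "of", "this"}


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def keyword_overlap(question: str, sentence: str) -> bool:
    """Toy stand-in for the classifier: True if any content word is shared."""
    q_words = set(re.findall(r"[a-z]+", question.lower())) - STOPWORDS
    s_words = set(re.findall(r"[a-z]+", sentence.lower()))
    return bool(q_words & s_words)


def extract_relevant(
    question: str,
    text: str,
    is_relevant: Callable[[str, str], bool],
) -> List[str]:
    """Judge each sentence independently, as a sentence-level model would."""
    return [s for s in split_sentences(text) if is_relevant(question, s)]


if __name__ == "__main__":
    text = (
        "Greenhouse gases trap heat. "
        "This is the main driver of climate change."
    )
    # Only the second sentence is kept; the first, which names the cause,
    # is dropped because each sentence is scored without its neighbour.
    print(extract_relevant("What causes climate change?", text, keyword_overlap))
```

Because every sentence is scored in isolation, an answer that spans two sentences can be returned only in part. Merging adjacent sentences before scoring is one possible mitigation, though the README does not say whether the model supports it.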