Commit 2cf2cd7
1 Parent(s): 5c6644f
add section on Argilla integration
- README.md +10 -1
- assets/argilla.png +0 -0
README.md CHANGED
@@ -80,7 +80,10 @@ pip install synthetic-dataset-generator
 
 ### Environment Variables
 
-- `HF_TOKEN`: Your Hugging Face token to push your datasets to the Hugging Face Hub and run Inference Endpoints Requests. You can get one [here](https://huggingface.co/settings/tokens/new?ownUserPermissions=repo.content.read&ownUserPermissions=repo.write&globalPermissions=inference.serverless.write&tokenType=fineGrained).
+- `HF_TOKEN`: Your Hugging Face token to push your datasets to the Hugging Face Hub and run *Free* Inference Endpoints Requests. You can get one [here](https://huggingface.co/settings/tokens/new?ownUserPermissions=repo.content.read&ownUserPermissions=repo.write&globalPermissions=inference.serverless.write&tokenType=fineGrained).
+
+Optionally, you can also push your datasets to Argilla for further curation by setting the following environment variables:
+
 - `ARGILLA_API_KEY`: Your Argilla API key to push your datasets to Argilla.
 - `ARGILLA_API_URL`: Your Argilla API URL to push your datasets to Argilla.
 
@@ -90,6 +93,12 @@ pip install synthetic-dataset-generator
 python app.py
 ```
 
+### Argilla integration
+
+Argilla is a open source tool for data curation. It allows you to annotate and review datasets, and push curated datasets to the Hugging Face Hub. You can easily get started with Argilla by following the [quickstart guide](https://docs.argilla.io/latest/getting_started/quickstart/).
+
+![Argilla integration](https://huggingface.co/spaces/argilla/synthetic-data-generator/resolve/main/assets/argilla.png)
+
 ## Custom synthetic data generation?
 
 Each pipeline is based on distilabel, so you can easily change the LLM or the pipeline steps.
assets/argilla.png ADDED
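
Note on the environment variables documented in this README change: the sketch below shows one way a script could check which of them are set before deciding whether an Argilla push is possible. It is an illustration only, not code from this repository. The function name `read_settings` and the returned dictionary layout are hypothetical; only the variable names `HF_TOKEN`, `ARGILLA_API_KEY`, and `ARGILLA_API_URL` come from the README.

```python
import os


def read_settings(env=None):
    """Collect the three variables named in the README (hypothetical helper).

    `env` defaults to os.environ; passing a plain dict makes the function
    easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    hf_token = env.get("HF_TOKEN")
    argilla_key = env.get("ARGILLA_API_KEY")
    argilla_url = env.get("ARGILLA_API_URL")
    return {
        "hf_token": hf_token,
        "argilla_api_key": argilla_key,
        "argilla_api_url": argilla_url,
        # Assumption: the optional Argilla push needs both values present.
        "argilla_enabled": bool(argilla_key and argilla_url),
    }


if __name__ == "__main__":
    settings = read_settings()
    print("HF_TOKEN set:", settings["hf_token"] is not None)
    print("Argilla push possible:", settings["argilla_enabled"])
```

Treating Argilla as enabled only when both the key and the URL are present mirrors the README wording that the Argilla push is optional; how the application itself handles a partial configuration is not stated in this commit.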