---
library_name: transformers
license: mit
base_model: agentlans/deberta-v3-base-zyda-2-v2
tags:
- generated_from_trainer
model-index:
- name: deberta-v3-base-zyda-2-v2-text-quality-v3
  results: []
datasets:
- agentlans/text-quality-v3
language:
- en
---
# DeBERTa Text Quality Model

This model rates the **quality of English text** for AI learning: given an input text string, it outputs a numeric quality score reflecting the text's overall informativeness and usefulness.
## Performance

On the evaluation set, the model achieved:

- Loss: 0.1408
- MSE: 0.1408
- Combined Score: 0.1408
- Tokens processed during training: 102,398,720

These values match the epoch 2 row of the training results table below, which had the best validation loss of the run; the note just after this list explains why the three metrics coincide.
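
For reference, the evaluation loss here is the mean squared error between predicted and reference quality scores; since the model has a single regression logit (see the `squeeze()` in the usage example below), loss, MSE, and the combined score all collapse to the same number:

$$
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2
$$

where $\hat{y}_i$ is the predicted score and $y_i$ the reference score over the $N$ evaluation examples.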
## Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "agentlans/deberta-v3-base-quality-v3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to(
    "cuda" if torch.cuda.is_available() else "cpu"
)


def quality(text: str) -> float:
    """Return the quality score for a single text.

    Higher scores indicate higher text quality. The sign of the score has
    no particular meaning: a negative score doesn't necessarily mean the
    text is low quality.
    """
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True).to(model.device)
    with torch.no_grad():
        score = model(**inputs).logits.squeeze().cpu().item()
    return score


print(quality("Your text here."))
```
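
The `padding=True` flag in `quality()` already supports batched input, so larger corpora can be scored in mini-batches instead of one string at a time. Here is a sketch reusing the `tokenizer` and `model` from above; `quality_batch` and its `batch_size` default are illustrative, not part of the released code:

```python
def quality_batch(texts: list[str], batch_size: int = 32) -> list[float]:
    """Score a list of texts in padded mini-batches (illustrative helper)."""
    scores: list[float] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start : start + batch_size]
        inputs = tokenizer(
            batch, return_tensors="pt", truncation=True, padding=True
        ).to(model.device)
        with torch.no_grad():
            logits = model(**inputs).logits.squeeze(-1)  # shape: (batch,)
        scores.extend(logits.cpu().tolist())
    return scores


print(quality_batch(["First text.", "Second text.", "Third text."]))
```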
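
The high-level `pipeline` API should also work with the same checkpoint; passing `function_to_apply="none"` returns the raw regression logit rather than a transformed score (a minimal sketch, not an officially documented usage of this model):

```python
from transformers import pipeline

# function_to_apply="none" keeps the raw logit as the reported score.
scorer = pipeline(
    "text-classification",
    model="agentlans/deberta-v3-base-quality-v3",
    function_to_apply="none",
)
print(scorer("Your text here."))  # e.g. [{'label': 'LABEL_0', 'score': ...}]
```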
## Limitations

- Works best on non-fiction and general-purpose texts.
- Scores give an overall quality estimate but don't explain *why* a text scores high or low.
- The model is large and slow; for faster results with similar accuracy, try [agentlans/GIST-all-MiniLM-L6-v2-quality-v3](https://huggingface.co/agentlans/GIST-all-MiniLM-L6-v2-quality-v3).
- Check the model for biases and for suitability to your use case before deploying it.
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a rough `TrainingArguments` mapping is sketched after the list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10.0
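
For orientation, these settings map onto Hugging Face `TrainingArguments` roughly as below. This is a minimal sketch, not the actual training script: the dataset split and column names, the `output_dir`, and the per-epoch evaluation cadence are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "agentlans/deberta-v3-base-zyda-2-v2"
tokenizer = AutoTokenizer.from_pretrained(base)
# One output logit trained as a regression target (the quality score).
model = AutoModelForSequenceClassification.from_pretrained(
    base, num_labels=1, problem_type="regression"
)

# Split and column names ("text", "label") are assumptions about the
# dataset layout, not documented facts.
dataset = load_dataset("agentlans/text-quality-v3")


def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)


tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="deberta-v3-base-zyda-2-v2-text-quality-v3",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
    optim="adamw_torch",    # AdamW with betas=(0.9, 0.999), eps=1e-8
    eval_strategy="epoch",  # matches the per-epoch rows in the table below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    processing_class=tokenizer,  # default collator pads each batch
)
trainer.train()
```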
### Training results

| Training Loss | Epoch | Step   | Validation Loss | MSE    | Combined Score | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:------:|:--------------:|:-----------------:|
| 0.1635        | 1.0   | 10000  | 0.1854          | 0.1854 | 0.1854         | 10239872          |
| 0.1241        | 2.0   | 20000  | 0.1408          | 0.1408 | 0.1408         | 20479744          |
| 0.0882        | 3.0   | 30000  | 0.1747          | 0.1747 | 0.1747         | 30719616          |
| 0.054         | 4.0   | 40000  | 0.1528          | 0.1528 | 0.1528         | 40959488          |
| 0.0372        | 5.0   | 50000  | 0.1480          | 0.1480 | 0.1480         | 51199360          |
| 0.0263        | 6.0   | 60000  | 0.1524          | 0.1524 | 0.1524         | 61439232          |
| 0.0203        | 7.0   | 70000  | 0.1495          | 0.1495 | 0.1495         | 71679104          |
| 0.0135        | 8.0   | 80000  | 0.1482          | 0.1482 | 0.1482         | 81918976          |
| 0.0098        | 9.0   | 90000  | 0.1450          | 0.1450 | 0.1450         | 92158848          |
| 0.0073        | 10.0  | 100000 | 0.1453          | 0.1453 | 0.1453         | 102398720         |
### Framework versions

- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0