---
license: apache-2.0
datasets:
  - roseteromeo56/rosete-romeo
  - bombastictranz/romeo-rosete
language:
  - en
metrics:
  - accuracy
  - bertscore
  - bleu
  - bleurt
  - brier_score
  - cer
base_model:
  - roseteromeo56/romeo-rosete
new_version: bombastictranz/romeo-rosete
pipeline_tag: token-classification
library_name: adapter-transformers
tags:
  - chemistry
  - biology
  - legal
  - finance
  - music
  - art
  - code
  - climate
  - medical
  - text-generation-inference
  - merge
---

# Model Card for Model ID

To reproduce leaderboard-style results, install the evaluation harness and run it against the checkpoint (the `--model_args` and `--output_path` values are left blank as placeholders):

```bash
git clone git@github.com:huggingface/lm-evaluation-harness.git
cd lm-evaluation-harness
git checkout main
pip install -e .

lm-eval --model_args="pretrained=,revision=,dtype=" --tasks=leaderboard --batch_size=auto --output_path=
```

- **Developed by:** [More Information Needed]

```bash
curl https://router.huggingface.co/novita/v3/openai/chat/completions \
    -H "Authorization: Bearer $HF_TOKEN" \
    -H 'Content-Type: application/json' \
    -d '{
        "messages": [
            {
                "role": "user",
                "content": "How many G in huggingface?"
            }
        ],
        "model": "deepseek/deepseek-v3-0324",
        "stream": false
    }'
```

- **Funded by [optional]:** [More Information Needed]

```bash
python -c "import evaluate; print(evaluate.load('exact_match').compute(references=['hello'], predictions=['hello']))"
```

- **Shared by [optional]:** [More Information Needed]

The one-liner above prints:

```python
{'exact_match': 1.0}
```

- **Model type:** [More Information Needed]

```bash
git clone https://github.com/huggingface/evaluate.git
cd evaluate
pip install -e .
```

- **Language(s) (NLP):** [More Information Needed]

```python
import evaluate

precision_metric = evaluate.load("precision")
results = precision_metric.compute(references=[0, 1], predictions=[0, 1])
print(results)
```

- **License:** [More Information Needed]

```python
from evaluate import load

squad_metric = load("squad")
predictions = [{'prediction_text': '1976', 'id': '56e10a3be3433e1400422b22'}]
references = [{'answers': {'answer_start': [97], 'text': ['1976']}, 'id': '56e10a3be3433e1400422b22'}]
results = squad_metric.compute(predictions=predictions, references=references)
print(results)
```

- **Finetuned from model [optional]:** [More Information Needed]

```python
from datasets import load_dataset
from evaluate import evaluator
from transformers import AutoModelForSequenceClassification, pipeline

data = load_dataset("imdb", split="test").shuffle(seed=42).select(range(1000))
task_evaluator = evaluator("text-classification")

# 1. Pass a model name or path
eval_results = task_evaluator.compute(
    model_or_pipeline="lvwerra/distilbert-imdb",
    data=data,
    label_mapping={"NEGATIVE": 0, "POSITIVE": 1}
)

# 2. Pass an instantiated model
model = AutoModelForSequenceClassification.from_pretrained("lvwerra/distilbert-imdb")
eval_results = task_evaluator.compute(
    model_or_pipeline=model,
    data=data,
    label_mapping={"NEGATIVE": 0, "POSITIVE": 1}
)

# 3. Pass an instantiated pipeline
pipe = pipeline("text-classification", model="lvwerra/distilbert-imdb")
eval_results = task_evaluator.compute(
    model_or_pipeline=pipe,
    data=data,
    label_mapping={"NEGATIVE": 0, "POSITIVE": 1}
)
print(eval_results)
```
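Several metrics can also be bundled and computed in one pass; a minimal sketch using `evaluate.combine` (the four metric names here are illustrative, not taken from this card):

```python
import evaluate

# Bundle related metrics so one compute() call reports all of them.
# Any metrics on the Hub with compatible inputs can be combined this way.
clf_metrics = evaluate.combine(["accuracy", "f1", "precision", "recall"])
results = clf_metrics.compute(references=[0, 1, 0], predictions=[0, 1, 1])
print(results)
```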
### Model Sources [optional]

```bash
mkdir ~/my-project
cd ~/my-project

# Create the virtual environment
python -m venv .env

# Activate the virtual environment
source .env/bin/activate

# Deactivate the virtual environment
deactivate
```

- **Repository:** [More Information Needed] (see also https://huggingface.co/docs/optimum/index)
- **Paper [optional]:** [More Information Needed] (see also https://huggingface.co/docs/optimum/installation)
- **Demo [optional]:** [More Information Needed]

```bash
python -m pip install git+https://github.com/huggingface/optimum.git
```

## Uses

### Direct Use

[More Information Needed]

```bash
curl https://uu149rez6gw9ehej.eu-west-1.aws.endpoints.huggingface.cloud/distilbert-sentiment \
    -X POST \
    -d '{"inputs": "Deploying my first endpoint was an amazing experience."}' \
    -H "Authorization: Bearer "
```

### Downstream Use [optional]

[More Information Needed]

```bash
curl --request POST \
    --url https://uu149rez6gw9ehej.eu-west-1.aws.endpoints.huggingface.cloud/wav2vec-asr \
    --header 'Authorization: Bearer ' \
    --header 'Content-Type: audio/x-flac' \
    --data-binary '@sample1.flac'
```

### Out-of-Scope Use

[More Information Needed]

```js
import { HfInference } from '@huggingface/inference'

const inference = new HfInference('hf_...') // your user token
const gpt2 = inference.endpoint('https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2-endpoint')
const { generated_text } = await gpt2.textGeneration({ inputs: 'The answer to the universe is' })
```

## Bias, Risks, and Limitations

[More Information Needed]

```js
const output = await inference.request({
  inputs: "blablabla",
  parameters: {
    custom_parameter_1: ...,
    ...
  }
});
```

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

For instance types, auto-scaling, and versioning options, see https://huggingface.co/docs/inference-endpoints/guides/advanced#advanced-setup-instance-types-auto-scaling-versioning
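Since the card metadata declares a `token-classification` pipeline, a minimal getting-started sketch might look as follows (an assumption: the `bombastictranz/romeo-rosete` checkpoint named in the header ships a standard tokenizer and token-classification head):

```python
from transformers import pipeline

# Hypothetical usage: the model id comes from this card's metadata; a
# standard token-classification head and tokenizer are assumed.
token_classifier = pipeline(
    "token-classification",
    model="bombastictranz/romeo-rosete",
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)
print(token_classifier("Hugging Face is based in New York City."))
```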
## Training Details

### Training Data

[More Information Needed]

```text
optimum[onnxruntime]==1.2.3
```

```bash
!cd distilbert-base-uncased-emotion && touch handler.py
```

```text
mkl-include
mkl
```

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

```bash
# install git-lfs to interact with the repository
sudo apt-get update
sudo apt-get install git-lfs

# install transformers (not needed since it is installed by default in the container)
pip install transformers[sklearn,sentencepiece,audio,vision]

git lfs install
git clone https://huggingface.co/philschmid/distilbert-base-uncased-emotion

# setup cli with token
huggingface-cli login
git config --global credential.helper store
```

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

```python
from typing import Dict, List, Any


class EndpointHandler():
    def __init__(self, path=""):
        # Preload all the elements you are going to need at inference.
        # pseudo:
        # self.model = load_model(path)
        pass

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        """
        data args:
            inputs (:obj: `str` | `PIL.Image` | `np.array`)
            kwargs
        Return:
            A :obj:`list` | `dict`: will be serialized and returned
        """
        # pseudo:
        # return self.model(data["inputs"])
        pass
```

## Evaluation

```python
import pandas as pd
from datasets import load_dataset
from evaluate import evaluator
from transformers import pipeline

models = [
    "xlm-roberta-large-finetuned-conll03-english",
    "dbmdz/bert-large-cased-finetuned-conll03-english",
    "elastic/distilbert-base-uncased-finetuned-conll03-english",
    "dbmdz/electra-large-discriminator-finetuned-conll03-english",
    "gunghio/distilbert-base-multilingual-cased-finetuned-conll2003-ner",
    "philschmid/distilroberta-base-ner-conll2003",
    "Jorgeutd/albert-base-v2-finetuned-ner",
]

data = load_dataset("conll2003", split="validation").shuffle().select(range(1000))
task_evaluator = evaluator("token-classification")

results = []
for model in models:
    results.append(
        task_evaluator.compute(
            model_or_pipeline=model, data=data, metric="seqeval"
        )
    )

df = pd.DataFrame(results, index=models)
df[["overall_f1", "overall_accuracy", "total_time_in_seconds", "samples_per_second", "latency_in_seconds"]]
```

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

```bash
!echo "holidays" >> requirements.txt
!pip install -r requirements.txt
```

#### Factors

[More Information Needed]

```python
from typing import Dict, List, Any
from transformers import pipeline
import holidays


class EndpointHandler():
    def __init__(self, path=""):
        self.pipeline = pipeline("text-classification", model=path)
        self.holidays = holidays.US()

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        """
        data args:
            inputs (:obj: `str`)
            date (:obj: `str`)
        Return:
            A :obj:`list` | `dict`: will be serialized and returned
        """
        # get inputs
        inputs = data.pop("inputs", data)
        date = data.pop("date", None)

        # check if date exists and if it is a holiday
        if date is not None and date in self.holidays:
            return [{"label": "happy", "score": 1}]

        # run normal prediction
        prediction = self.pipeline(inputs)
        return prediction
```
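Before pushing, the custom handler can be smoke-tested locally; a minimal sketch, assuming the class above is saved as `handler.py` in the cloned repository root:

```python
# Hypothetical local test for the custom handler defined above.
from handler import EndpointHandler

# Point the handler at the local repository checkout.
my_handler = EndpointHandler(path=".")

# An ordinary day: the text-classification pipeline runs normally.
print(my_handler({"inputs": "I love this sound track!", "date": "2022-08-08"}))

# A US holiday: the handler short-circuits to the "happy" label.
print(my_handler({"inputs": "I love this sound track!", "date": "2022-07-04"}))
```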
} }{ "inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!", "parameters": { "candidate_labels": ["refund", "legal", "faq"] } } - **Carbon Emitted:** [More Information Needed] { "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recomend it even to people who hate vid. game music!" } ## Technical Specifications [optional] ### Model Architecture and Objective [More Information Needed] { "inputs": { "question": "What is used for inference?", "context": "My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference." } } ### Compute Infrastructure [More Information Needed] { "inputs": "This sound track was ! It paints the scenery in your mind so well I would recomend it even to people who hate vid. game music!" } #### Hardware [More Information Needed] { "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recomend it even to people who hate vid. game music!" } #### Software [More Information Needed] { "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recomend it even to people who hate vid. game music!" } ## Citation [optional] **BibTeX:** [More Information Needed] { { "inputs": { "query": "How many stars does the transformers repository have?", "table": { "Repository": ["Transformers", "Datasets", "Tokenizers"], "Stars": ["36542", "4512", "3934"], "Contributors": ["651", "77", "34"], "Programming language": ["Python", "Python", "Rust, Python and NodeJS"] } } } } **APA:** [More Information Needed] {"inputs": [ { "role": "user", "content": "Which movie is the best ?" }, { "role": "assistant", "content": "It's Die Hard for sure." }, { "role": "user", "content": "Can you explain why?" } ]} ## Glossary [optional] [More Information Needed] { "model": { "image": { "huggingface": { "env": { "var1": "value" } } }, } ## More Information [optional] [More Information Needed] curl https://uu149rez6gw9ehej.eu-west-1.aws.endpoints.huggingface.cloud/distilbert-sentiment \ -X POST \ -d '{"inputs": "Deploying my first endpoint was an amazing experience."}' \ -H "Authorization: Bearer " ## Model Card Authors [optional] [More Information Needed] curl --request POST \ --url https://uu149rez6gw9ehej.eu-west-1.aws.endpoints.huggingface.cloud/wav2vec-asr \ --header 'Authorization: Bearer ' \ --header 'Content-Type: audio/x-flac' \ --data-binary '@sample1.flac' ## Model Card Contact [More Information Needed] const inference = new HfInference('hf_...') // your user token const gpt2 = inference.endpoint('https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2-endpoint') const { generated_text } = await gpt2.textGeneration({ inputs: 'The answer to the universe is' }) const output = await inference.request({ inputs: "blablabla", parameters: { custom_parameter_1: ..., ... } });