Sara Han Díaz's picture

Sara Han Díaz

sdiazlor

AI & ML interests

Data curation and generation, RLHF, RAG, Prompt Engineering

Recent Activity

Organizations

Blog-explorers's profile picture Hugging Face Discord Community's profile picture open/ acc's profile picture Data Is Better Together Contributor's profile picture

sdiazlor's activity

upvoted an article 16 days ago
upvoted an article 19 days ago
view article
Article

I trained a Language Model to schedule events with GRPO!

By anakin87
72
updated a collection 3 months ago
updated a collection 3 months ago
view reply

I guess the tag is generated during completion. However, this might depend on the prompt, the max number of tokens, and how the inference is performed.

view reply

When setting up the HF_TOKEN, did you ensure you granted access to the Inference Endpoints?

upvoted an article 3 months ago
view article
Article

Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies

21
view reply

Hi @Socialmediaprophet ! Sorry that I missed this message. I see that it's running now 🙌. There were some days when the Hub was a bit unstable, so that might have been the root cause of the connection error.

view reply

Hi @beketm ! The main reason is that we directly used the generated completions, but it's true that I missed writing the initial tag.

view reply

Hi! @Aristo2333 ! Regarding DeepSeek, we're using the distilled version, which is available through the Serverless Inference API (https://huggingface.co/docs/api-inference/index), which means that further configuration is not required. Below the Provider indicates that it's available.

Screenshot 2025-02-19 at 14.28.14.png

The Llama issue is raised cause you need to go to the original repository and request access. After approval (in general, quite quick) you'll be able to use the model via the Serverless Inference API too.