@Jaward on Hugging Face: "fascinating read! staying bullish on search with rl might just help us get rid…"

Jaward

posted an update 18 days ago

fascinating read!
staying bullish on search with RL; it might just help us get rid of hallucination entirely. I really like their approach:
1) <think>reason over the prompt/context && what you already know</think>
2) self-<search>when you don’t know</search> (iteratively), with no external tool
3) <information>cite sources to support claim(s)</information>
4) <answer>final answer</answer>
their RL training was done cost-efficiently too, see code: https://github.com/TsinghuaC3I/SSRL
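
To make the four-step format above concrete, here is a minimal Python sketch that splits a model output into its tagged steps and runs a simple format check. The tag names come from the list above; the function names `parse_trajectory` and `is_well_formed`, the example string, and the check rules themselves are my own assumptions for illustration, not code or reward logic taken from the SSRL repo.

```python
import re
from typing import List, Tuple

# Tag names as listed in the post; the regex requires matching open/close tags.
TAG_PATTERN = re.compile(
    r"<(think|search|information|answer)>(.*?)</\1>", re.DOTALL
)


def parse_trajectory(text: str) -> List[Tuple[str, str]]:
    """Return (tag, content) pairs in the order they appear in the text."""
    return [(m.group(1), m.group(2).strip()) for m in TAG_PATTERN.finditer(text)]


def is_well_formed(steps: List[Tuple[str, str]]) -> bool:
    """Hypothetical format check (an assumption, not the paper's rule):
    start with <think>, end with exactly one <answer>, and follow every
    <search> directly with an <information> block."""
    if not steps:
        return False
    tags = [tag for tag, _ in steps]
    if tags[0] != "think" or tags[-1] != "answer" or tags.count("answer") != 1:
        return False
    for i, tag in enumerate(tags):
        if tag == "search" and (i + 1 >= len(tags) or tags[i + 1] != "information"):
            return False
    return True


# Made-up example output, for illustration only.
example = (
    "<think>I need the capital of France; I am not sure.</think>"
    "<search>capital of France</search>"
    "<information>Paris is the capital of France.</information>"
    "<answer>Paris</answer>"
)

steps = parse_trajectory(example)
for tag, content in steps:
    print(f"{tag}: {content}")
print("well formed:", is_well_formed(steps))
```

A check like this could be used to filter or score rollouts by format, but how the authors actually do that is in their repo, not in this sketch.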

iseesaw

17 days ago

thanks for sharing our work

Jaward

17 days ago

you're welcome, nice work.

In this post

Jaward Jaward Sesay
iseesaw Kaiyan Zhang