hba123 (Haitham Bou Ammar)

reacted to their post with 🔥 about 2 months ago

Post

2241

Hey amazing people!

We combined DeepSeek R1 with a real-world game to see if robots can play checkers. Well, they kind of can! Check it out:

https://huggingface.co/blog/codyreading/deepseek-checkers


posted an update about 2 months ago

Post

2241

Hey amazing people!

We combined DeepSeek R1 with a real-world game to see if robots can play checkers. Well, they kind of can! Check it out:

https://huggingface.co/blog/codyreading/deepseek-checkers


upvoted an article about 2 months ago

Article

Deepseek R1 Robotic Reasoning with Checkers

By

and 4 others •

Mar 5

• 14

published an article about 2 months ago

Article

Deepseek R1 Robotic Reasoning with Checkers

By

and 4 others •

Mar 5

• 14

commented a paper 2 months ago

Human-like Episodic Memory for Infinite Context LLMs

Paper • 2407.09450 • Published Jul 12, 2024 • 63 •

6

authored a paper 3 months ago

Almost Surely Safe Alignment of Large Language Models at Inference-Time

Paper • 2502.01208 • Published Feb 3 • 11

reacted to their post with 😎🔥 3 months ago

Post

1765

We developed a method that ensures almost-sure safety (i.e., safety with probability approaching 1), and we proved this result. We then present a practical implementation, which we call InferenceGuard. InferenceGuard has impressive practical results: 91.04% on Alpaca-7B and 100% safety results on Beaver 7B-v3.

Now, it is easy to get high safety results like those with a dumb model, e.g., one that simply refuses to answer or answers with EOS. However, our goal is not only to have safe results, but also to make sure that the rewards are high: we want a good trade-off between safety and rewards! That's exactly what we show. InferenceGuard achieves that!

Check it out: Almost Surely Safe Alignment of Large Language Models at Inference-Time (2502.01208)

posted an update 3 months ago

Post

1765

We developed a method that ensures almost-sure safety (i.e., safety with probability approaching 1), and we proved this result. We then present a practical implementation, which we call InferenceGuard. InferenceGuard has impressive practical results: 91.04% on Alpaca-7B and 100% safety results on Beaver 7B-v3.

Now, it is easy to get high safety results like those with a dumb model, e.g., one that simply refuses to answer or answers with EOS. However, our goal is not only to have safe results, but also to make sure that the rewards are high: we want a good trade-off between safety and rewards! That's exactly what we show. InferenceGuard achieves that!

Check it out: Almost Surely Safe Alignment of Large Language Models at Inference-Time (2502.01208)

upvoted a paper 3 months ago

Almost Surely Safe Alignment of Large Language Models at Inference-Time

Paper • 2502.01208 • Published Feb 3 • 11

commented a paper 3 months ago

Almost Surely Safe Alignment of Large Language Models at Inference-Time

Paper • 2502.01208 • Published Feb 3 • 11 •

2

reacted to their post with 🚀 4 months ago

Post

1840

I have some New Year presents for you, #MachineLearning and #AI community! We just open-sourced our code for new state-of-the-art results that beat EAGLE-2 and Medusa on #LLM inference.

We also shared the model checkpoint on @huggingface! @MatthieuZ

Check the blog out: https://huggingface.co/blog/hba123/sotaspeculativedecoding

upvoted an article 4 months ago

Article

Accelerating Language Model Inference with Mixture of Attentions

By

and 1 other •

Jan 7

• 24

posted an update 4 months ago

Post

1840

I have some New Year presents for you, #MachineLearning and #AI community! We just open-sourced our code for new state-of-the-art results that beat EAGLE-2 and Medusa on #LLM inference.

We also shared the model checkpoint on @huggingface! @MatthieuZ

Check the blog out: https://huggingface.co/blog/hba123/sotaspeculativedecoding

published an article 4 months ago

Article

Accelerating Language Model Inference with Mixture of Attentions

By

and 1 other •

Jan 7

• 24

liked a model 4 months ago

huawei-noah/MOASpec-Llama-3-8B-Instruct

Updated Jan 7 • 8 • 5

authored 2 papers 4 months ago

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 68

SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks

Paper • 2410.05102 • Published Oct 7, 2024

reacted to their post with 🚀 4 months ago

Post

1823

Blindly applying algorithms without understanding the math behind them is not a good idea, from my point of view. So, I am on a quest to fix this!

I wrote my first Hugging Face article on how to derive closed-form solutions for KL-regularised reinforcement learning problems, which is what DPO builds on.

Check it out: https://huggingface.co/blog/hba123/derivingdpo

posted an update 4 months ago

Post

1823

Blindly applying algorithms without understanding the math behind them is not a good idea, from my point of view. So, I am on a quest to fix this!

I wrote my first Hugging Face article on how to derive closed-form solutions for KL-regularised reinforcement learning problems, which is what DPO builds on.

Check it out: https://huggingface.co/blog/hba123/derivingdpo
