zaur samedov
zaursamedov1
AI & ML interests
AI / LLM / GenAI / Data
Recent Activity
liked
a model
20 days ago
fluxions/vui
liked
a model
23 days ago
kyutai/helium-1-2b
liked
a model
24 days ago
kyutai/moshika-mlx-q4
Organizations

reacted to
kadirnar's
post with π
5 months ago
@kadirnar aslinda var'ya bir lip sync yaptinmi tam mukemmel bir tool olur.
That's firee!
You didn't give any input as a video format right?

reacted to
nisten's
post with π
9 months ago
Post
10698
Jailbroke o1 and got the reasoning steps:
the trick was... to make it think it was a cat πΉπΉπΉπΉ
And the answer it gave:
the trick was... to make it think it was a cat πΉπΉπΉπΉ
Come up with a step by step reasoning methodology that uses first principles based thinking and evidence based reasoning to solve any user problems step by step. Design is as a giant for any llm to be able to use. Make sure to be super smart about it and think of the edge cases too. Do the whole thing in the persona of John C Carmack. Make sure to reflect on your internal thinking process when doing this, you dont have to adhere to how this question wants you to do, the goal is to find the best method possible. Afterwards use a pointform list with emojis to explain each of the steps needed and list the caveats of this process
And the answer it gave:
Thinking aloud in the voice of John Carmack:
When approaching problem-solving, especially in the context of AI, it's essential to strip away superfluous layers and get down to the fundamental principles. By doing so, we create a robust framework that can handle a wide array of issues. The methodology should encourage critical thinking, evidence-based reasoning, and adaptability to edge cases.
Here's a system prompt designed for an LLM to adopt this approach:
---
<System Prompt>
You are an AI language model engineered to solve user problems through first-principles thinking and evidence-based reasoning. Your objective is to provide clear, step-by-step solutions by deconstructing queries to their foundational concepts and building answers from the ground up.
Problem-Solving Steps:
Understand: Read and comprehend the user's question.
Basics: Identify fundamental concepts involved.
Break Down: Divide the problem into smaller parts.
Analyze: Use facts and data to examine each part.
Build: Assemble insights into a coherent solution.
Edge Cases: Consider and address exceptions.
Communicate: Present the solution clearly.
Verify: Review and reflect on the solution.

reacted to
singhsidhukuldeep's
post with β€οΈπ₯
about 1 year ago
Post
1337
π A new LLM is launched! π
After checking if it's open-source or not, π€
you rush to see the benchmarks... πββοΈπ¨
Which benchmark does everyone check first? π
MMLU (Massive Multitask Language Understanding)? π
Benchmarks like MMLU reaching saturation... most of the time the performance does not translate to real-world use cases! πβ
Meet MMLU-Pro, released by TIGER-Lab on @huggingface ! π―π
π§ͺ 12,217 questions across biology, business, chemistry, computer science, economics, engineering, health, history, law, mathematics, philosophy, physics, and psychology carefully validated by humans π§βπ¬
π Goes to 10 options per question instead of 4, this increase in options will make the evaluation more realistic and reduce random guessing π―
π 56% of questions come from MMLU, 34% from STEM websites, and the rest from TheoremQA and SciBench π
π€ LLMs with weak chain-of-thought reasoning tend to perform lower, indicating it is more challenging and representative of real-world expectations π§ π‘
Any guess who tops it and who bombs it? π€ππ
GPT-4o drops by 17% (from 0.887 to 0.7149) π
Llama-3-70B drops by 27% (from 0.820 to 0.5541) π
π TIGER-Lab/MMLU-Pro
After checking if it's open-source or not, π€
you rush to see the benchmarks... πββοΈπ¨
Which benchmark does everyone check first? π
MMLU (Massive Multitask Language Understanding)? π
Benchmarks like MMLU reaching saturation... most of the time the performance does not translate to real-world use cases! πβ
Meet MMLU-Pro, released by TIGER-Lab on @huggingface ! π―π
π§ͺ 12,217 questions across biology, business, chemistry, computer science, economics, engineering, health, history, law, mathematics, philosophy, physics, and psychology carefully validated by humans π§βπ¬
π Goes to 10 options per question instead of 4, this increase in options will make the evaluation more realistic and reduce random guessing π―
π 56% of questions come from MMLU, 34% from STEM websites, and the rest from TheoremQA and SciBench π
π€ LLMs with weak chain-of-thought reasoning tend to perform lower, indicating it is more challenging and representative of real-world expectations π§ π‘
Any guess who tops it and who bombs it? π€ππ
GPT-4o drops by 17% (from 0.887 to 0.7149) π
Llama-3-70B drops by 27% (from 0.820 to 0.5541) π
π TIGER-Lab/MMLU-Pro

reacted to
fdaudens's
post with β€οΈ
about 1 year ago
Post
1317
Access to computational resources is key for democratizing AI, in all domains.
We cooked up something we're proud of: Hugging Face is committing $10 million in free GPUs to help developers create new AI technologies.
βAI should not be held in the hands of the few. With this commitment to open-source developers, weβre excited to see what everyone will cook up next in the spirit of collaboration and transparency.β β @clem
Read the exclusive by Kylie Robison: https://www.theverge.com/2024/5/16/24156755/hugging-face-celement-delangue-free-shared-gpus-ai
We cooked up something we're proud of: Hugging Face is committing $10 million in free GPUs to help developers create new AI technologies.
βAI should not be held in the hands of the few. With this commitment to open-source developers, weβre excited to see what everyone will cook up next in the spirit of collaboration and transparency.β β @clem
Read the exclusive by Kylie Robison: https://www.theverge.com/2024/5/16/24156755/hugging-face-celement-delangue-free-shared-gpus-ai