Open-Source AI Meetup

community

AI & ML interests

Open science and open source

SFEvent's activity

umarigan 
posted an update 3 days ago
**Extracting Reasoning Prompts with DeepSeek-R1: A Step Towards Better AI Reasoning**

Hi everyone! 👋

I’m excited to share a small but impactful project I’ve been working on, where I extracted **reasoning prompts** using the **DeepSeek-R1 model**. Reasoning prompts are a powerful way to understand how AI models arrive at their answers, and they can be used to train smaller, more efficient models to generate reasoning. Let me walk you through the process and explain why this is important.

---

#### **The Code: Extracting Reasoning Prompts**

Here’s the code I used to extract reasoning prompts from the openaccess-ai-collective/oo-gpt4-filtered dataset:

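import os
import time

from datasets import load_dataset
from openai import OpenAI
from tqdm import tqdm

# Setup was not shown in the original post; the values below are assumptions.
# DeepSeek exposes an OpenAI-compatible API, so the OpenAI client is used here.
# DEEPSEEK_API_KEY is an assumed environment variable name.
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")
ds = load_dataset("openaccess-ai-collective/oo-gpt4-filtered", split="train")  # split name assumed

reasoning_data = []

for example in tqdm(ds, desc="answering"):
    try:
        response = client.chat.completions.create(
            model='deepseek-reasoner',  # Using DeepSeek-R1 for reasoning
            messages=[
                {"role": "system", "content": example['system_prompt']},
                {"role": "user", "content": example['question']},
            ],
            stream=False,
            max_tokens=4096,
            temperature=0.7,
        )

        # The final answer and the reasoning trace come back as separate fields
        answer = response.choices[0].message.content
        reasoning = response.choices[0].message.reasoning_content

        reasoning_example = {
            "id": example['id'],
            "question": example['question'],
            'answer': answer,
            'reasoning': reasoning,
        }

        reasoning_data.append(reasoning_example)
    except Exception as e:
        print(f"Error answering example: {e}")
        time.sleep(3)  # Wait for 3 seconds before continuing
        continue  # Skip the current example and move to the next one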

Dataset: umarigan/deepseek-r1-reasoning-prompts
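Since the post mentions using the extracted reasoning to train smaller models, here is a minimal sketch of how the collected records might be turned into prompt/completion pairs. The `<think>` wrapper, the `prompt`/`completion` field names, and the helper names are assumptions for illustration; they are not part of the original post or the published dataset.

```python
import json


def to_training_record(example: dict) -> dict:
    """Turn one extracted example into a prompt/completion pair.

    Wrapping the reasoning in <think> tags is an assumed convention,
    not something the original post specifies.
    """
    completion = (
        f"<think>\n{example['reasoning'].strip()}\n</think>\n\n"
        f"{example['answer'].strip()}"
    )
    return {
        "id": example["id"],
        "prompt": example["question"],
        "completion": completion,
    }


def to_jsonl(examples: list) -> str:
    """Serialize records as JSON Lines, one training record per line."""
    return "\n".join(
        json.dumps(to_training_record(e), ensure_ascii=False) for e in examples
    )


# Hypothetical record in the same shape as the entries in reasoning_data
sample = [
    {
        "id": "ex-1",
        "question": "What is 2 + 3?",
        "reasoning": "Add 2 and 3 to get 5.",
        "answer": "5",
    }
]
jsonl_text = to_jsonl(sample)
```

Keeping the reasoning and the answer in one completion string is only one option; storing them as separate columns, as the extraction loop does, leaves more room to experiment with formats later.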
jeffboudier 
posted an update 23 days ago
NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes tokenizers: nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos
jeffboudier 
posted an update 4 months ago
Inference Endpoints got a bunch of cool updates yesterday. This is my top 3
jeffboudier 
posted an update 4 months ago
Pro tip: if you're a Firefox user, you can set up Hugging Chat as an integrated AI assistant, with contextual links to summarize or simplify any text. Handy!

In this short video I show how to set it up
mrm8488 
posted an update 7 months ago
🚨Exciting news for the Multilingual Synthetic Data Community!🚨

I’ve taken inspiration from the MAGPIE paper on Llama-3-8B-instruct and extended its capabilities. Here’s what’s new!

🗞 The MAGPIE paper showed that if you use the instruction-tuned version (Llama-3-8B-instruct) to generate synthetic instructions and then fine-tune the base version (Llama-3-8B) on this dataset, the result can outperform even the instruction-tuned version.

🤔 While reading a script by Sebastian Raschka, PhD, I wondered: Could these advancements be replicated in other languages? Specifically, could they benefit non-English datasets?

🎉 And the answer is YES! At least for Spanish. I've successfully adapted the techniques for Spanish, proving the model's flexibility and multilingual capabilities.

👩‍💻 To make this accessible, I created a basic script (heavily inspired by Sebastian Raschka's) that automatically generates similar datasets using ollama models (initially phi and llama3) and uploads them to the Hugging Face Hub!
[Script](https://gist.github.com/mrm8488/4650a5e3cc45523798a527a3446eb312)
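As a rough illustration of the MAGPIE idea described above, the sketch below prompts an instruction-tuned model with only the opening of a user turn so that the model writes the instruction itself, then asks the same model to answer it. The special tokens follow the Llama 3 chat template; `generate` is a hypothetical callable wrapping whatever raw completion endpoint you use, and `fake_generate` is a stand-in so the example runs offline. This is not the linked script.

```python
from typing import Callable

# Llama 3 chat-template pieces
PRE_QUERY = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
EOT = "<|eot_id|>"
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>\n\n"


def magpie_pair(generate: Callable[[str], str]) -> dict:
    """Produce one synthetic instruction/response pair."""
    # Step 1: the model sees only the start of a user turn,
    # so its continuation is a plausible user instruction.
    instruction = generate(PRE_QUERY).strip()
    # Step 2: close the user turn and open the assistant turn
    # so the model answers its own instruction.
    prompt = PRE_QUERY + instruction + EOT + ASSISTANT_HEADER
    response = generate(prompt).strip()
    return {"instruction": instruction, "output": response}


def fake_generate(prompt: str) -> str:
    """Offline stand-in for a real model call (hypothetical)."""
    if prompt.endswith(ASSISTANT_HEADER):
        return "La capital de España es Madrid."
    return "¿Cuál es la capital de España?"


pair = magpie_pair(fake_generate)
```

Passing the generator in as a function keeps the pairing logic independent of the backend, so the same loop could be pointed at different local models.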


🔍 Explore the datasets 📚 generated using our new script!

- [Llama-3-8B](https://huggingface.co/datasets/mrm8488/dataset_llama3_5000_samples_es_4231_filtered)
- [Phi-3-medium](https://huggingface.co/datasets/mrm8488/dataset_phi3-medium_5000_samples_es_3906_filtered)
- [Phi-3-mini](https://huggingface.co/datasets/mrm8488/dataset_phi3_5000_samples_es_3282_filtered)


Note: These datasets have basic filtering. Apply additional quality filters before using them to fine-tune large language models.
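One possible shape for such additional filtering is sketched below: it drops very short or very long instructions, empty outputs, and near-duplicate instructions. The `instruction`/`output` field names and the length thresholds are assumptions for illustration and may not match the datasets linked above.

```python
def filter_pairs(pairs, min_chars=10, max_chars=2000):
    """Keep pairs with a reasonably sized, previously unseen instruction
    and a non-empty output."""
    seen = set()
    kept = []
    for pair in pairs:
        instruction = pair["instruction"].strip()
        output = pair["output"].strip()
        # Length filter on the instruction
        if not (min_chars <= len(instruction) <= max_chars):
            continue
        # Drop pairs whose output is empty after stripping whitespace
        if not output:
            continue
        # Deduplicate on a case- and whitespace-normalized instruction
        key = " ".join(instruction.lower().split())
        if key in seen:
            continue
        seen.add(key)
        kept.append({"instruction": instruction, "output": output})
    return kept


# Hypothetical sample rows
raw = [
    {"instruction": "¿Cuál es la capital de España?", "output": "Madrid."},
    {"instruction": "¿cuál es la  capital de españa?", "output": "Madrid es la capital."},
    {"instruction": "Hola", "output": "¡Hola!"},
    {"instruction": "Explica qué es un modelo de lenguaje.", "output": "   "},
]
clean = filter_pairs(raw)
```

Language identification or model-based scoring could be layered on top of this; the sketch only covers the cheapest checks.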

Inspiration and base script:
https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/05_dataset-generation/llama3-ollama.ipynb
https://www.linkedin.com/feed/update/urn:li:activity:7210982019751661568/
radames 
posted an update 8 months ago
Thanks to @OzzyGT for pushing the new Anyline preprocessor to https://github.com/huggingface/controlnet_aux. Now you can use the TheMistoAI/MistoLine ControlNet with Diffusers completely.

Here's a demo for you: radames/MistoLine-ControlNet-demo
Super resolution version: radames/Enhance-This-HiDiffusion-SDXL

from controlnet_aux import AnylineDetector
from PIL import Image

# Load the Anyline line-art preprocessor weights from the MistoLine repo
anyline = AnylineDetector.from_pretrained(
    "TheMistoAI/MistoLine", filename="MTEED.pth", subfolder="Anyline"
).to("cuda")

source = Image.open("source.png")
result = anyline(source, detect_resolution=1280)
radames 
posted an update 9 months ago
At Google I/O 2024, we're collaborating with the Google Visual Blocks team (https://visualblocks.withgoogle.com) to release custom Hugging Face nodes. Visual Blocks for ML is a browser-based tool that allows users to create machine learning pipelines using a visual interface. We're launching nodes with Transformers.js, which run models in the browser, as well as server-side nodes that run Transformers pipeline tasks and LLMs using our hosted inference. With @Xenova @JasonMayes

You can learn more about it here https://huggingface.co/blog/radames/hugging-face-google-visual-blocks

Source code for the custom nodes:
https://github.com/huggingface/visual-blocks-custom-components
radames 
posted an update 9 months ago
HiDiffusion SDXL now supports Image-to-Image, so I've created an "Enhance This" version using the latest ControlNet Line Art model, MistoLine. It's faster than DemoFusion.

Demo: radames/Enhance-This-HiDiffusion-SDXL

Older version based on DemoFusion: radames/Enhance-This-DemoFusion-SDXL

New ControlNet for SDXL that controls every line: TheMistoAI/MistoLine

HiDiffusion is compatible with diffusers and supports many SD models: https://github.com/megvii-research/HiDiffusion
mrm8488 
posted an update 9 months ago
Working on a concept GPT-2 (small) that uses KANs instead of MLPs.
The checkpoint and training code will be on the Hub soon.
radames 
posted an update 9 months ago
I've built a custom component that integrates Rerun web viewer with Gradio, making it easier to share your demos as Gradio apps.

Basic snippet
# pip install gradio_rerun gradio
import gradio as gr
from gradio_rerun import Rerun

gr.Interface(
    inputs=gr.File(file_count="multiple", type="filepath"),
    outputs=Rerun(height=900),
    fn=lambda file_path: file_path,
).launch()

More details here: radames/gradio_rerun
Source: https://github.com/radames/gradio-rerun-viewer

Follow Rerun here: https://huggingface.co/rerun