Joseph Robert Turcotte PRO

Fishtiks

AI & ML interests

Roleplaying, lorabration, abliteration, smol models, extensive filtering, unusual datasets, home usage, HPCs for AI, distributed training/federated learning, and sentience. AI should find and label AI hallucinations with GANs so we can give them context and use.

Recent Activity

Organizations

Smilyai labs, Smilyai labs community

Fishtiks's activity

reacted to DualityAI-RebekahBogdanoff's post with 👀 about 11 hours ago
Can AI models trained solely on 100% synthetic data achieve top-tier accuracy in real-world object detection?

👉 Sergio Sanz, PhD just proved it while winning Duality AI’s Synthetic-to-Real Object Detection Challenge using Falcon-generated imagery. His model achieved perfect real-world detection accuracy without a single real image in the training loop.

In this blog, Dr. Sanz walks us through his method, which includes the design and training of an advanced pipeline to achieve 100% detection accuracy.
His full technical breakdown covers:
📍 Synthetic-only training
📍 Data augmentation with an ensemble learning approach for better generalization
📍 Custom occlusion generation
📍 A Faster R-CNN model fine-tuned with Falcon generated data
📍 And much more!
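
The post does not describe how the custom occlusions are generated, so the following is only a minimal sketch of the general idea: paste a random rectangle over a training image so the detector learns to cope with partially hidden objects. The function name `add_random_occlusion`, the `max_fraction` parameter, and the list-of-rows image format are all assumptions made for illustration, not details from the blog.

```python
import random


def add_random_occlusion(image, max_fraction=0.3, fill=0, rng=None):
    """Return a copy of `image` (a list of pixel rows) with one random
    rectangle overwritten by `fill`, plus the rectangle as (left, top, w, h)."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])

    # Each side of the rectangle is at most `max_fraction` of the image side.
    occ_h = rng.randint(1, max(1, int(height * max_fraction)))
    occ_w = rng.randint(1, max(1, int(width * max_fraction)))
    top = rng.randint(0, height - occ_h)
    left = rng.randint(0, width - occ_w)

    # Copy the rows so the original image is left untouched.
    occluded = [row[:] for row in image]
    for y in range(top, top + occ_h):
        for x in range(left, left + occ_w):
            occluded[y][x] = fill
    return occluded, (left, top, occ_w, occ_h)


if __name__ == "__main__":
    blank = [[255] * 10 for _ in range(10)]
    augmented, box = add_random_occlusion(blank, rng=random.Random(0))
    print("occluded rectangle (left, top, w, h):", box)
```

In a detection setting the ground-truth boxes would normally stay unchanged, since the object is still there even when part of it is hidden.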

The results speak for themselves!
📖 Read the blog here: https://www.duality.ai/blog/leveraging-synthetic-data-for-real-world-object-detection

Congratulations Sergio! We can't wait to see what you do next.

🔔 Ready to take on the next Synthetic-to-Real challenge? The third edition of our Kaggle competition—Multi-Instance Object Detection Challenge—is now live: https://www.kaggle.com/competitions/multi-instance-object-detection-challenge
reacted to csabakecskemeti's post with 😎 3 days ago
Has anyone ever backed up a model to a sequential tape drive, or am I the world's first? :D
Just played around with my retro PC, which has a tape drive. Did it just because I can.
replied to csabakecskemeti's post 3 days ago

Optical will make a comeback before tape. Discs are now being written electronically rather than with lasers, which puts storage in the petabyte range. Holographic storage using lithium niobate is coming back, and is actually quite good. New lenses like 2D Maxwell fisheye lenses are making sensing robots more accurate, and they act as waveguides that better allow photonic processing. LightOn's Appliance is a practical example of photonic processing, and Akhetonics is working on an "XPU" that is, to my understanding, also quantum.

Can huge advanced AI concentrate data to reduce its size? We may find new methods of compression, essentially telling AI how to create files from little data, allowing AI to perhaps fit on a Jaz drive.

reacted to frascuchon's post with 🚀 3 days ago
Unlock the full potential of your datasets with SHEETS! It's incredibly easy to extend existing datasets and unlock new insights.

Leverage open-source models to translate, summarize, classify, and more - all directly within your existing columns.

Ready to give it a try? Explore the possibilities here: aisheets/sheets
  • 2 replies
replied to their post 3 days ago
reacted to DualityAI-RebekahBogdanoff's post with 🚀 4 days ago
🗣️ 📢 New article alert!

"Integrity Threats in AI: When Data Poisoning Undermines Model Effectiveness" from Duality AI is now on HuggingFace here: https://huggingface.co/blog/DualityAI-RebekahBogdanoff/integrity-threats-in-ai

Significant threats to AI model performance aren’t always loud or obvious. Integrity violations—like subtle data poisoning attacks—can quietly erode your model’s reliability, long before anyone notices. These attacks can be surprisingly effective with minimal changes to the dataset.

At Duality, our work in high-stakes sectors like defense has driven us to tackle this threat head-on. In our latest blog from David Strout, Duality's Director of Infrastructure and Security, we unpack how data poisoning works, why it’s so dangerous, and how organizations can secure their AI pipelines with clear provenance, regular performance auditing, and a trusted synthetic data supply chain.

Whether you're building AI models for finance, healthcare, manufacturing, or national security—the integrity of these systems is a matter of public safety and security. Taking action today will mitigate fundamental business risks in the very near tomorrow.
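
As a rough illustration of what "clear provenance" can mean in practice, one simple approach is to record a cryptographic fingerprint of every training file when the dataset is approved, then re-check those fingerprints before each training run. This is a generic sketch, not the method from the Duality article; the helper names `build_manifest` and `find_tampered` and the sample file names are made up for the example.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(records):
    """Map each file name to the fingerprint of its trusted content."""
    return {name: fingerprint(data) for name, data in records.items()}


def find_tampered(records, manifest):
    """Names whose content is missing from the manifest or no longer matches it."""
    return sorted(
        name for name, data in records.items()
        if manifest.get(name) != fingerprint(data)
    )


if __name__ == "__main__":
    trusted = {"img_001.label": b"cat", "img_002.label": b"dog"}
    manifest = build_manifest(trusted)

    # Simulate a subtle poisoning attempt: one label quietly flipped.
    current = dict(trusted)
    current["img_002.label"] = b"cat"
    print(find_tampered(current, manifest))
```

A check like this only detects changes made after the manifest was created, so it complements performance auditing and does not replace it.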
reacted to cbensimon's post with 🔥 4 days ago
🚀 ZeroGPU now supports PyTorch native quantization via torchao

While it hasn’t been battle-tested yet, Int8WeightOnlyConfig is already working flawlessly in our tests.

Let us know if you run into any issues — and we’re excited to see what the community will build!

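import spaces  # Hugging Face ZeroGPU helper package
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_

# Load the pipeline and move it to the GPU (checkpoint arguments are elided in the original post)
pipeline = FluxPipeline.from_pretrained(...).to('cuda')
# Quantize the transformer's weights to int8 in place; any other component(s) can be handled the same way
quantize_(pipeline.transformer, Int8WeightOnlyConfig())

@spaces.GPU  # Request a GPU for the duration of each call
def generate(prompt: str):
    return pipeline(prompt).images[0]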
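
For readers wondering what int8 weight-only quantization does to a weight matrix, here is a conceptual sketch in plain Python. It is not torchao's implementation (which may differ in granularity, rounding, and kernels); it assumes a simple symmetric scheme with one scale per row, and the function names are hypothetical.

```python
def quantize_row_int8(row):
    """Symmetric quantization of one weight row: floats -> (int values, scale)."""
    max_abs = max(abs(w) for w in row)
    # One scale per row maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q, scale):
    """Recover approximate float weights; activations stay in floating point."""
    return [v * scale for v in q]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_row_int8(weights)
    restored = dequantize_row(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("int8 values:", q)
    print("largest rounding error:", worst)
```

Only the weights are stored in 8 bits here, which is why this family of methods mainly saves memory; each restored weight is off by at most half a quantization step.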
reacted to MonsterMMORPG's post with 🔥 4 days ago
Ultimate ComfyUI & SwarmUI on RunPod Tutorial with Addition RTX 5000 Series GPUs & 1-Click to Setup : https://youtu.be/R02kPf9Y3_w

Tutorial Video : https://youtu.be/R02kPf9Y3_w

If you want to use ComfyUI, or SwarmUI with a ComfyUI backend, on the RunPod cloud platform, this is the ultimate tutorial for installing and using each of them step by step. RunPod is a great platform for scaling your AI generation or, if you are GPU poor, for renting the very best GPUs and leveraging AI in your profession. ComfyUI is the ultimate ecosystem right now for image and video generation models, and with the SwarmUI interface leveraging ComfyUI, you can become a master of gen AI. So learn how to install ComfyUI on RunPod step by step and run it. Then learn how to install SwarmUI on RunPod step by step and how to use it. Then learn how to connect the installed ComfyUI backend to SwarmUI and leverage its features, performance, and optimizations. Moreover, the installers I made install Torch 2.7, CUDA 12.8, xFormers, Sage Attention, Flash Attention, Accelerate, Triton, DeepSpeed, ComfyUI manager, and more.

🔗ComfyUI Installer Zip File Download ⤵️
▶️ https://www.patreon.com/posts/Advanced-ComfyUI-1-Click-Installer-105023709

🔗SwarmUI Installer and Model Downloader Zip File Download ⤵️
▶️ https://www.patreon.com/posts/SwarmUI-Installer-AI-Videos-Downloader-114517862

▶️ Download & Upload Models Tutorial (wget) : https://youtu.be/X5WVZ0NMaTg

▶️ CausVid LoRA V2 Tutorial : https://youtu.be/1rAwZv0hEcU

▶️ CausVid Main Tutorial : https://youtu.be/fTzlQ0tjxj0

▶️ SwarmUI Master Tutorial : https://youtu.be/HKX8_F1Er_w

🔗 SECourses Official Discord 10500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

reacted to dvilasuero's post with 🔥 4 days ago
Super excited to launch Hugging Face Sheets: Spreadsheets meet AI and unstructured data.

A few months ago, we started imagining new ways to build and transform datasets with the latest open-source models.

Today, I'm thrilled to introduce our first step in this direction.


In a nutshell:

📁 Effortlessly run prompts and models over your data.
🌐 Agentic search for accuracy and real-time information.
🖼️ Familiar, minimalistic interface for interacting with data.
🎯 Human feedback 2.0: Your input directly improves generated data.
💯 Access hundreds of open models and leading inference providers.

Go to this space to try it out!

aisheets/sheets

Leave your questions below, we're just getting started!
  • 1 reply
reacted to drwlf's post with 🤗 6 days ago
Having an insanely good medical LLM is pointless if it won’t answer your questions!

So we’ve made 2 notebooks for abliterating any model in order to achieve a good model that will actually help you!

The notebooks are made using @mlabonne's abliteration logic and datasets!

Feel free to use them and happy training 😊

https://github.com/dralexlup/LLM-Abliteration
reacted to Kseniase's post with 👍 6 days ago
12 Foundational AI Model Types

Let’s refresh some fundamentals today to stay fluent in what we all work with. Here are some of the most popular model types that shape the vast world of AI (with examples in brackets):

1. LLM - Large Language Model (GPT, LLaMA) -> Large Language Models: A Survey (2402.06196)
+ history of LLMs: https://www.turingpost.com/t/The%20History%20of%20LLMs
LLMs are trained on massive text datasets to understand and generate human language. They are mostly built on the Transformer architecture and work by predicting the next token. LLMs scale by increasing overall parameter count across all components (layers, attention heads, MLPs, etc.)

2. SLM - Small Language Model (TinyLLaMA, Phi models, SmolLM) -> A Survey of Small Language Models (2410.20011)
Lightweight LM optimized for efficiency, low memory use, fast inference, and edge use. SLMs work using the same principles as LLMs

3. VLM - Vision-Language Model (CLIP, Flamingo) -> An Introduction to Vision-Language Modeling (2405.17247)
Processes and understands both images and text. VLMs map images and text into a shared embedding space or generate captions/descriptions from both

4. MLLM - Multimodal Large Language Model (Gemini) -> A Survey on Multimodal Large Language Models (2306.13549)
A large-scale model that can understand and process multiple types of data (modalities) — usually text + other formats, like images, videos, audio, structured data, 3D or spatial inputs. MLLMs can be LLMs extended with modality adapters or trained jointly across vision, text, audio, etc.

5. LAM - Large Action Model (InstructDiffusion, RT-2) -> Large Action Models: From Inception to Implementation (2412.10047)
Understands and generates action sequences by predicting action tokens (discrete/continuous instructions) that guide agents. Trained on behavior datasets, LAMs generalize across tasks, environments, and modalities - video, sensor data, etc.
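
To make the "predicting the next token" idea from item 1 concrete, here is a deliberately tiny stand-in. Real LLMs use Transformers with learned probabilities over a large vocabulary; this sketch only counts which word follows which in a toy corpus and then generates greedily. All names are illustrative.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, how often every other token follows it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, token):
    """Most frequent follower of `token`, or None if it was never followed."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_new_tokens=5):
    """Greedy decoding: repeatedly append the most likely next token."""
    out = [start]
    for _ in range(max_new_tokens):
        nxt = predict_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return out


if __name__ == "__main__":
    corpus = "the cat sat on the cat mat".split()
    model = train_bigram(corpus)
    print(generate(model, "on", max_new_tokens=2))
```

The same generate loop applies to an LLM or SLM; what changes is that the lookup table is replaced by a neural network that scores every token in the vocabulary.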

Read about LRM, MoE, SSM, RNN, CNN, SAM and LNN below👇

Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe
  • 2 replies
reacted to merterbak's post with 🔥 about 1 month ago
Seed-Coder has been released. Designed for coding tasks, it comes in base, instruct, and reasoning variants at the 8B parameter scale and was developed by the ByteDance Seed team. Unlike traditional open-source LLMs that rely on human-crafted rules or annotated data to curate code pretraining datasets, Seed-Coder introduces a model-centric data pipeline. The pipeline processes raw data from GitHub and web archives into four categories: file-level code, repository-level code, GitHub commits, and code-related web data. A quality-filter LLM evaluates code for readability, modularity, clarity, and reusability, and the lowest-scoring 10% is removed to create a 6 trillion token dataset supporting 89 programming languages.
Models: ByteDance-Seed/seed-coder-680de32c15ead6555c75b0e4
Github: https://github.com/ByteDance-Seed/Seed-Coder/tree/master
Paper: https://github.com/ByteDance-Seed/Seed-Coder/blob/master/Seed-Coder.pdf
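
The filtering step described above (score each file, drop the lowest 10%) can be sketched as follows. The scores here are hypothetical numbers standing in for whatever the quality-filter LLM would output; how Seed-Coder actually computes and thresholds its scores is covered in the paper, not in this sketch.

```python
def drop_lowest_fraction(scored_files, fraction=0.10):
    """Keep everything except the lowest-scoring `fraction` of files.

    `scored_files` is a list of (file_name, quality_score) pairs.
    """
    ranked = sorted(scored_files, key=lambda item: item[1])
    n_drop = int(len(ranked) * fraction)
    return ranked[n_drop:]


if __name__ == "__main__":
    # Hypothetical scores, e.g. an average of readability, modularity,
    # clarity, and reusability ratings for each file.
    files = [(f"file_{i}.py", score) for i, score in enumerate(
        [0.91, 0.15, 0.78, 0.66, 0.05, 0.83, 0.72, 0.59, 0.88, 0.47]
    )]
    kept = drop_lowest_fraction(files)
    print(len(files), "files in,", len(kept), "files kept")
```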
reacted to prithivMLmods's post with 👍 about 1 month ago
Dropping some image classification models for content moderation, balancers, and classifiers trained on synthetic datasets—along with others based on datasets available on the Hub. Also loaded a few low-rank datasets for realistic gender portrait classification and document-type classifiers, all fine-tuned on the SigLIP-2 Patch-16 224 backbone. Models and datasets are listed below:

🤗Models & Datasets :

Realistic Gender Classification : prithivMLmods/Realistic-Gender-Classification
prithivMLmods/Realistic-Portrait-Gender-1024px
Document Type Detection : prithivMLmods/Document-Type-Detection
prithivMLmods/Document-Type-Detection
Face Mask Detection : prithivMLmods/Face-Mask-Detection
DamarJati/Face-Mask-Detection
Alzheimer Stage Classifier : prithivMLmods/Alzheimer-Stage-Classifier
SilpaCS/Augmented_alzheimer
Bone Fracture Detection : prithivMLmods/Bone-Fracture-Detection
Hemg/bone-fracture-detection
GiD Land Cover Classification : prithivMLmods/GiD-Land-Cover-Classification
jonathan-roberts1/GID

🤗Collection : prithivMLmods/siglip2-05102025-681c2b0e406f0740a993fc1c

To know more about it, visit the model card of the respective model.
reacted to MonsterMMORPG's post with 👀 about 1 month ago
TRELLIS is still the leading open-source AI model for generating high-quality 3D assets from static images. Some mind-blowing examples included. It also supports improved multi-angle image-to-3D and works on GPUs with as little as 6 GB.


Tutorial link : https://www.youtube.com/watch?v=EhU7Jil9WAk

App Link : https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976

Our app is super advanced with so many features and supports GPUs with as little as 6 GB

It also fully supports RTX 5000 GPUs

TRELLIS is currently the state-of-the-art open-source image-to-3D generator that runs locally and produces very high-quality assets. I have developed 1-click installers and a super advanced Gradio app for this model with so many amazing features. In this tutorial video I will show you step by step how to use this amazing AI tool and generate the very best, very high-quality 3D assets locally. Moreover, you can also use this tool on RunPod and Massed Compute if you are GPU poor.

🔗Follow below link to download the zip file that contains Trellis installer and Gradio App - the one used in the tutorial ⤵️
▶️ https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976

🔗 Python, Git, CUDA, C++ Tools, FFmpeg, cuDNN, MSVC installation tutorial - needed for AI apps - 1-time only setup⤵️
▶️ https://youtu.be/DrhUHnYfwC0

🔗 SECourses Official Discord 10500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 SECourses Official Reddit - Stay Subscribed To Learn All The News and More ⤵️
▶️ https://www.reddit.com/r/SECourses/

🔗Official TRELLIS Repo ⤵️
▶️ https://github.com/microsoft/TRELLIS
reacted to onekq's post with 🚀 about 1 month ago
The new Mistral medium model is very impressive for its size. Will it be open sourced given the history of Mistral? Does anyone have insights?

onekq-ai/WebApp1K-models-leaderboard
reacted to nomadicsynth's post with 👍🔥 about 1 month ago
I Did a Thing!

I made an embedding model to find answers in research papers. It goes deeper than plain "semantic search" by identifying deeply reasoned connections and interdisciplinary insights that might have been overlooked. The goal is to find the solutions that might have been missed and to uncover answers that are already out there.
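
For context, the plain "semantic search" baseline that an embedding model like this builds on usually boils down to ranking documents by cosine similarity to the query vector. The sketch below shows only that baseline, with made-up two-dimensional vectors and hypothetical paper titles; it says nothing about how the model itself produces its embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_papers(query_vec, paper_vecs):
    """Return (title, score) pairs sorted from most to least similar."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in paper_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy embeddings; a real model would output hundreds of dimensions.
    papers = {
        "Paper A": [1.0, 0.0],
        "Paper B": [0.0, 1.0],
        "Paper C": [1.0, 1.0],
    }
    for title, score in rank_papers([1.0, 0.1], papers):
        print(f"{title}: {score:.3f}")
```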

I’ve set up a demo Space - nomadicsynth/inkling . It’s early days, and I’d love some feedback on the model’s results. Try it out and let me know what you think!

Oh, and if it finds your Nobel-winning answer, I want a cut! 😉
reacted to sequelbox's post with 👀 about 1 month ago
NEW RELEASE: Esper 3 for Qwen 3!

- A full-stack software assistant: a reasoning finetune focused on coding, architecture, and DevOps using the Titanium and Tachibana datasets!
- Improved general and creative reasoning skills, powered by the Raiden dataset.

4B model: ValiantLabs/Qwen3-4B-Esper3
8B model: ValiantLabs/Qwen3-8B-Esper3

We'll also be bringing Esper 3 to larger Qwen 3 models as soon as we can - if you want these, consider helping us out: sequelbox/SupportOpenSource

More models and datasets to come soon!

with my love and enthusiasm,
allegra
reacted to AdinaY's post with 😎 about 1 month ago
ACE-Step 🎵 a music generation foundation model released by
StepFun & ACEStudio

Model: ACE-Step/ACE-Step-v1-3.5B
Demo: ACE-Step/ACE-Step

✨ 3.5B, Apache2.0 licensed
✨ 115× faster than LLMs (4-min music in 20s on A100)
✨ Diffusion + DCAE + linear transformer = speed + coherence
✨ Supports voice cloning, remixing, lyric editing & more
  • 1 reply
reacted to merve's post with 🚀 about 1 month ago
A real-time object detector that is much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers 🔥

D-FINE is the sota real-time object detector that runs on T4 (free Colab) 🤩

> Collection with all checkpoints and demo ustc-community/d-fine-68109b427cbe6ee36b4e7352

Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩

Regular object detectors attempt to predict bounding boxes in (x, y, w, h) pixel perfect coordinates, which is very rigid and hard to solve 🥲☹️

D-FINE formulates object detection as a distribution for bounding box coordinates, refines them iteratively, and it's more accurate 🤩

Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️

This model uses the final layer's distribution output (sort of like a teacher) to distill into earlier layers, making the early layers more performant.
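
A simplified sketch of both ideas, under stated assumptions: each box edge is predicted as a probability distribution over a few candidate offsets (the bin values below are invented), the edge position is read off as the expected value, and a KL divergence term pulls an earlier layer's distribution toward the final layer's. D-FINE's actual bin layout, weighting, and loss are defined in the paper and are not reproduced here.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_offset(logits, bin_values):
    """Read an edge offset as the mean of the predicted distribution."""
    probs = softmax(logits)
    return sum(p * b for p, b in zip(probs, bin_values))


def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student): how far the student is from the teacher."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher_probs, student_probs)
    )


if __name__ == "__main__":
    bins = [-2, -1, 0, 1, 2]  # hypothetical offsets, in pixels
    final_layer = [0.0, 0.0, 0.0, 10.0, 0.0]  # confident teacher
    early_layer = [0.0, 0.0, 2.0, 3.0, 0.0]   # less certain student

    print("final-layer offset:", expected_offset(final_layer, bins))
    print("early-layer offset:", expected_offset(early_layer, bins))
    print("distillation loss:",
          kl_divergence(softmax(final_layer), softmax(early_layer)))
```

Minimizing that loss during training nudges the early layer's distribution toward the final layer's, which is the teacher and student relationship the post describes.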

  • 2 replies