Daniel Huynh PRO

dhuynh95

AI & ML interests

None yet

Recent Activity

posted an update 6 days ago

🚀 Built an MVP of Screenshot to HTML this weekend: it quickly turns screenshots of mocks, competitors, or inspiration into a website using Gemini Flash!

🤗 Try it for free on Hugging Face Spaces: https://huggingface.co/spaces/dhuynh95/screenshot_to_html

🧠 You will need a Gemini API key, but here is a little-known fact: it’s free! Google has really shipped with Gemini 2.5, and the Flash model can be used for free. Great for experimentation. In this demo, you can see how AI can turn a screenshot of a website into a fully interactive static HTML page using Gemini.

🏴‍☠️ It was fun building it and getting back to weekend hacking. I tried many things for fun, such as using Gemini Flash to locate assets and recreate them, but that was not very successful. I also tried other models, but the fact that Gemini Flash is both smart AND free is a game changer. It’s great for builders!
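For readers curious what a screenshot-to-HTML call might look like, here is a minimal sketch, not the Space's actual code. It assumes the public Gemini REST `generateContent` endpoint; the model id `gemini-2.5-flash`, the prompt wording, and the helper names `build_request`, `extract_html`, and `send` are all illustrative assumptions that do not come from the post.

```python
import base64
import json
import urllib.request

# Assumptions: the post only says "Gemini Flash"; this model id and prompt are illustrative.
MODEL = "gemini-2.5-flash"
ENDPOINT = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"
PROMPT = "Recreate this screenshot as one self-contained static HTML page. Return only HTML."
FENCE = "`" * 3  # Markdown code fence that models often wrap their output in


def build_request(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a request body with one text part and one inline base64 image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": PROMPT},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }


def extract_html(response: dict) -> str:
    """Join the text parts of the first candidate and strip an optional code fence."""
    parts = response["candidates"][0]["content"]["parts"]
    text = "".join(part.get("text", "") for part in parts).strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines).strip()
    return text


def send(body: dict, api_key: str) -> dict:
    """POST the body to the endpoint (hypothetical helper; needs network and a valid key)."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

A typical use would read the screenshot bytes from disk, pass them through `build_request`, call `send` with your own API key, and write the result of `extract_html` to an `.html` file.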

updated a Space 6 days ago

dhuynh95/screenshot_to_html

published a Space 7 days ago

dhuynh95/screenshot_to_html

dhuynh95's activity

upvoted 4 papers about 2 months ago

Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead

Paper • 2504.00294 • Published Mar 31 • 10

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 118

Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking

Paper • 2503.19855 • Published Mar 25 • 28

UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning

Paper • 2503.21620 • Published Mar 27 • 62

upvoted 2 papers 2 months ago

Can Large Vision Language Models Read Maps Like a Human?

Paper • 2503.14607 • Published Mar 18 • 9

Where do Large Vision-Language Models Look at when Answering Questions?

Paper • 2503.13891 • Published Mar 18 • 8

upvoted 14 papers 3 months ago

On the Acquisition of Shared Grammatical Representations in Bilingual Language Models

Paper • 2503.03962 • Published Mar 5 • 4

How to Steer LLM Latents for Hallucination Detection?

Paper • 2503.01917 • Published Mar 1 • 11

LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation

Paper • 2503.02972 • Published Mar 4 • 25

MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs

Paper • 2502.17422 • Published Feb 24 • 7

Introducing Visual Perception Token into Multimodal Large Language Model

Paper • 2502.17425 • Published Feb 24 • 15

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?

Paper • 2502.17535 • Published Feb 24 • 8

VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model

Paper • 2502.18906 • Published Feb 26 • 12

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

Paper • 2502.19361 • Published Feb 26 • 28

MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

Paper • 2502.14302 • Published Feb 20 • 9

VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Paper • 2502.12084 • Published Feb 17 • 30

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Paper • 2502.08235 • Published Feb 12 • 58