Difan Jiao
difanjiao
AI & ML interests
Generative Models & Mech Interp
Recent Activity
upvoted a paper about 12 hours ago
LLM Safety From Within: Detecting Harmful Content with Internal Representations
updated a model about 13 hours ago
UofTCSSLab/SIREN-Llama-3.1-8B
published a model about 13 hours ago
UofTCSSLab/SIREN-Llama-3.1-8B