LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps Paper • 2412.15035 • Published 6 days ago • 4
Post: Google drops Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more. Now available in anychat, try it out: akhaliq/anychat
Post: @s3nh Hey man, check your Discord! Got some news.
Think Beyond Size: Adaptive Prompting for More Effective Reasoning Paper • 2410.08130 • Published Oct 10 • 1
Post: QwQ-32B-Preview is now available in anychat. A reasoning model that is competitive with OpenAI o1-mini and o1-preview. Try it out: akhaliq/anychat
Post: New model drop in anychat. allenai/Llama-3.1-Tulu-3-8B is now available. Try it here: akhaliq/anychat
Post: anychat supports ChatGPT, Gemini, Perplexity, Claude, Meta Llama, and Grok, all in one app. Try it out there: akhaliq/anychat
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19 • 47
Cascade-DETR: Delving into High-Quality Universal Object Detection Paper • 2307.11035 • Published Jul 20, 2023
Behavior Contrastive Learning for Unsupervised Skill Discovery Paper • 2305.04477 • Published May 8, 2023
Rethinking Memory and Communication Cost for Efficient Large Language Model Training Paper • 2310.06003 • Published Oct 9, 2023 • 2
SemiReward: A General Reward Model for Semi-supervised Learning Paper • 2310.03013 • Published Oct 4, 2023 • 1
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory Paper • 2404.11163 • Published Apr 17
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts Paper • 2405.19893 • Published May 30 • 29
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences Paper • 2406.08128 • Published Jun 12
Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions Paper • 2406.05688 • Published Jun 9
RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design Paper • 2301.10774 • Published Jan 25, 2023
Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training Paper • 2311.14109 • Published Nov 23, 2023
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning Paper • 2410.06373 • Published Oct 8 • 35