yamayou's Collections

LLM

updated Aug 17, 2024

  • Jamba: A Hybrid Transformer-Mamba Language Model

    Paper • 2403.19887 • Published Mar 28, 2024 • 111

  • Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

    Paper • 2404.00399 • Published Mar 30, 2024 • 43

  • Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

    Paper • 2404.02258 • Published Apr 2, 2024 • 106

  • Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

    Paper • 2404.08801 • Published Apr 12, 2024 • 68

  • Multi-Head Mixture-of-Experts

    Paper • 2404.15045 • Published Apr 23, 2024 • 61

  • Octopus v4: Graph of language models

    Paper • 2404.19296 • Published Apr 30, 2024 • 119

  • Layerwise Recurrent Router for Mixture-of-Experts

    Paper • 2408.06793 • Published Aug 13, 2024 • 33

  • Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models

    Paper • 2408.06663 • Published Aug 13, 2024 • 16