sam2ai's Collections

audio_llm

updated Sep 9, 2023

  • LLaSM: Large Language and Speech Model

    Paper • 2308.15930 • Published Aug 30, 2023 • 34

  • SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

    Paper • 2308.06873 • Published Aug 14, 2023 • 27

  • AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

    Paper • 2308.05734 • Published Aug 10, 2023 • 37

  • JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

    Paper • 2308.04729 • Published Aug 9, 2023 • 32

  • WavJourney: Compositional Audio Creation with Large Language Models

    Paper • 2307.14335 • Published Jul 26, 2023 • 44