Stephen
smpanaro
AI & ML interests
Apple Neural Engine, Quantization
Organizations

Apple Neural Engine LLMs
CoreML LLMs optimized for Apple Neural Engine.

quant
- SqueezeLLM: Dense-and-Sparse Quantization
  Paper • 2306.07629 • Published • 4
- Norm Tweaking: High-performance Low-bit Quantization of Large Language Models
  Paper • 2309.02784 • Published • 2
- Extreme Compression of Large Language Models via Additive Quantization
  Paper • 2401.06118 • Published • 13
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
  Paper • 2402.04291 • Published • 51
gpt-2 GPTQ
gpt-2 model family quantized using AutoGPTQ.
prune
text to speech
interesting
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 28
- WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
  Paper • 2312.14187 • Published • 51
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 62
- MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
  Paper • 2404.06395 • Published • 23
Pythia GPTQ
Pythia model family quantized using AutoGPTQ.