MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 126
Sensor-based Multi-Robot Search and Coverage with Spatial Separation in Unstructured Environments Paper • 2403.01710 • Published Mar 4, 2024 • 2
EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models Paper • 2308.14352 • Published Aug 28, 2023
Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems Paper • 2306.12691 • Published Jun 22, 2023 • 2
MicroNAS: Memory and Latency Constrained Hardware-Aware Neural Architecture Search for Time Series Classification on Microcontrollers Paper • 2310.18384 • Published Oct 27, 2023 • 2
Pattern Discovery in Time Series with Byte Pair Encoding Paper • 2106.00614 • Published May 30, 2021 • 2
Towards a World-English Language Model for On-Device Virtual Assistants Paper • 2403.18783 • Published Mar 27, 2024 • 4
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs Paper • 2403.20041 • Published Mar 29, 2024 • 34
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 257