---
license: apache-2.0
language:
- en
pipeline_tag: question-answering
---

# litert-community/Gecko-110m-en

This model provides a few variants of the embedding model published in the [Gecko paper](https://arxiv.org/abs/2403.20327) that are ready for deployment on Android or iOS using the [LiteRT stack](https://ai.google.dev/edge/litert) or the [Google AI Edge RAG SDK](https://ai.google.dev/edge/mediapipe/solutions/genai/rag).

## Use the models

### Android

* Try out the Gecko embedding model in the [Google AI Edge RAG SDK](https://ai.google.dev/edge/mediapipe/solutions/genai/rag). You can find the SDK on [GitHub](https://github.com/google-ai-edge/ai-edge-apis/tree/main/local_agents/rag) or follow our [Android guide](https://ai.google.dev/edge/mediapipe/solutions/genai/rag/android) to install it directly from Maven. We have also published a [sample app](https://github.com/google-ai-edge/ai-edge-apis/tree/main/examples/rag).
* Use the SentencePiece model as the tokenizer for the Gecko embedding model.

## Performance

### Android

Note that all benchmark stats are from a Samsung S23 Ultra.
| Quantization | Backend | Max sequence length | Init time (ms) | Inference time (ms) | Memory (RSS in MB) | Model size (MB) |
| --- | --- | --- | --- | --- | --- | --- |
| dynamic_int8 | GPU | 256 | 1306.06 | 76.2 | 604.5 | 114 |
| dynamic_int8 | GPU | 512 | 1363.38 | 173.2 | 604.6 | 120 |
| dynamic_int8 | GPU | 1024 | 1419.87 | 397 | 871.1 | 145 |
| dynamic_int8 | CPU | 256 | 11.03 | 147.6 | 126.3 | 114 |
| dynamic_int8 | CPU | 512 | 30.04 | 353.1 | 225.6 | 120 |
| dynamic_int8 | CPU | 1024 | 79.17 | 954 | 619.5 | 145 |
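One way to read the table above: dividing the maximum sequence length by the inference time gives an effective throughput in tokens per second. This is a derived figure computed from the published numbers, not a separately measured benchmark:

```python
# Effective throughput derived from the benchmark table:
# sequence length divided by inference time (ms converted to seconds).
benchmarks = [
    # (backend, max_seq_len, inference_ms)
    ("GPU", 256, 76.2),
    ("GPU", 512, 173.2),
    ("GPU", 1024, 397.0),
    ("CPU", 256, 147.6),
    ("CPU", 512, 353.1),
    ("CPU", 1024, 954.0),
]

for backend, seq_len, ms in benchmarks:
    tokens_per_s = seq_len / (ms / 1000.0)
    print(f"{backend} @ {seq_len} tokens: {tokens_per_s:.0f} tokens/s")
```

Note that GPU throughput is several times higher than CPU at every sequence length, at the cost of a much longer one-time init.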