
Framework for on-device AI inference.
A cross-platform framework for deploying LLMs, VLMs, embedding models, TTS models, and more locally on smartphones.
- Available in Flutter and React-Native for cross-platform developers.
- Supports any GGUF model you can find on Hugging Face: Qwen, Gemma, Llama, DeepSeek, etc.
- Runs models from FP32 down to 2-bit quantization, for efficiency and lower device strain.
- MCP tool calls that let models act on the device (set reminders, search the gallery, reply to messages, etc.).
- iOS XCFramework and Android JNI libs for native setups.
- Neat and tiny C++ build for custom hardware.
- Chat templates with Jinja2 support.
- Update `pubspec.yaml`: add `cactus` to your project's dependencies. Ensure you have `flutter: sdk: flutter` (usually present by default).

  ```yaml
  dependencies:
    flutter:
      sdk: flutter
    cactus: ^0.1.3
  ```
- Install dependencies by executing the following command in your project terminal:

  ```shell
  flutter pub get
  ```
- Flutter Text Completion

  ```dart
  import 'package:cactus/cactus.dart';

  // Initialize
  final lm = await CactusLM.init(
    modelUrl: 'huggingface/gguf/link',
    nCtx: 2048,
  );

  // Completion
  final messages = [CactusMessage(role: CactusMessageRole.user, content: 'Hello!')];
  final params = CactusCompletionParams(nPredict: 100, temperature: 0.7);
  final response = await lm.completion(messages, params);

  // Embedding (note: a distinct variable name, since `params` is already declared above)
  final text = 'Your text to embed';
  final embedParams = CactusEmbeddingParams(normalize: true);
  final result = await lm.embedding(text, embedParams);
  ```
- Flutter VLM Completion

  ```dart
  import 'package:cactus/cactus.dart';

  // Initialize (Flutter handles downloads automatically)
  final vlm = await CactusVLM.init(
    modelUrl: 'huggingface/gguf/link',
    mmprojUrl: 'huggingface/gguf/mmproj/link',
  );

  // Multimodal completion (multiple images can be added)
  final messages = [CactusMessage(role: CactusMessageRole.user, content: 'Describe this image')];
  final params = CactusVLMParams(
    images: ['/absolute/path/to/image.jpg'],
    nPredict: 200,
    temperature: 0.3,
  );
  final response = await vlm.completion(messages, params);
  ```
N.B.: See the Flutter docs; they cover chat design, embeddings, multimodal models, text-to-speech, and more.
- Install the `cactus-react-native` package:

  ```shell
  npm install cactus-react-native
  # or
  yarn add cactus-react-native
  ```

- Install iOS pods (if not using Expo). For native iOS projects, ensure you link the native dependencies by navigating to your `ios` directory and running:

  ```shell
  npx pod-install
  ```
- React-Native Text Completion

  ```typescript
  // Initialize
  const lm = await CactusLM.init({
    model: '/path/to/model.gguf',
    n_ctx: 2048,
  });

  // Completion
  const messages = [{ role: 'user', content: 'Hello!' }];
  const params = { n_predict: 100, temperature: 0.7 };
  const response = await lm.completion(messages, params);

  // Embedding (note: a distinct variable name, since `params` is already declared above)
  const text = 'Your text to embed';
  const embedParams = { normalize: true };
  const result = await lm.embedding(text, embedParams);
  ```
- React-Native VLM

  ```typescript
  // Initialize
  const vlm = await CactusVLM.init({
    model: '/path/to/vision-model.gguf',
    mmproj: '/path/to/mmproj.gguf',
  });

  // Multimodal completion (multiple images can be added)
  const messages = [{ role: 'user', content: 'Describe this image' }];
  const params = {
    images: ['/absolute/path/to/image.jpg'],
    n_predict: 200,
    temperature: 0.3,
  };
  const response = await vlm.completion(messages, params);
  ```
N.B.: See the React Native docs; they cover chat design, embeddings, multimodal models, text-to-speech, and various options.
The Cactus backend is written in C/C++ and runs directly on any ARM/x86 hardware: Raspberry Pi boards, phones, smart TVs, watches, speakers, cameras, laptops, etc.
Setup: you need CMake 3.14+ installed. Install it with `brew install cmake` on macOS or via the standard package managers on Linux.

Build from source:

```shell
git clone https://github.com/your-org/cactus.git
cd cactus
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
```
CMake integration: add to your `CMakeLists.txt`:

```cmake
# Add Cactus as a subdirectory
add_subdirectory(cactus)

# Link to your target (requires C++17 or higher)
target_link_libraries(your_target cactus)
target_include_directories(your_target PRIVATE cactus)
```
Basic Text Completion:

```cpp
#include "cactus/cactus.h"
#include <iostream>

int main() {
    cactus::cactus_context context;

    // Configure parameters
    common_params params;
    params.model.path = "model.gguf";
    params.n_ctx = 2048;
    params.n_threads = 4;
    params.n_gpu_layers = 99; // Use GPU acceleration

    // Load model
    if (!context.loadModel(params)) {
        std::cerr << "Failed to load model" << std::endl;
        return 1;
    }

    // Set prompt
    context.params.prompt = "Hello, how are you?";
    context.params.n_predict = 100;

    // Initialize sampling
    if (!context.initSampling()) {
        std::cerr << "Failed to initialize sampling" << std::endl;
        return 1;
    }

    // Generate response
    context.beginCompletion();
    context.loadPrompt();
    while (context.has_next_token && !context.is_interrupted) {
        auto token_output = context.doCompletion();
        if (token_output.tok == -1) break;
    }

    std::cout << "Response: " << context.generated_text << std::endl;
    return 0;
}
```
To learn more, see the C++ docs; they cover chat design, embeddings, multimodal models, text-to-speech, and more.
| Device | Gemma3 1B Q4 (toks/sec) | Qwen3 4B Q4 (toks/sec) |
|---|---|---|
| iPhone 16 Pro Max | 54 | 18 |
| iPhone 16 Pro | 54 | 18 |
| iPhone 16 | 49 | 16 |
| iPhone 15 Pro Max | 45 | 15 |
| iPhone 15 Pro | 45 | 15 |
| iPhone 14 Pro Max | 44 | 14 |
| OnePlus 13 5G | 43 | 14 |
| Samsung Galaxy S24 Ultra | 42 | 14 |
| iPhone 15 | 42 | 14 |
| OnePlus Open | 38 | 13 |
| Samsung Galaxy S23 5G | 37 | 12 |
| Samsung Galaxy S24 | 36 | 12 |
| iPhone 13 Pro | 35 | 11 |
| OnePlus 12 | 35 | 11 |
| Galaxy S25 Ultra | 29 | 9 |
| OnePlus 11 | 26 | 8 |
| iPhone 13 mini | 25 | 8 |
| Redmi K70 Ultra | 24 | 8 |
| Xiaomi 13 | 24 | 8 |
| Samsung Galaxy S24+ | 22 | 7 |
| Samsung Galaxy Z Fold 4 | 22 | 7 |
| Xiaomi Poco F6 5G | 22 | 6 |
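The throughput figures above map directly onto perceived latency: dividing reply length by decode speed gives a rough wall-clock estimate. The sketch below applies that arithmetic to two rows of the table; it deliberately ignores prompt-processing (prefill) time, so real first-token latency is somewhat higher.

```python
def reply_latency_s(tokens: int, toks_per_sec: float) -> float:
    """Approximate seconds to decode `tokens` at a steady rate
    (ignores prompt prefill, so real end-to-end latency is higher)."""
    return tokens / toks_per_sec

# Figures taken from the benchmark table above (Gemma3 1B Q4):
fast = reply_latency_s(100, 54)  # iPhone 16 Pro Max: ~1.9 s for a 100-token reply
slow = reply_latency_s(100, 22)  # Xiaomi Poco F6 5G: ~4.5 s
print(fast, slow)
```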
We are completely open-source and would appreciate feedback!
Models:

- Cactus-Compute/OuteTTS-0.2-500m-GGUF
- Cactus-Compute/Gemma3-4B-Instruct-GGUF
- Cactus-Compute/Qwen2.5-Omni-3B-GGUF
- Cactus-Compute/Qwen3-4B-Instruct-GGUF
- Cactus-Compute/Gemma3-1B-Instruct-GGUF
- Cactus-Compute/Qwen3-1.7B-Instruct-GGUF
- Cactus-Compute/Qwen3-600m-Instruct-GGUF
- Cactus-Compute/Qwen3-embedding-600m-GGUF