Run LLaMA and other large language models locally on iOS and macOS.

Various inference and sampling methods.


One app for macOS and iOS

Made with SwiftUI

Simplicity and speed of development. One codebase for all devices. A low barrier to entry.
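As a rough illustration of the shared-codebase idea (this is not code from the app; the type names are made up), a single SwiftUI view like the one below builds unchanged for both macOS and iOS targets:

```swift
import SwiftUI

// Illustrative only: "ChatMessage" and "ChatView" are hypothetical names,
// not the app's actual types. The point is that one SwiftUI view
// compiles for both macOS and iOS without platform-specific UI code.
struct ChatMessage: Identifiable {
    let id = UUID()
    let text: String
}

struct ChatView: View {
    @State private var messages: [ChatMessage] = []
    @State private var input = ""

    var body: some View {
        VStack {
            List(messages) { message in
                Text(message.text)
            }
            HStack {
                TextField("Prompt", text: $input)
                Button("Send") {
                    messages.append(ChatMessage(text: input))
                    input = ""
                }
            }
            .padding()
        }
    }
}
```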

View on GitHub

Open Source

The core is a Swift library based on llama.cpp, ggml, and other open-source projects that lets you run inference with a variety of model architectures. A class hierarchy has been designed so that you can add your own inference backends (a rough sketch follows below).

View Core repo
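The sketch below only illustrates the idea of that class hierarchy; the real base class, method names, and signatures live in the Core repo, and everything here (`InferenceBase`, `evaluate`, `MyCustomInference`) is hypothetical:

```swift
import Foundation

// Illustrative sketch, not the library's real API.
// The idea: a common base class handles shared plumbing, while each
// model family overrides the architecture-specific steps.
class InferenceBase {
    let modelPath: String

    init(modelPath: String) {
        self.modelPath = modelPath
    }

    // Load model weights; subclasses override for their format.
    func loadModel() throws {
        fatalError("Subclasses must implement loadModel()")
    }

    // Return next-token logits for the current context; subclasses override.
    func evaluate(tokens: [Int32]) throws -> [Float] {
        fatalError("Subclasses must implement evaluate(tokens:)")
    }
}

// Adding a new inference backend means subclassing and overriding
// only the model-specific parts (e.g. calls into a ggml-based C backend).
final class MyCustomInference: InferenceBase {
    override func loadModel() throws {
        // load weights here
    }

    override func evaluate(tokens: [Int32]) throws -> [Float] {
        // compute and return logits here
        return []
    }
}
```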

Features

Various inference backends
LLaMA, RWKV, Falcon
StarCoder, Replit
GPTNeoX, GPT-2 + Cerebras
Various sampling methods (see the sketch after this list)
Temperature
Mirostat v1, v2
Greedy
Metal
Metal acceleration for LLaMA inference
Model settings templates
Let you quickly configure a downloaded model for each device
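To give a feel for how two of the listed sampling strategies differ, here is a minimal Swift sketch (not the library's actual sampler; the function names are made up). Greedy always picks the most likely token, while temperature sampling draws randomly from a scaled distribution:

```swift
import Foundation

// Minimal sketch of greedy vs. temperature sampling over next-token logits.
// Not the library's implementation; shown only to illustrate the difference.

// Greedy: always pick the token with the highest logit.
func greedySample(logits: [Float]) -> Int {
    return logits.indices.max(by: { logits[$0] < logits[$1] }) ?? 0
}

// Temperature: scale logits, convert to probabilities, then draw randomly.
// Lower temperature sharpens the distribution; higher temperature flattens it.
func temperatureSample(logits: [Float], temperature: Float = 0.8) -> Int {
    let scaled = logits.map { $0 / max(temperature, 1e-6) }
    let maxLogit = scaled.max() ?? 0
    let exps = scaled.map { exp($0 - maxLogit) }   // numerically stable softmax
    let sum = exps.reduce(0, +)
    let probs = exps.map { $0 / sum }

    var r = Float.random(in: 0..<1)
    for (i, p) in probs.enumerated() {
        r -= p
        if r <= 0 { return i }
    }
    return probs.count - 1
}
```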

Tested models

You can download these models from Hugging Face.

LLaMA 2

A popular model from Meta.

7B
ORCA

A LLaMA 2 7B model trained on Orca-style datasets.

3B
7B
RWKV 4 Raven

RWKV-4 "Raven" models finetuned on Alpaca, CodeAlpaca, Guanaco, GPT4All, ShareGPT and more. Even the 1.5B model is surprisingly good for its size.

1.5B
MagicPrompt Stable Diffusion

A series of GPT-2 models intended to generate prompt text for image-generation AIs.

Cerebras

The Cerebras-GPT family is released to facilitate research into LLM scaling laws using open architectures and datasets, and to demonstrate the simplicity and scalability of training LLMs on the Cerebras software and hardware stack.

0.59B
Stable LM

StableLM-Tuned-Alpha is a suite of 3B and 7B parameter decoder-only language models built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.

3B