
Apple On-Device OpenAI API Enhanced with Multi-Server Functionality

A SwiftUI application that creates an OpenAI-compatible API server using Apple's on-device Foundation Models. This allows you to use Apple Intelligence models locally through familiar OpenAI API endpoints.



Screenshots: Configuration and Open WebUI Usage



Features

  • OpenAI Compatible API: Drop-in replacement for OpenAI API with chat completions endpoint
  • Streaming Support: Real-time streaming responses compatible with OpenAI's streaming format
  • On-Device Processing: Uses Apple's Foundation Models for completely local AI processing
  • Model Availability Check: Automatically checks Apple Intelligence availability on startup
  • 🚧 Tool Using (WIP): Function calling capabilities for extended AI functionality



Requirements

  • macOS: 26 beta 2 or greater
  • Apple Intelligence: Must be enabled in Settings > Apple Intelligence & Siri
  • Xcode: 26 beta 2 or greater (MUST match OS version for building)



Building and Installation



Why a GUI App Instead of Command Line?

This project is implemented as a GUI application rather than a command-line tool due to Apple's rate limiting policies for Foundation Models:

> "An app that has UI and runs in the foreground doesn't have a rate limit when using the models; a macOS command line tool, which doesn't have UI, does."
>
> — Apple DTS Engineer (Source)

⚠️ Important Note: You may still encounter rate limits due to current limitations in Apple's FoundationModels framework. If you experience rate limiting, please restart the server.



Usage

Starting the Server

  1. Launch the app
  2. Configure server settings (default: 127.0.0.1:11535)
  3. Click "Start Server"
  4. Server will be available at the configured address



Available Endpoints

Once the server is running, you can access these OpenAI-compatible endpoints:

  • GET /health - Health check
  • GET /status - Model availability and status
  • GET /v1/models - List available models
  • POST /v1/chat/completions - Chat completions (streaming and non-streaming)
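After starting the server, the read-only endpoints make a convenient sanity check. The sketch below probes `/health` and then lists models with only the standard library; it assumes the default address from the settings above, and `server_ready` is a hypothetical helper name, not part of the project.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:11535"  # default address from the server settings

def server_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the /health endpoint answers, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    if server_ready(BASE_URL):
        # List the models the server exposes
        with urllib.request.urlopen(f"{BASE_URL}/v1/models") as resp:
            print(json.dumps(json.load(resp), indent=2))
    else:
        print("Server not reachable - is it started in the app?")
```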



Example Usage

Using curl:

```shell
curl -X POST http://127.0.0.1:11535/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "apple-on-device",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "stream": false
  }'
```
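With `"stream": true`, the server follows OpenAI's streaming format (per the Features section): Server-Sent Events where each event line is `data: ` followed by a JSON chunk, ending with `data: [DONE]`. A minimal parser for one such line might look like this; the sample line is illustrative, in OpenAI's chunk shape, not captured server output.

```python
import json
from typing import Optional

def extract_delta(line: str) -> Optional[str]:
    """Pull the content delta out of one SSE line, or None if there is none."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel, no content
    chunk = json.loads(payload)
    return chunk["choices"][0].get("delta", {}).get("content")

# Illustrative SSE line in OpenAI's chunk shape (not real server output)
sample = 'data: {"choices": [{"delta": {"content": "Hello"}, "index": 0}]}'
print(extract_delta(sample))  # -> Hello
```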



Using the OpenAI Python client:

```python
from openai import OpenAI

# Point to your local server
client = OpenAI(
    base_url="http://127.0.0.1:11535/v1",
    api_key="not-needed"  # API key not required for local server
)

response = client.chat.completions.create(
    model="apple-on-device",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
    temperature=0.7,
    stream=True  # Enable streaming
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
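The accumulation loop above can be factored into a helper that works on any iterable of chunks. A minimal sketch; the fake chunks below only mimic the attribute shape of the OpenAI client's objects and stand in for a live stream.

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Concatenate the content deltas of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta content is typically None
            parts.append(delta)
    return "".join(parts)

# Fake chunks mimicking the client's object shape, for demonstration only
def fake(content):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=content))])

print(collect_stream([fake("Hel"), fake("lo!"), fake(None)]))  # -> Hello!
```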



OpenAI Compatibility and Tips



Testing and Development Notes



License

This project is licensed under the MIT License - see the LICENSE file for details.



References



About

OpenAI-compatible API server for Apple on-device models with Multi-Server functionality


Languages

  • Swift 86.9%
  • Python 13.1%