Building an AI Chatbot in SwiftUI with Foundation Models Framework

Apple changed the game at WWDC 2025 with the Foundation Models framework. For the first time, you can run the same AI model that powers Apple Intelligence directly inside your own iOS apps. No internet connection needed, no OpenAI API bills, just pure on-device artificial intelligence.

Because I really wanted to know how capable the local LLM is, I built a more complex app. This isn’t just another demo project. We’re building a real-world AI chatbot that can handle complex conversations, remember context, and even enhance its knowledge with custom data when needed.

⬇️ Download project files

Building a Real-World AI Chat App

Our demo app is a “Dog Helper” that showcases practical AI implementation. Users can:

  1. Ask preset questions like “Tell me about Border Collies”
  2. Have natural conversations – ask follow-ups like “Are they good with kids?”
  3. Get enhanced answers – when the local model lacks knowledge, we use tool calling to fetch additional data
  4. Handle unknown breeds – for rare breeds like “Caucasian Shepherd,” the AI gracefully uses our custom data

The magic happens when you ask about allergy-friendly dogs. Instead of generic answers, our tool calling system searches through curated breed data and returns specific recommendations.


What is Apple’s Foundation Models Framework?

The Foundation Models framework gives you direct access to Apple’s local large language model (LLM) – the same AI brain behind Siri’s new capabilities and Apple Intelligence features. Think of it as having ChatGPT built right into your iPhone, but completely private and offline.

This is huge because:

  • Zero API costs – no more paying per request to OpenAI or Claude
  • Complete privacy – all AI processing happens on-device
  • Lightning fast – no network latency, instant responses
  • Always available – works in airplane mode or poor connectivity

Understanding Foundation Models vs Apple’s Foundation Framework – Common Naming Confusion

Don’t get confused by the name! Apple’s Foundation Models framework is completely different from their older Foundation framework.

In AI terminology, a “foundation model” means a general-purpose base model that you can customize for specific tasks. It’s called “foundation” because it’s the foundation you build specialized AI features on top of.

Foundation Models Performance and Limitations

Technical Specifications That Matter:

  • 3 billion parameters (vs ChatGPT’s 100+ billion)
  • 3GB RAM usage (why newer devices are required)
  • 4,096 token context window (conversations have memory limits)
  • Text-only input/output (no image processing)
  • Knowledge cutoff: End of 2023
  • 16 language support (check model.supportedLanguages)
  • Adapter compatibility for fine-tuning specific use cases
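The 4,096-token context window is worth planning for in code. Here’s a rough, framework-free sketch for estimating whether a transcript still fits — the characters-per-token ratio is only an assumption, not the model’s real tokenizer:

```swift
// Very rough heuristic: ~4 characters per token for English text.
// This ratio is an assumption, not the model's actual tokenizer.
func estimatedTokens(for text: String) -> Int {
    max(1, text.count / 4)
}

// Use this to decide when to reset the session, before the
// transcript blows past the context window.
func fitsContextWindow(_ transcript: String, limit: Int = 4_096) -> Bool {
    estimatedTokens(for: transcript) < limit
}
```

A check like this lets you proactively start a fresh session instead of waiting for a context-window error to surface.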

What This Model Excels At:

  • Text summarization – Great for condensing content
  • Information extraction – Pull structured data from text
  • Content classification – Categorize text by type/topic
  • Simple content generation – Basic writing tasks

What To Avoid Using It For:

  • Philosophy or complex reasoning – Too small for deep thinking
  • Current events – Knowledge cutoff limitations
  • Advanced math – Prone to calculation errors
  • Creative writing – Limited compared to larger models

Device Compatibility Requirements – The Hard Truth About Apple Intelligence Availability

Here’s what you need before diving in – and it’s more limited than you might think:

iPhone Compatibility for Foundation Models Framework

  • iPhone 15 Pro and 15 Pro Max (A17 Pro chip)
  • All iPhone 16 models (A18 chip)
  • Older iPhones won’t work – sorry iPhone 13/14 users

iPad Support for On-Device AI

  • iPad Mini 7th gen (A17 Pro)
  • iPad Air M1/M2 models
  • iPad Pro M1/M2/M4 models
  • Older iPads are incompatible – even recent non-Pro models

Mac Requirements for Foundation Models

  • Any Mac with M1, M2, M3, or M4 chips
  • Intel Macs are completely unsupported
  • macOS Sequoia 15.2+ required

Plus, Apple Intelligence must be downloaded and enabled (3GB download, 30-minute setup).

Market Reality – How Many Users Can Actually Use Your AI App?

Let’s talk numbers. Based on current device adoption, roughly:

  • Only 15% of iPhones can run Foundation Models apps
  • 30% of iPads are compatible (thanks to M1 adoption)
  • 50% of Macs support it (M-series popularity growing)

This means your app needs robust fallbacks. Don’t build AI-only features – always have non-AI alternatives ready.
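One way to structure those fallbacks is a small mapping from availability state to feature tier. The types below are illustrative stand-ins that mirror the framework’s availability cases — they are not framework API:

```swift
// Hypothetical mirror of the framework's availability states,
// so the gating logic can be reasoned about without FoundationModels.
enum AIAvailability {
    case available
    case deviceNotEligible
    case appleIntelligenceNotEnabled
    case modelNotReady
}

// Illustrative feature tiers for the Dog Helper app.
enum FeatureSet {
    case fullAIChat          // streaming chatbot
    case staticKnowledgeBase // curated breed articles, no AI
    case promptToEnable      // capable device, user action needed
    case waitForDownload     // model still downloading
}

// Every availability state maps to a usable experience -
// no user ever hits a dead end.
func featureSet(for availability: AIAvailability) -> FeatureSet {
    switch availability {
    case .available:                   return .fullAIChat
    case .deviceNotEligible:           return .staticKnowledgeBase
    case .appleIntelligenceNotEnabled: return .promptToEnable
    case .modelNotReady:               return .waitForDownload
    }
}
```

The point of the exhaustive switch: ineligible devices get real content, not an error screen.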

How to Check Device Compatibility

Smart developers always check compatibility first. Here’s the complete implementation:

import SwiftUI
import FoundationModels

struct ContentView: View {
    private var model = SystemLanguageModel.default

    var body: some View {
        switch model.availability {
        case .available:
            // Device supports AI - show full features
            ChatbotView()

        case .unavailable(.modelNotReady):
            // Compatible device, but AI model still downloading
            ModelDownloadingView()

        case .unavailable(.deviceNotEligible):
            // Old device - offer alternative features
            NonAIFallbackView()

        case .unavailable(.appleIntelligenceNotEnabled):
            // User needs to enable Apple Intelligence
            EnableIntelligenceView()

        case .unavailable(_):
            // Unknown future reason - fall back safely
            NonAIFallbackView()
        }
    }
}

Testing Device Compatibility States in Xcode Simulator

Apple made testing easy with built-in simulator options. In your Xcode scheme settings:

  1. Go to Run → Arguments → Environment Variables
  2. Find “Simulating foundation model availability”
  3. Set values like:
    • deviceNotEligible – Test old device flow
    • appleIntelligenceNotEnabled – Test setup flow
    • modelNotReady – Test download state

This lets you test all compatibility scenarios without owning multiple devices.


Running Foundation Models in Playgrounds

Let’s start simple with a playground to understand the basics:

import FoundationModels
import Playgrounds

#Playground {
    
    // Create a language model session
    let session = LanguageModelSession()

    // Define your prompt
    let prompt = "What is the meaning of life?"

    // Get AI response (async operation)
    do {
        let response = try await session.respond(to: prompt)
        
        print(response.content) // This is your AI-generated text
    } catch {
        print("AI generation failed: \(error)")
    }
}

Performance reality check: This took 23 seconds on an M1 Mac for a philosophical question. The model runs at your device’s speed – no cloud acceleration here.


Handling AI Generation Errors and Guardrails

The Foundation Models framework has strict safety measures:

do {
    let response = try await session.respond(to: prompt)
    return response.content
} catch let error as LanguageModelSession.GenerationError {
    switch error {
    case .exceededContextWindowSize:
        // Conversation too long - start new session
        return "Let's start a fresh conversation"

    case .guardrailViolation:
        // Content flagged as unsafe
        return "I can't help with that request"

    case .unsupportedLanguageOrLocale:
        // User used a non-supported language
        return "Please ask in English or another supported language"

    case .rateLimited:
        // App backgrounded - system prioritizing foreground apps
        return "Please try again in a moment"

    case .concurrentRequests:
        // Multiple requests on same session
        return "Please wait for the current response to complete"

    default:
        return "Something went wrong with AI generation"
    }
}

Real-World Example – Building a Dog Breed Knowledge Assistant

Let’s test the model’s knowledge boundaries with a practical example:

// Test with well-known breed
let borderColliePrompt = "Tell me about Border Collies for apartment living"
// Result: Good general knowledge, reasonable advice

// Test with rare breed
let caucasianPrompt = "Can I keep a Caucasian Shepherd in an apartment?"
// Result: Hallucination! Says they're "wonderful apartment companions"

The problem: The model confidently gives wrong advice about a 150-pound protective breed being apartment-friendly. This is why tool calling becomes essential.

Why Tool Calling Is Essential for Production AI Apps

The local model has knowledge gaps. When it doesn’t know something, it often guesses wrong instead of admitting ignorance. Tool calling lets you:

  1. Detect knowledge gaps – Recognize when the model lacks specific info
  2. Fetch accurate data – Pull from your curated database
  3. Enhance responses – Combine AI reasoning with factual data
  4. Maintain accuracy – Prevent harmful misinformation

Think of tool calling as giving your AI access to Google, but with your own trusted data sources.
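We’ll wire up the framework’s real tool-calling API in part 2, but the core idea can be sketched framework-free. The breed database and the simple string matching below are made-up stand-ins for a real curated data source:

```swift
// Hypothetical curated data source standing in for a real database.
let breedFacts: [String: String] = [
    "caucasian shepherd": """
        Giant guardian breed, 100-170 lb, strong protective instincts. \
        Needs space and an experienced owner; not suited to apartments.
        """
]

// If the question mentions a breed we have curated data for,
// prepend that data so the model answers from facts, not guesses.
func augmentedPrompt(for userQuestion: String) -> String {
    let lowered = userQuestion.lowercased()
    for (breed, facts) in breedFacts where lowered.contains(breed) {
        return """
        Use the following verified data when answering:
        \(facts)

        Question: \(userQuestion)
        """
    }
    return userQuestion // no gap detected - pass through unchanged
}
```

The real `Tool` API lets the model decide when to call your data source; this sketch just shows the enhancement step that prevents the apartment-Caucasian-Shepherd hallucination.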

Working Around Text-Only Limitations with Vision Framework Integration

The Foundation Models framework only handles text, but you can build powerful workflows:

// 1. User takes photo of a recipe
// 2. Use Vision framework to extract text from image
// 3. Pass extracted text to Foundation Models
// 4. AI formats and structures the recipe data

This pattern works great for:

  • Document scanning – Extract and process text from images
  • Recipe digitization – Photo to structured recipe data
  • Text translation – OCR + AI translation
  • Content organization – Extract and categorize information

Choosing the Right Use Cases for 3-Billion Parameter Models

The model size matters. Here’s when Foundation Models work well vs when you need alternatives:

Perfect Use Cases

  • App help chatbots – Answer questions about your app’s features
  • Content summarization – Condense long text into key points
  • Data extraction – Pull specific info from unstructured text
  • Simple classification – Categorize content by type or sentiment

Consider Cloud APIs Instead

  • Creative writing – Need larger models for quality output
  • Complex reasoning – Mathematical or logical problem solving
  • Current information – Anything requiring up-to-date knowledge
  • Multi-language support – Beyond the 16 supported languages

Configuring Your Language Model Session for Production Apps

The LanguageModelSession has several important parameters you can customize:

let session = LanguageModelSession(
    model: .default, // Can use custom adapters here
    guardrails: .default, // Safety filters (strict by default)
    tools: [], // We'll add tool calling in part 2
    instructions: """
        You are a dog specialist. Your job is to give helpful 
        advice to new dog owners.
        """
)

Setting Effective System Instructions

Instructions are more powerful than regular prompts. They define your AI’s personality and boundaries:

let instructions = """
You are a dog specialist. Your job is to give helpful advice to new dog owners.
"""

// Test the boundaries
// User asks: "What is 1 + 1?"
// AI responds: "I'm sorry, I cannot assist with that request."

Instructions act as strong guardrails – even when users try to misuse your app, the AI stays in character.

Controlling AI Response Quality with Generation Options

Temperature Settings for Creativity Control

let options = GenerationOptions(
    sampling: nil,               // nil = framework-default sampling
    temperature: 1.0,            // 0 = robotic, 2 = chaotic
    maximumResponseTokens: nil   // let it finish thoughts naturally
)

Avoid these common mistakes:

  • temperature: 0 = Always identical, robotic responses
  • temperature: 2 = Too random and incoherent
  • Setting maximumResponseTokens too low = Cut off mid-sentence

Better Ways to Control Response Length

Instead of limiting tokens (which cuts off responses), use natural language:

// ❌ Bad: maximumResponseTokens: 200 (cuts off mid-sentence)

// ✅ Good: Add to instructions
"Keep responses to 100-200 words"
"Answer in 2 paragraphs" 
"Give brief, concise answers"
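A tiny helper can keep these length constraints consistent across sessions. This is just an illustrative sketch, not framework API:

```swift
// Appends a natural-language length constraint to base instructions
// instead of truncating output with a hard token limit.
func instructions(base: String, maxWords: Int) -> String {
    """
    \(base)
    Keep responses under \(maxWords) words and finish every sentence.
    """
}
```

You would pass the result as the `instructions` argument when creating your `LanguageModelSession`.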

Implementing Real-Time Streaming Responses

Nobody wants to wait 23 seconds staring at a blank screen. Streaming shows text as it generates:

class ChatViewModel: ObservableObject {
    @Published var partialGenerated: String.PartialGenerated?
    @Published var isResponding = false
    private var streamingTask: Task<Void, Never>?

    func sendMessage(_ userInput: String) {
        isResponding = true

        streamingTask = Task {
            do {
                let stream = try session.streamResponse(to: userInput)

                for try await partial in stream {
                    // Check if task was cancelled
                    guard !Task.isCancelled else { break }

                    // Update UI with each new token
                    await MainActor.run {
                        self.partialGenerated = partial
                    }
                }

                // Streaming complete
                await MainActor.run {
                    self.isResponding = false
                    self.saveResponse()
                }

            } catch {
                await MainActor.run {
                    self.handleError(error)
                }
            }
        }
    }
}

Creating Smooth Streaming Animations in SwiftUI

Make your streaming responses feel polished with proper animations:

struct StreamingResponseView: View {
    let partialResponse: String.PartialGenerated

    var body: some View {
        Text((try? AttributedString(markdown: partialResponse.content)) ?? AttributedString(partialResponse.content))
            .padding()
            .background(.gray.opacity(0.1), in: RoundedRectangle(cornerRadius: 12))
            .contentTransition(.opacity)
            .animation(.bouncy, value: partialResponse)
    }
}

Key animation principles:

  • Use .animation(.bouncy, value: partialResponse) for smooth text updates
  • Add .contentTransition(.opacity) to avoid choppy text changes
  • Keep transition duration short (0.2-0.5 seconds max)

Fixing View Identity Problems

When streaming text replaces with final messages, SwiftUI can get confused about view identity:

// ❌ Causes UI jumps
ForEach(messages) { message in
    MessageView(message: message)
}
if let partial = partialGenerated {
    StreamingView(partial: partial)
}

// ✅ Fixed with consistent IDs
ForEach(messages) { message in
    MessageView(message: message)
        .id(message.id)
}
if let partial = partialGenerated {
    StreamingView(partial: partial)
        .id(partialId ?? UUID()) // Same ID used for final message
}

Managing Chat Sessions and Message History

Handle conversation flow properly with session management:

class ChatViewModel: ObservableObject {
    @Published var messages: [ChatMessage] = []
    @Published var userInput = ""
    private let instructions = "You are a dog specialist..."
    private lazy var session = LanguageModelSession(instructions: instructions)

    func resetSession() {
        // Cancel any ongoing streaming
        streamingTask?.cancel()

        // Clear UI state
        messages.removeAll()
        partialGenerated = nil
        isResponding = false

        // Create fresh session (important!)
        session = LanguageModelSession(instructions: instructions)
    }

    private func saveResponse() {
        guard let partial = partialGenerated else { return }

        // Add AI response to chat history
        messages.append(ChatMessage(
            id: partialId ?? UUID(),
            role: .assistant,
            content: partial.content
        ))

        // Clear streaming state
        partialGenerated = nil
        partialId = nil
    }
}

Proper Progress Indication

Show loading states only when appropriate:

if isResponding && partialGenerated == nil {
    // Show spinner only before streaming starts
    ProgressView()
} else if let partial = partialGenerated {
    // Show streaming content
    StreamingResponseView(partial: partial)
}

Handling Concurrent Requests and Session Limits

Each LanguageModelSession can only handle one request at a time:

func sendMessage(_ input: String) {
    // Prevent multiple concurrent requests
    guard !session.isResponding else { return }

    // Or create multiple sessions for parallel processing
    let newSession = LanguageModelSession(instructions: instructions)
}

Session management strategies:

  • Single session: Simple conversations with memory
  • Multiple sessions: Parallel processing different topics
  • Session pools: Handle high-volume concurrent requests
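A session pool can be sketched generically. In the real app the pool would hold `LanguageModelSession` instances created with your instructions; round-robin is just one possible hand-out policy:

```swift
// Generic round-robin pool; in the real app the elements would be
// LanguageModelSession values built by the makeSession closure.
final class SessionPool<Session> {
    private var sessions: [Session]
    private var nextIndex = 0

    init(count: Int, makeSession: () -> Session) {
        precondition(count > 0, "pool needs at least one session")
        sessions = (0..<count).map { _ in makeSession() }
    }

    // Hand out sessions in rotation so no single one is overloaded.
    func next() -> Session {
        defer { nextIndex = (nextIndex + 1) % sessions.count }
        return sessions[nextIndex]
    }
}
```

Note that pooled sessions don’t share conversation memory — each keeps its own transcript, so pools fit independent one-shot requests better than a single long chat.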

Real-World Error Handling and User Feedback

The Foundation Models framework can fail in various ways. Handle them gracefully:

private func handleStreamingError(_ error: Error) {
    if let genError = error as? LanguageModelSession.GenerationError {
        switch genError {
        case .guardrailViolation:
            showMessage("I can't help with that type of request")
        case .exceededContextWindowSize:
            showMessage("This conversation is getting too long. Let's start fresh!")
            resetSession()
        case .rateLimited:
            showMessage("I'm busy with other tasks. Please try again in a moment")
        default:
            showMessage("Something went wrong. Please try again")
        }
    }
}

Performance Optimization Tips for On-Device AI

Memory Management

  • Monitor memory usage – the 3B model uses ~3GB RAM
  • Consider releasing sessions when not needed
  • Test on real devices, not just simulators

Battery Impact

  • Long AI generations drain battery quickly
  • Consider limiting response length for mobile use
  • Show battery usage warnings for intensive tasks

Background Handling

  • AI requests get rate-limited when app goes to background
  • Save conversation state before backgrounding
  • Resume gracefully when returning to foreground
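Saving conversation state can be as simple as a Codable round-trip to disk. `StoredMessage` and the storage location are illustrative, not framework types — your own `ChatMessage` model would conform to `Codable` the same way:

```swift
import Foundation

// Illustrative persisted shape for one chat message.
struct StoredMessage: Codable, Equatable {
    let role: String   // "user" or "assistant"
    let content: String
}

// Write the transcript atomically so backgrounding mid-save
// can't leave a corrupt file behind.
func saveConversation(_ messages: [StoredMessage], to url: URL) throws {
    let data = try JSONEncoder().encode(messages)
    try data.write(to: url, options: .atomic)
}

func restoreConversation(from url: URL) throws -> [StoredMessage] {
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode([StoredMessage].self, from: data)
}
```

Call the save from a `scenePhase` change handler when the app moves to the background, and restore on launch before recreating the session.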

What’s Next: Tool Calling and Knowledge Enhancement

This basic chatbot is just the foundation. The real power comes with tool calling – letting your AI fetch current data, search databases, and enhance its knowledge beyond the 2023 training cutoff.

In the next tutorial, we’ll add:

  • Custom tool functions to fetch dog breed data
  • Intent detection to decide when to use tools
  • Knowledge enhancement for accurate, up-to-date responses
  • Structured data integration with your app’s backend

The Foundation Models framework gives you a solid base for on-device AI, but tool calling makes it production-ready for real-world applications.

⬇️ Download project files


Ready to enhance your chatbot with tool calling? Check out part 2 of this series where we add custom data sources and make our AI truly intelligent.
