Apple changed the game at WWDC 2025 with the Foundation Models framework. For the first time, you can run the exact same AI model that powers Apple Intelligence directly inside your own iOS apps. No internet connection needed, no OpenAI API bills, just pure on-device artificial intelligence.
Because I really wanted to know how capable the local LLM is, I built a more complex app. This isn’t just another demo project. We’re building a real-world AI chatbot that can handle complex conversations, remember context, and even enhance its knowledge with custom data when needed.
Building a Real-World AI Chat App
Our demo app is a “Dog Helper” that showcases practical AI implementation. Users can:
- Ask preset questions like “Tell me about Border Collies”
- Have natural conversations – ask follow-ups like “Are they good with kids?”
- Get enhanced answers – when the local model lacks knowledge, we use tool calling to fetch additional data
- Handle unknown breeds – for rare breeds like “Caucasian Shepherd,” the AI gracefully uses our custom data
The magic happens when you ask about allergy-friendly dogs. Instead of generic answers, our tool calling system searches through curated breed data and returns specific recommendations.

What is Apple’s Foundation Models Framework?
The Foundation Models framework gives you direct access to Apple’s local large language model (LLM) – the same AI brain behind Siri’s new capabilities and Apple Intelligence features. Think of it as having ChatGPT built right into your iPhone, but completely private and offline.
This is huge because:
- Zero API costs – no more paying per request to OpenAI or Claude
- Complete privacy – all AI processing happens on-device
- Lightning fast – no network latency, instant responses
- Always available – works in airplane mode or poor connectivity
Understanding Foundation Models vs Apple’s Foundation Framework – Common Naming Confusion
Don’t get confused by the name! Apple’s Foundation Models framework is completely different from their older Foundation framework.
In AI terminology, a “foundation model” means a general-purpose base model that you can customize for specific tasks. It’s called “foundation” because it’s the foundation you build specialized AI features on top of.
Foundation Models Performance and Limitations
Technical Specifications That Matter:
- 3 billion parameters (vs ChatGPT’s 100+ billion)
- 3GB RAM usage (why newer devices are required)
- 4,096 token context window (conversations have memory limits)
- Text-only input/output (no image processing)
- Knowledge cutoff: End of 2023
- Support for 16 languages (check model.supportedLanguages)
- Adapter compatibility for fine-tuning specific use cases
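If you need to branch on language support at runtime, the model exposes the list directly. A minimal sketch, assuming the SystemLanguageModel.default entry point and its supportedLanguages property:

```swift
import Foundation
import FoundationModels

// Check whether the user's current language is in the model's
// supported set before exposing AI features.
let model = SystemLanguageModel.default
let userLanguage = Locale.current.language

if model.supportedLanguages.contains(where: { $0.languageCode == userLanguage.languageCode }) {
    // Safe to offer AI features in this locale
    print("On-device model supports the user's language")
} else {
    // Fall back to non-AI features
    print("Unsupported language - hide AI features")
}
```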
What This Model Excels At:
- Text summarization – Great for condensing content
- Information extraction – Pull structured data from text
- Content classification – Categorize text by type/topic
- Simple content generation – Basic writing tasks
What To Avoid Using It For:
- Philosophy or complex reasoning – Too small for deep thinking
- Current events – Knowledge cutoff limitations
- Advanced math – Prone to calculation errors
- Creative writing – Limited compared to larger models
Device Compatibility Requirements – The Hard Truth About Apple Intelligence Availability
Here’s what you need before diving in – and it’s more limited than you might think:
iPhone Compatibility for Foundation Models Framework
- iPhone 15 Pro and 15 Pro Max (A17 Pro chip)
- All iPhone 16 models (A18 chip)
- Older iPhones won’t work – sorry iPhone 13/14 users
iPad Support for On-Device AI
- iPad Mini 7th gen (A17 Pro)
- iPad Air M1/M2 models
- iPad Pro M1/M2/M4 models
- Older iPads are incompatible – even recent non-Pro models
Mac Requirements for Foundation Models
- Any Mac with M1, M2, M3, or M4 chips
- Intel Macs are completely unsupported
- macOS Sequoia 15.2+ required
Plus, Apple Intelligence must be downloaded and enabled (3GB download, 30-minute setup).
Market Reality – How Many Users Can Actually Use Your AI App?
Let’s talk numbers. Based on rough estimates of current device adoption:
- Only 15% of iPhones can run Foundation Models apps
- 30% of iPads are compatible (thanks to M1 adoption)
- 50% of Macs support it (M-series popularity growing)
This means your app needs robust fallbacks. Don’t build AI-only features – always have non-AI alternatives ready.
How to Check Device Compatibility
Smart developers always check compatibility first. Here’s the complete implementation:
```swift
import SwiftUI
import FoundationModels

struct ContentView: View {
    private let model = SystemLanguageModel.default

    var body: some View {
        switch model.availability {
        case .available:
            // Device supports AI - show full features
            ChatbotView()
        case .unavailable(.modelNotReady):
            // Compatible device, but AI model still downloading
            ModelDownloadingView()
        case .unavailable(.deviceNotEligible):
            // Old device - offer alternative features
            NonAIFallbackView()
        case .unavailable(.appleIntelligenceNotEnabled):
            // User needs to enable Apple Intelligence
            EnableIntelligenceView()
        case .unavailable(_):
            // Any other unavailability reason - fall back safely
            NonAIFallbackView()
        }
    }
}
```
Testing Device Compatibility States in Xcode Simulator
Apple made testing easy with built-in simulator options. In your Xcode scheme settings:
- Go to Run → Arguments → Environment Variables
- Find “Simulating foundation model availability”
- Set values like:
  - deviceNotEligible – test the old-device flow
  - appleIntelligenceNotEnabled – test the setup flow
  - modelNotReady – test the download state

This lets you test all compatibility scenarios without owning multiple devices.

Running Foundation Models in Playgrounds
Let’s start simple with a playground to understand the basics:
```swift
import FoundationModels
import Playgrounds

#Playground {
    // Create a language model session
    let session = LanguageModelSession()

    // Define your prompt
    let prompt = "What is the meaning of life?"

    // Get AI response (async operation)
    do {
        let response = try await session.respond(to: prompt)
        print(response.content) // This is your AI-generated text
    } catch {
        print("AI generation failed: \(error)")
    }
}
```
Performance reality check: This took 23 seconds on an M1 Mac for a philosophical question. The model runs at your device’s speed – no cloud acceleration here.
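If you want to benchmark your own prompts the same way, you can wrap the call with ContinuousClock. A sketch (timedResponse is just a helper name, not a framework API):

```swift
import FoundationModels

// Measure the wall-clock latency of a single on-device generation.
func timedResponse(to prompt: String) async throws -> (text: String, duration: Duration) {
    let session = LanguageModelSession()
    let clock = ContinuousClock()

    let start = clock.now
    let response = try await session.respond(to: prompt)
    let duration = clock.now - start

    return (response.content, duration)
}

// Usage (inside an async context):
// let (text, duration) = try await timedResponse(to: "Summarize this paragraph")
// print("Generated in \(duration)")
```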

Handling AI Generation Errors and Guardrails
The Foundation Models framework has strict safety measures:
```swift
do {
    let response = try await session.respond(to: prompt)
    return response.content
} catch let error as LanguageModelSession.GenerationError {
    switch error {
    case .exceededContextWindowSize:
        // Conversation too long - start new session
        return "Let's start a fresh conversation"
    case .guardrailViolation:
        // Content flagged as unsafe
        return "I can't help with that request"
    case .unsupportedLanguageOrLocale:
        // User used a non-supported language
        return "Please ask in English or another supported language"
    case .rateLimited:
        // App backgrounded - system prioritizing foreground apps
        return "Please try again in a moment"
    case .concurrentRequests:
        // Multiple requests on the same session
        return "Please wait for the current response to complete"
    default:
        return "Something went wrong with AI generation"
    }
}
```
Real-World Example – Building a Dog Breed Knowledge Assistant
Let’s test the model’s knowledge boundaries with a practical example:
```swift
// Test with a well-known breed
let borderColliePrompt = "Tell me about Border Collies for apartment living"
// Result: Good general knowledge, reasonable advice

// Test with a rare breed
let caucasianPrompt = "Can I keep a Caucasian Shepherd in an apartment?"
// Result: Hallucination! Says they're "wonderful apartment companions"
```
The problem: The model confidently gives wrong advice about a 150-pound protective breed being apartment-friendly. This is why tool calling becomes essential.
Why Tool Calling Is Essential for Production AI Apps
The local model has knowledge gaps. When it doesn’t know something, it often guesses wrong instead of admitting ignorance. Tool calling lets you:
- Detect knowledge gaps – Recognize when the model lacks specific info
- Fetch accurate data – Pull from your curated database
- Enhance responses – Combine AI reasoning with factual data
- Maintain accuracy – Prevent harmful misinformation
Think of tool calling as giving your AI access to Google, but with your own trusted data sources.
Working Around Text-Only Limitations with Vision Framework Integration
The Foundation Models framework only handles text, but you can build powerful workflows:
// 1. User takes photo of a recipe
// 2. Use Vision framework to extract text from image
// 3. Pass extracted text to Foundation Models
// 4. AI formats and structures the recipe data
This pattern works great for:
- Document scanning – Extract and process text from images
- Recipe digitization – Photo to structured recipe data
- Text translation – OCR + AI translation
- Content organization – Extract and categorize information
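A sketch of the recipe workflow above. The function names here are illustrative, but VNRecognizeTextRequest and VNImageRequestHandler are the standard Vision OCR entry points:

```swift
import Vision
import FoundationModels

// Steps 1-2: extract text from the photo with Vision's OCR.
func recognizeText(in image: CGImage) async throws -> String {
    try await withCheckedThrowingContinuation { continuation in
        let request = VNRecognizeTextRequest { request, error in
            if let error {
                continuation.resume(throwing: error)
                return
            }
            let lines = (request.results as? [VNRecognizedTextObservation])?
                .compactMap { $0.topCandidates(1).first?.string } ?? []
            continuation.resume(returning: lines.joined(separator: "\n"))
        }
        request.recognitionLevel = .accurate

        do {
            try VNImageRequestHandler(cgImage: image).perform([request])
        } catch {
            continuation.resume(throwing: error)
        }
    }
}

// Steps 3-4: hand the raw OCR output to the on-device model for structuring.
func structureRecipe(from image: CGImage) async throws -> String {
    let rawText = try await recognizeText(in: image)
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Format this scanned recipe into an ingredient list and numbered steps:\n\(rawText)"
    )
    return response.content
}
```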
Choosing the Right Use Cases for 3-Billion Parameter Models
The model size matters. Here’s when Foundation Models work well vs when you need alternatives:
Perfect Use Cases
- App help chatbots – Answer questions about your app’s features
- Content summarization – Condense long text into key points
- Data extraction – Pull specific info from unstructured text
- Simple classification – Categorize content by type or sentiment
Consider Cloud APIs Instead
- Creative writing – Need larger models for quality output
- Complex reasoning – Mathematical or logical problem solving
- Current information – Anything requiring up-to-date knowledge
- Multi-language support – beyond the 16 supported languages
Configuring Your Language Model Session for Production Apps
The LanguageModelSession class has several important parameters you can customize:

```swift
let session = LanguageModelSession(
    model: .default,        // Can use custom adapters here
    guardrails: .default,   // Safety filters (strict by default)
    tools: [],              // We'll add tool calling in part 2
    instructions: """
        You are a dog specialist. Your job is to give helpful \
        advice to new dog owners.
        """
)
```
Setting Effective System Instructions
Instructions are more powerful than regular prompts. They define your AI’s personality and boundaries:
```swift
let instructions = """
    You are a dog specialist. Your job is to give helpful advice to new dog owners.
    """

// Test the boundaries
// User asks: "What is 1 + 1?"
// AI responds: "I'm sorry, I cannot assist with that request."
```
Instructions act as strong guardrails – even when users try to misuse your app, the AI stays in character.
Controlling AI Response Quality with Generation Options
Temperature Settings for Creativity Control
```swift
let options = GenerationOptions(
    sampling: nil,                  // nil = the framework's balanced default
    temperature: 1.0,               // 0 = robotic, 2 = chaotic
    maximumResponseTokens: nil      // let it finish thoughts naturally
)
```
Avoid these common mistakes:
- temperature: 0 = always identical, robotic responses
- temperature: 2 = too random and incoherent
- Setting the maximum token limit too low = cut off mid-sentence
Better Ways to Control Response Length
Instead of limiting tokens (which cuts off responses), use natural language:
```swift
// ❌ Bad: capping response tokens at 200 (cuts off mid-sentence)
// ✅ Good: add length guidance to the instructions instead:
//   "Keep responses to 100-200 words"
//   "Answer in 2 paragraphs"
//   "Give brief, concise answers"
```
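Putting the two ideas together: length guidance lives in the instructions, while generation options stay permissive. A sketch assuming the respond(to:options:) overload:

```swift
import FoundationModels

let session = LanguageModelSession(instructions: """
    You are a dog specialist. Keep responses to 100-200 words.
    """)

// Mild temperature tweak; no hard token cap, so answers end naturally.
let options = GenerationOptions(temperature: 0.7)

// Inside an async context:
let response = try await session.respond(
    to: "How much exercise does a Border Collie need?",
    options: options
)
print(response.content)
```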
Implementing Real-Time Streaming Responses
Nobody wants to wait 23 seconds staring at a blank screen. Streaming shows text as it generates:
```swift
class ChatViewModel: ObservableObject {
    @Published var partialGenerated: String.PartialGenerated?
    @Published var isResponding = false

    private var streamingTask: Task<Void, Never>?
    private let session = LanguageModelSession()

    func sendMessage(_ userInput: String) {
        isResponding = true
        streamingTask = Task {
            do {
                let stream = session.streamResponse(to: userInput)
                for try await partial in stream {
                    // Check if the task was cancelled
                    guard !Task.isCancelled else { break }

                    // Update UI with each new chunk
                    await MainActor.run {
                        self.partialGenerated = partial
                    }
                }
                // Streaming complete
                await MainActor.run {
                    self.isResponding = false
                    self.saveResponse()
                }
            } catch {
                await MainActor.run {
                    self.handleError(error)
                }
            }
        }
    }
}
```

Creating Smooth Streaming Animations in SwiftUI
Make your streaming responses feel polished with proper animations:
```swift
struct StreamingResponseView: View {
    let partialResponse: String.PartialGenerated

    var body: some View {
        // Render markdown if it parses, otherwise fall back to plain text
        Text((try? AttributedString(markdown: partialResponse.content))
            ?? AttributedString(partialResponse.content))
            .padding()
            .background(.gray.opacity(0.1), in: RoundedRectangle(cornerRadius: 12))
            .contentTransition(.opacity)
            .animation(.bouncy, value: partialResponse)
    }
}
```
Key animation principles:
- Use .animation(.bouncy, value: partialResponse) for smooth text updates
- Add .contentTransition(.opacity) to avoid choppy text changes
- Keep transition duration short (0.2-0.5 seconds max)
Fixing View Identity Problems
When streaming text replaces with final messages, SwiftUI can get confused about view identity:
```swift
// ❌ Causes UI jumps
ForEach(messages) { message in
    MessageView(message: message)
}
if let partial = partialGenerated {
    StreamingView(partial: partial)
}

// ✅ Fixed with consistent IDs
ForEach(messages) { message in
    MessageView(message: message)
        .id(message.id)
}
if let partial = partialGenerated {
    StreamingView(partial: partial)
        .id(partialId ?? UUID()) // Same ID reused for the final message
}
```
Managing Chat Sessions and Message History
Handle conversation flow properly with session management:
```swift
class ChatViewModel: ObservableObject {
    @Published var messages: [ChatMessage] = []
    @Published var userInput = ""
    @Published var partialGenerated: String.PartialGenerated?
    @Published var isResponding = false

    private static let instructions = "You are a dog specialist..."
    private var session = LanguageModelSession(instructions: Self.instructions)
    private var streamingTask: Task<Void, Never>?
    private var partialId: UUID?

    func resetSession() {
        // Cancel any ongoing streaming
        streamingTask?.cancel()

        // Clear UI state
        messages.removeAll()
        partialGenerated = nil
        isResponding = false

        // Create a fresh session (important!)
        session = LanguageModelSession(instructions: Self.instructions)
    }

    private func saveResponse() {
        guard let partial = partialGenerated else { return }

        // Add AI response to chat history
        messages.append(ChatMessage(
            id: partialId ?? UUID(),
            role: .assistant,
            content: partial.content
        ))

        // Clear streaming state
        partialGenerated = nil
        partialId = nil
    }
}
```
Proper Progress Indication
Show loading states only when appropriate:
```swift
if isResponding && partialGenerated == nil {
    // Show spinner only before streaming starts
    ProgressView()
} else if let partial = partialGenerated {
    // Show streaming content
    StreamingResponseView(partialResponse: partial)
}
```
Handling Concurrent Requests and Session Limits
Each LanguageModelSession can only handle one request at a time:

```swift
func sendMessage(_ input: String) {
    // Prevent multiple concurrent requests on the same session
    guard !session.isResponding else { return }

    // Or create a separate session for parallel processing
    let newSession = LanguageModelSession(instructions: instructions)
}
```
Session management strategies:
- Single session: Simple conversations with memory
- Multiple sessions: Parallel processing different topics
- Session pools: Handle high-volume concurrent requests
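A minimal sketch of the pool strategy. SessionPool is a hypothetical helper, not a framework type; it leans on the session's real isResponding property:

```swift
import FoundationModels

// Hypothetical helper that hands out an idle session,
// creating a new one when every existing session is busy.
final class SessionPool {
    private var sessions: [LanguageModelSession] = []
    private let instructions: String

    init(instructions: String) {
        self.instructions = instructions
    }

    func checkout() -> LanguageModelSession {
        // Reuse the first session that isn't mid-generation
        if let idle = sessions.first(where: { !$0.isResponding }) {
            return idle
        }
        // All busy - grow the pool
        let fresh = LanguageModelSession(instructions: instructions)
        sessions.append(fresh)
        return fresh
    }
}
```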
Real-World Error Handling and User Feedback
The Foundation Models framework can fail in various ways. Handle them gracefully:
```swift
private func handleStreamingError(_ error: Error) {
    if let genError = error as? LanguageModelSession.GenerationError {
        switch genError {
        case .guardrailViolation:
            showMessage("I can't help with that type of request")
        case .exceededContextWindowSize:
            showMessage("This conversation is getting too long. Let's start fresh!")
            resetSession()
        case .rateLimited:
            showMessage("I'm busy with other tasks. Please try again in a moment")
        default:
            showMessage("Something went wrong. Please try again")
        }
    }
}
```
Performance Optimization Tips for On-Device AI
Memory Management
- Monitor memory usage – the 3B model uses ~3GB RAM
- Consider releasing sessions when not needed
- Test on real devices, not just simulators
Battery Impact
- Long AI generations drain battery quickly
- Consider limiting response length for mobile use
- Show battery usage warnings for intensive tasks
Background Handling
- AI requests get rate-limited when app goes to background
- Save conversation state before backgrounding
- Resume gracefully when returning to foreground
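One way to wire this up is the standard UIKit lifecycle notification. A sketch assuming the ChatViewModel from earlier and a hypothetical persistMessages() save helper:

```swift
import UIKit

extension ChatViewModel {
    // Call once, e.g. at view-model setup, to react to backgrounding.
    func observeAppLifecycle() {
        NotificationCenter.default.addObserver(
            forName: UIApplication.didEnterBackgroundNotification,
            object: nil,
            queue: .main
        ) { [weak self] _ in
            self?.streamingTask?.cancel()  // stop in-flight generation
            self?.persistMessages()        // hypothetical: save chat history
        }
    }
}
```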
What’s Next: Tool Calling and Knowledge Enhancement
This basic chatbot is just the foundation. The real power comes with tool calling – letting your AI fetch current data, search databases, and enhance its knowledge beyond the 2023 training cutoff.
In the next tutorial, we’ll add:
- Custom tool functions to fetch dog breed data
- Intent detection to decide when to use tools
- Knowledge enhancement for accurate, up-to-date responses
- Structured data integration with your app’s backend
The Foundation Models framework gives you a solid base for on-device AI, but tool calling makes it production-ready for real-world applications.
Ready to enhance your chatbot with tool calling? Check out part 2 of this series where we add custom data sources and make our AI truly intelligent.