Progressive Web Apps (PWAs) for AI: Offline-First LLM Applications
Expert Guide to Building Offline-Capable AI Applications with Service Workers
I’ve built AI applications that work offline, and I can tell you: it’s not just about caching—it’s about rethinking how AI applications work. When users lose connectivity, they shouldn’t lose their work. When they’re on slow networks, they shouldn’t wait forever. PWAs make AI applications resilient, fast, and reliable.
In this guide, I’ll show you how to build offline-first AI applications using Progressive Web App technologies. You’ll learn service worker patterns, caching strategies, and how to handle AI workloads when connectivity is unreliable.
What You’ll Learn
- Service worker patterns for AI applications
- Caching strategies for LLM responses and models
- Offline-first architecture patterns
- Background sync for queued AI requests
- Edge AI integration with service workers
- Push notifications for AI completions
- Real-world examples from production PWAs
- Common pitfalls and how to avoid them
Introduction: Why PWAs for AI?
Traditional AI applications are completely dependent on network connectivity. Lose connection, lose functionality. But users expect more. They want AI applications that work on planes, in tunnels, and on unreliable networks.
Progressive Web Apps solve this. With service workers, caching, and background sync, you can build AI applications that work offline, sync when online, and feel native. I’ve built several offline-capable AI apps, and the user feedback has been overwhelmingly positive.
Key benefits of PWAs for AI:
- Offline functionality: Continue working without connectivity
- Faster load times: Cached assets load instantly
- Better UX: Native app-like experience
- Background sync: Queue requests, sync when online
- Push notifications: Notify users when AI tasks complete

1. Understanding Service Workers for AI
1.1 What Service Workers Do
Service workers are JavaScript files that run in the background, separate from your main application. They can intercept network requests, cache responses, and work offline; a minimal interception sketch follows the list below.
For AI applications, service workers enable:
- Caching of AI responses for offline access
- Queueing requests when offline
- Background sync when connectivity returns
- Push notifications for AI completions
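To make interception concrete before diving into the lifecycle, here is a minimal sketch of a fetch handler; it tries the network and falls back to the cache, and it is the skeleton every strategy in Section 2 builds on:

// sw.js: every request from pages this worker controls passes through here
self.addEventListener('fetch', (event) => {
  event.respondWith(
    fetch(event.request).catch(() => {
      // Network failed (likely offline): serve a cached copy if one exists
      return caches.match(event.request);
    })
  );
});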
1.2 Service Worker Lifecycle
Understanding the lifecycle is crucial:
// Service worker registration (in your main app)
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js')
    .then((registration) => {
      console.log('SW registered:', registration);
    })
    .catch((error) => {
      console.error('SW registration failed:', error);
    });
}
// Service worker file (sw.js)
self.addEventListener('install', (event) => {
  // Cache essential assets
  event.waitUntil(
    caches.open('ai-app-v1').then((cache) => {
      return cache.addAll([
        '/',
        '/index.html',
        '/app.js',
        '/styles.css',
      ]);
    })
  );
  self.skipWaiting(); // Activate immediately
});

self.addEventListener('activate', (event) => {
  // Clean up old caches, then take control of open pages
  event.waitUntil(
    caches.keys().then((cacheNames) => {
      return Promise.all(
        cacheNames
          .filter((name) => name !== 'ai-app-v1')
          .map((name) => caches.delete(name))
      );
    }).then(() => self.clients.claim())
  );
});
2. Caching Strategies for AI Applications
2.1 Cache-First for Static Assets
For static assets (HTML, CSS, JS), use cache-first:
self.addEventListener('fetch', (event) => {
  if (event.request.destination === 'script' ||
      event.request.destination === 'style') {
    event.respondWith(
      caches.match(event.request).then((response) => {
        return response || fetch(event.request).then((fetchResponse) => {
          return caches.open('ai-app-v1').then((cache) => {
            cache.put(event.request, fetchResponse.clone());
            return fetchResponse;
          });
        });
      })
    );
  }
});
2.2 Network-First for AI Responses
For AI responses, try network first, fall back to cache:
self.addEventListener('fetch', (event) => {
  // Note: the Cache API only stores GET request/response pairs, so this
  // handler caches GET requests to the chat API. POST responses need a
  // different store (see the IndexedDB sketch after this snippet).
  if (event.request.url.includes('/api/chat') && event.request.method === 'GET') {
    event.respondWith(
      fetch(event.request)
        .then((response) => {
          // Cache successful responses
          if (response.ok) {
            const clone = response.clone();
            caches.open('ai-responses-v1').then((cache) => {
              cache.put(event.request, clone);
            });
          }
          return response;
        })
        .catch(() => {
          // Fall back to cache if the network fails
          return caches.match(event.request).then((cachedResponse) => {
            if (cachedResponse) {
              return cachedResponse;
            }
            // Return an explicit offline response
            return new Response(
              JSON.stringify({ error: 'Offline', message: 'Cached response unavailable' }),
              { status: 503, headers: { 'Content-Type': 'application/json' } }
            );
          });
        })
    );
  }
});
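Since the Cache API stores only GET pairs and chat endpoints are typically POST, a workable pattern is to store chat responses in IndexedDB keyed by the prompt. A minimal sketch, assuming the same idb library used in Section 3 and a { prompt } JSON body; the store name and key scheme are illustrative:

import { openDB } from 'idb';

const responseDbPromise = openDB('ai-response-cache', 1, {
  upgrade(db) {
    db.createObjectStore('responses'); // keys supplied explicitly (the prompt)
  },
});

// Save a successful chat response, keyed by the prompt that produced it
async function cacheChatResponse(prompt, responseJson) {
  const db = await responseDbPromise;
  await db.put('responses', { responseJson, cachedAt: Date.now() }, prompt);
}

// Retrieve a previously cached response when the network is unavailable
async function getCachedChatResponse(prompt) {
  const db = await responseDbPromise;
  return db.get('responses', prompt);
}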
2.3 Stale-While-Revalidate for Conversation History
For conversation history, show cached data immediately, then update:
self.addEventListener('fetch', (event) => {
  if (event.request.url.includes('/api/conversations')) {
    event.respondWith(
      caches.open('conversations-v1').then((cache) => {
        return cache.match(event.request).then((cachedResponse) => {
          const fetchPromise = fetch(event.request)
            .then((networkResponse) => {
              cache.put(event.request, networkResponse.clone());
              return networkResponse;
            })
            .catch(() => cachedResponse); // offline: keep the cached copy
          // Return cached immediately, update in background
          return cachedResponse || fetchPromise;
        });
      })
    );
  }
});

3. Offline-First Architecture
3.1 Queue Requests When Offline
Queue AI requests when offline, sync when online:
// In your main app (a module; uses the idb library for IndexedDB)
import { openDB } from 'idb';

async function sendAIRequest(prompt) {
  if (!navigator.onLine) {
    // Queue request for background sync
    await queueRequest({ prompt, timestamp: Date.now() });
    return { queued: true, message: 'Request queued for when online' };
  }
  // Normal request
  return fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  }).then((res) => res.json());
}

// IndexedDB for the queue
const dbPromise = openDB('ai-queue', 1, {
  upgrade(db) {
    db.createObjectStore('requests', { keyPath: 'id', autoIncrement: true });
  },
});

async function queueRequest(request) {
  const db = await dbPromise;
  const tx = db.transaction('requests', 'readwrite');
  await tx.store.add(request);
  await tx.done;
}
3.2 Background Sync
Sync queued requests when connectivity returns:
// In the service worker (assumes the idb library's openDB is available here,
// e.g. bundled in, since service workers can't use bare module imports)
self.addEventListener('sync', (event) => {
  if (event.tag === 'sync-ai-requests') {
    event.waitUntil(syncQueuedRequests());
  }
});

async function syncQueuedRequests() {
  const db = await openDB('ai-queue', 1);
  const tx = db.transaction('requests', 'readonly');
  const requests = await tx.store.getAll();
  for (const request of requests) {
    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt: request.prompt }),
      });
      if (response.ok) {
        // Remove the synced request from the queue
        const deleteTx = db.transaction('requests', 'readwrite');
        await deleteTx.store.delete(request.id);
        await deleteTx.done;
        // Notify the user
        self.registration.showNotification('AI Response Ready', {
          body: 'Your queued request has been processed',
        });
      }
    } catch (error) {
      console.error('Sync failed:', error);
    }
  }
}

// Register background sync (in your main app; see the fallback sketch below
// for browsers without the Background Sync API)
navigator.serviceWorker.ready.then((registration) => {
  if ('sync' in registration) {
    return registration.sync.register('sync-ai-requests');
  }
});
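Background Sync isn't supported everywhere (Safari notably lacks it at the time of writing), which is why the registration above checks for the API first. A hedged fallback is to flush the queue from the page on the online event; flushQueuedRequests here is a hypothetical page-side twin of the service worker's syncQueuedRequests:

// Fallback for browsers without Background Sync: replay the queue from the
// page whenever connectivity returns
navigator.serviceWorker.ready.then((registration) => {
  if (!('sync' in registration)) {
    window.addEventListener('online', () => {
      flushQueuedRequests().catch((error) => {
        console.error('Queue flush failed:', error);
      });
    });
  }
});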
4. Edge AI Integration
4.1 Running Models in Service Workers
Service workers can run lightweight AI models for offline inference:
// In the service worker: load the model once, then serve inference requests.
// Assumes TensorFlow.js has been pulled in, e.g. via
// importScripts('https://cdn.jsdelivr.net/npm/@tensorflow/tfjs');
let model = null;

self.addEventListener('message', async (event) => {
  if (event.data.type === 'LOAD_MODEL') {
    // Load a lightweight model (e.g., TensorFlow.js)
    model = await tf.loadLayersModel('/models/lightweight-model.json');
    event.ports[0].postMessage({ type: 'MODEL_LOADED' });
  }
  if (event.data.type === 'INFER' && model) {
    // Input arrives via postMessage, so it is plain data, not a tensor
    const input = tf.tensor(event.data.input);
    const prediction = model.predict(input);
    event.ports[0].postMessage({
      type: 'PREDICTION',
      result: await prediction.data(),
    });
  }
});

// In the main app: a service worker is not created with `new Worker`.
// Message the registered worker over a MessageChannel, whose second port
// becomes event.ports[0] on the service worker side.
const channel = new MessageChannel();
channel.port1.onmessage = (event) => {
  if (event.data.type === 'PREDICTION') {
    // Use prediction
    updateUI(event.data.result);
  }
};
navigator.serviceWorker.controller.postMessage({ type: 'LOAD_MODEL' }, [channel.port2]);
4.2 Hybrid Approach
Use edge AI for simple tasks, cloud AI for complex ones:
async function processAIRequest(prompt) {
  // Offline: the edge model is the only option
  if (!navigator.onLine) {
    return processWithEdgeAI(prompt);
  }
  // Online: route by prompt complexity
  if (isSimplePrompt(prompt)) {
    // Edge AI: faster, free
    return processWithEdgeAI(prompt);
  } else {
    // Cloud AI: more capable
    return processWithCloudAI(prompt);
  }
}

function isSimplePrompt(prompt) {
  // Simple heuristics
  return prompt.length < 100 &&
    !prompt.includes('analyze') &&
    !prompt.includes('generate');
}
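processWithEdgeAI and processWithCloudAI aren't defined above; here is a hedged sketch of what they might look like, assuming the /api/chat endpoint used earlier for the cloud path and the MessageChannel pattern from Section 4.1 for the edge path (prompt encoding for the local model is omitted):

// Hypothetical cloud path: the same /api/chat endpoint used elsewhere
async function processWithCloudAI(prompt) {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  return res.json();
}

// Hypothetical edge path: ask the service worker's local model to infer.
// Real code would first tokenize/encode the prompt into model-ready input.
function processWithEdgeAI(prompt) {
  return new Promise((resolve, reject) => {
    const channel = new MessageChannel();
    channel.port1.onmessage = (event) => {
      if (event.data.type === 'PREDICTION') resolve(event.data.result);
      else reject(new Error('Edge inference failed'));
    };
    navigator.serviceWorker.controller.postMessage(
      { type: 'INFER', input: prompt },
      [channel.port2]
    );
  });
}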

5. Push Notifications for AI Completions
5.1 Requesting Permission
Request notification permission and subscribe to push:
async function requestNotificationPermission() {
  const permission = await Notification.requestPermission();
  if (permission === 'granted') {
    const registration = await navigator.serviceWorker.ready;
    const subscription = await registration.pushManager.subscribe({
      userVisibleOnly: true,
      applicationServerKey: urlBase64ToUint8Array(VAPID_PUBLIC_KEY),
    });
    // Send subscription to server
    await fetch('/api/push/subscribe', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(subscription),
    });
  }
}
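The urlBase64ToUint8Array helper used above isn't defined in the snippet; it's the standard conversion from a URL-safe base64 VAPID key to the Uint8Array that pushManager.subscribe expects:

function urlBase64ToUint8Array(base64String) {
  // Restore padding, swap URL-safe characters back, then decode byte by byte
  const padding = '='.repeat((4 - (base64String.length % 4)) % 4);
  const base64 = (base64String + padding).replace(/-/g, '+').replace(/_/g, '/');
  const rawData = atob(base64);
  return Uint8Array.from(rawData, (char) => char.charCodeAt(0));
}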
// In the service worker
self.addEventListener('push', (event) => {
  const data = event.data.json();
  const options = {
    body: data.message,
    icon: '/icon-192.png',
    badge: '/badge-72.png',
    data: data.url,
  };
  event.waitUntil(
    self.registration.showNotification(data.title, options)
  );
});

self.addEventListener('notificationclick', (event) => {
  event.notification.close();
  event.waitUntil(
    clients.openWindow(event.notification.data || '/')
  );
});
6. Complete PWA Implementation
Here’s a complete PWA setup for an AI application: the manifest plus a service worker that combines the patterns above. Remember to reference the manifest from your HTML head with a <link rel="manifest" href="/manifest.json"> tag so the browser can install the app:
// manifest.json
{
  "name": "AI Chat PWA",
  "short_name": "AI Chat",
  "description": "Offline-capable AI chat application",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#667eea",
  "icons": [
    {
      "src": "/icon-192.png",
      "sizes": "192x192",
      "type": "image/png"
    },
    {
      "src": "/icon-512.png",
      "sizes": "512x512",
      "type": "image/png"
    }
  ]
}
// service-worker.js (complete)
const CACHE_NAME = 'ai-app-v1';
const API_CACHE = 'ai-api-v1';

// Install: Cache static assets
self.addEventListener('install', (event) => {
  event.waitUntil(
    caches.open(CACHE_NAME).then((cache) => {
      return cache.addAll([
        '/',
        '/index.html',
        '/app.js',
        '/styles.css',
      ]);
    })
  );
  self.skipWaiting();
});

// Activate: Clean old caches, then take control of open pages
self.addEventListener('activate', (event) => {
  event.waitUntil(
    caches.keys().then((cacheNames) => {
      return Promise.all(
        cacheNames
          .filter((name) => name !== CACHE_NAME && name !== API_CACHE)
          .map((name) => caches.delete(name))
      );
    }).then(() => self.clients.claim())
  );
});

// Fetch: Network-first for API, cache-first for assets
self.addEventListener('fetch', (event) => {
  const { request } = event;
  // The Cache API only stores GET requests; let POSTs pass through untouched
  if (request.method !== 'GET') {
    return;
  }
  // API requests: Network-first
  if (request.url.includes('/api/')) {
    event.respondWith(
      fetch(request)
        .then((response) => {
          if (response.ok) {
            const clone = response.clone();
            caches.open(API_CACHE).then((cache) => {
              cache.put(request, clone);
            });
          }
          return response;
        })
        .catch(() => {
          return caches.match(request).then((cached) => {
            return cached || new Response(
              JSON.stringify({ error: 'Offline' }),
              { status: 503, headers: { 'Content-Type': 'application/json' } }
            );
          });
        })
    );
  } else {
    // Static assets: Cache-first
    event.respondWith(
      caches.match(request).then((cached) => {
        return cached || fetch(request).then((response) => {
          return caches.open(CACHE_NAME).then((cache) => {
            cache.put(request, response.clone());
            return response;
          });
        });
      })
    );
  }
});

// Background sync
self.addEventListener('sync', (event) => {
  if (event.tag === 'sync-requests') {
    event.waitUntil(syncQueuedRequests());
  }
});

// Push notifications
self.addEventListener('push', (event) => {
  const data = event.data.json();
  event.waitUntil(
    self.registration.showNotification(data.title, {
      body: data.body,
      icon: '/icon-192.png',
    })
  );
});

7. Best Practices: Lessons from Production
After building multiple PWA AI applications, here are the practices I follow:
- Cache strategically: Don’t cache everything—cache what matters
- Use network-first for AI: Always try network first, cache as fallback
- Queue requests intelligently: Only queue what makes sense offline
- Implement background sync: Sync queued requests automatically
- Show offline indicators: Users need to know when they’re offline (a minimal indicator sketch follows this list)
- Use IndexedDB for queues: More reliable than localStorage
- Test offline scenarios: Test with network throttling and offline mode
- Update service workers carefully: Don’t break existing users
- Monitor cache sizes: Don’t exceed storage quotas (the sketch below checks usage with navigator.storage.estimate())
- Provide offline fallbacks: Show cached data or helpful messages
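To make the offline-indicator and quota points concrete, here is a minimal sketch; the offline-banner element ID is an assumption about your markup, and navigator.storage.estimate() is the standard way to compare usage against quota:

// Minimal offline indicator: toggle a banner element on connectivity changes
function updateOnlineStatus() {
  document.getElementById('offline-banner').hidden = navigator.onLine;
}
window.addEventListener('online', updateOnlineStatus);
window.addEventListener('offline', updateOnlineStatus);
updateOnlineStatus();

// Cache-size monitoring: compare usage against the browser's storage quota
async function checkStorageQuota() {
  if (navigator.storage && navigator.storage.estimate) {
    const { usage, quota } = await navigator.storage.estimate();
    console.log(`Using ${(usage / 1e6).toFixed(1)} MB of ${(quota / 1e6).toFixed(1)} MB`);
    return usage / quota; // fraction of quota consumed
  }
  return 0;
}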

8. Common Mistakes to Avoid
I’ve made these mistakes so you don’t have to:
- Caching everything: Don’t cache large AI models or responses unnecessarily
- Ignoring cache limits: Browsers have storage quotas—respect them
- Not handling service worker updates: Updates can break functionality (see the update-detection sketch after this list)
- Forgetting offline indicators: Users need feedback about connectivity
- Not testing offline: Offline behavior is different—test it
- Queueing everything: Some requests don’t make sense offline
- Ignoring push notification permissions: Request permission at the right time
- Not cleaning old caches: Old caches waste storage
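For the update-handling pitfall in particular, here is a minimal detection sketch. Note that the earlier snippets call skipWaiting() unconditionally in install for simplicity; this sketch assumes a worker that instead waits for an explicit SKIP_WAITING message, and the confirm() prompt stands in for your own UI:

// Reload once the new worker actually takes control of the page
navigator.serviceWorker.addEventListener('controllerchange', () => {
  window.location.reload();
});

navigator.serviceWorker.register('/sw.js').then((registration) => {
  registration.addEventListener('updatefound', () => {
    const newWorker = registration.installing;
    newWorker.addEventListener('statechange', () => {
      // 'installed' with an existing controller means an update is waiting
      if (newWorker.state === 'installed' && navigator.serviceWorker.controller) {
        if (confirm('A new version is available. Reload now?')) {
          newWorker.postMessage({ type: 'SKIP_WAITING' });
        }
      }
    });
  });
});

// Matching service worker side:
// self.addEventListener('message', (event) => {
//   if (event.data.type === 'SKIP_WAITING') self.skipWaiting();
// });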
9. Conclusion
Progressive Web Apps transform AI applications from network-dependent to resilient, offline-capable experiences. With service workers, caching, and background sync, you can build AI applications that work anywhere, anytime.
The key is strategic caching, intelligent queuing, and graceful offline handling. Get these right, and your AI application will feel native, fast, and reliable—even when connectivity is poor.
🎯 Key Takeaway
PWAs make AI applications resilient. With service workers, you can cache responses, queue requests, and sync in the background. The result: AI applications that work offline, load faster, and feel native. Strategic caching and intelligent queuing are the keys to great offline AI experiences.