Progressive Web Apps (PWAs) for AI: Offline-First LLM Applications

Expert Guide to Building Offline-Capable AI Applications with Service Workers

I’ve built AI applications that work offline, and I can tell you: it’s not just about caching—it’s about rethinking how AI applications work. When users lose connectivity, they shouldn’t lose their work. When they’re on slow networks, they shouldn’t wait forever. PWAs make AI applications resilient, fast, and reliable.

In this guide, I’ll show you how to build offline-first AI applications using Progressive Web App technologies. You’ll learn service worker patterns, caching strategies, and how to handle AI workloads when connectivity is unreliable.

What You’ll Learn

  • Service worker patterns for AI applications
  • Caching strategies for LLM responses and models
  • Offline-first architecture patterns
  • Background sync for queued AI requests
  • Edge AI integration with service workers
  • Push notifications for AI completions
  • Real-world examples from production PWAs
  • Common pitfalls and how to avoid them

Introduction: Why PWAs for AI?

Traditional AI applications are completely dependent on network connectivity. Lose connection, lose functionality. But users expect more. They want AI applications that work on planes, in tunnels, and on unreliable networks.

Progressive Web Apps solve this. With service workers, caching, and background sync, you can build AI applications that work offline, sync when online, and feel native. I’ve built several offline-capable AI apps, and the user feedback has been overwhelmingly positive.

Key benefits of PWAs for AI:

  • Offline functionality: Continue working without connectivity
  • Faster load times: Cached assets load instantly
  • Better UX: Native app-like experience
  • Background sync: Queue requests, sync when online
  • Push notifications: Notify users when AI tasks complete

Figure 1: PWA Architecture for AI Applications

1. Understanding Service Workers for AI

1.1 What Service Workers Do

Service workers are JavaScript files that run in the background, separate from your main application. They can intercept network requests, cache responses, and work offline.

For AI applications, service workers enable:

  • Caching of AI responses for offline access
  • Queueing requests when offline
  • Background sync when connectivity returns
  • Push notifications for AI completions

1.2 Service Worker Lifecycle

Understanding the lifecycle is crucial:

// Service worker registration
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js')
    .then((registration) => {
      console.log('SW registered:', registration);
    })
    .catch((error) => {
      console.error('SW registration failed:', error);
    });
}

// Service worker file (sw.js)
self.addEventListener('install', (event) => {
  // Cache essential assets
  event.waitUntil(
    caches.open('ai-app-v1').then((cache) => {
      return cache.addAll([
        '/',
        '/index.html',
        '/app.js',
        '/styles.css',
      ]);
    })
  );
  self.skipWaiting(); // Activate immediately
});

self.addEventListener('activate', (event) => {
  // Clean up old caches, then take control of open clients.
  // Note: clients.claim() must run inside waitUntil to be effective.
  event.waitUntil(
    caches.keys().then((cacheNames) => {
      return Promise.all(
        cacheNames
          .filter((name) => name !== 'ai-app-v1')
          .map((name) => caches.delete(name))
      ).then(() => self.clients.claim());
    })
  );
});

2. Caching Strategies for AI Applications

2.1 Cache-First for Static Assets

For static assets (HTML, CSS, JS), use cache-first:

self.addEventListener('fetch', (event) => {
  if (event.request.destination === 'script' || 
      event.request.destination === 'style') {
    event.respondWith(
      caches.match(event.request).then((response) => {
        return response || fetch(event.request).then((fetchResponse) => {
          // Don't cache error responses
          if (!fetchResponse.ok) return fetchResponse;
          return caches.open('ai-app-v1').then((cache) => {
            cache.put(event.request, fetchResponse.clone());
            return fetchResponse;
          });
        });
      })
    );
  }
});

2.2 Network-First for AI Responses

For AI responses, try network first, fall back to cache:

// Note: the Cache API only stores GET request/response pairs, so this
// strategy applies to GET endpoints. POST chat requests go through the
// IndexedDB queue in Section 3 instead.
self.addEventListener('fetch', (event) => {
  if (event.request.url.includes('/api/chat') &&
      event.request.method === 'GET') {
    event.respondWith(
      fetch(event.request)
        .then((response) => {
          // Cache successful responses
          if (response.ok) {
            const clone = response.clone();
            caches.open('ai-responses-v1').then((cache) => {
              cache.put(event.request, clone);
            });
          }
          return response;
        })
        .catch(() => {
          // Fall back to cache if the network fails
          return caches.match(event.request).then((cachedResponse) => {
            if (cachedResponse) {
              return cachedResponse;
            }
            // Return an explicit offline response
            return new Response(
              JSON.stringify({ error: 'Offline', message: 'Cached response unavailable' }),
              { status: 503, headers: { 'Content-Type': 'application/json' } }
            );
          });
        })
    );
  }
});
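
Because the Cache API only stores GET pairs, offline access to past chat completions (which usually arrive over POST) needs its own store. Here is a minimal sketch using IndexedDB via the idb library; the database name, store schema, and prompt-as-key choice are illustrative assumptions:

import { openDB } from 'idb';

// Open (and create on first use) a store for chat completions, keyed by prompt
const responseDB = openDB('ai-response-cache', 1, {
  upgrade(db) {
    db.createObjectStore('responses', { keyPath: 'prompt' });
  },
});

// Save a completion so it can be replayed offline
async function cacheChatResponse(prompt, body) {
  const db = await responseDB;
  await db.put('responses', { prompt, body, cachedAt: Date.now() });
}

// Look up a previously cached completion for this prompt
async function getCachedChatResponse(prompt) {
  const db = await responseDB;
  return db.get('responses', prompt);
}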

2.3 Stale-While-Revalidate for Conversation History

For conversation history, show cached data immediately, then update:

self.addEventListener('fetch', (event) => {
  if (event.request.url.includes('/api/conversations')) {
    event.respondWith(
      caches.open('conversations-v1').then((cache) => {
        return cache.match(event.request).then((cachedResponse) => {
          const fetchPromise = fetch(event.request)
            .then((networkResponse) => {
              if (networkResponse.ok) {
                cache.put(event.request, networkResponse.clone());
              }
              return networkResponse;
            })
            .catch(() => cachedResponse); // avoid unhandled rejections when offline
          // Return cached immediately, update in background
          return cachedResponse || fetchPromise;
        });
      })
    );
  }
});

Figure 2: Caching Strategies for AI Applications

3. Offline-First Architecture

3.1 Queue Requests When Offline

Queue AI requests when offline, sync when online:

// In your main app (uses the idb library: npm install idb)
import { openDB } from 'idb';

async function sendAIRequest(prompt) {
  if (!navigator.onLine) {
    // Queue the request, then ask the service worker to sync it later
    await queueRequest({ prompt, timestamp: Date.now() });
    const registration = await navigator.serviceWorker.ready;
    if ('sync' in registration) {
      await registration.sync.register('sync-ai-requests');
    }
    return { queued: true, message: 'Request queued for when online' };
  }
  
  // Normal request
  return fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  }).then((res) => res.json());
}

// IndexedDB for the queue
const db = await openDB('ai-queue', 1, {
  upgrade(db) {
    db.createObjectStore('requests', { keyPath: 'id', autoIncrement: true });
  },
});

async function queueRequest(request) {
  const tx = db.transaction('requests', 'readwrite');
  await tx.store.add(request);
  await tx.done;
}

3.2 Background Sync

Sync queued requests when connectivity returns:

// In the service worker (idb must be bundled or pulled in via importScripts)
self.addEventListener('sync', (event) => {
  if (event.tag === 'sync-ai-requests') {
    event.waitUntil(syncQueuedRequests());
  }
});

async function syncQueuedRequests() {
  const db = await openDB('ai-queue');
  const requests = await db.getAll('requests');
  
  for (const request of requests) {
    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt: request.prompt }),
      });
      
      if (response.ok) {
        // Remove the request from the queue
        await db.delete('requests', request.id);
        
        // Notify the user
        self.registration.showNotification('AI Response Ready', {
          body: 'Your queued request has been processed',
        });
      }
    } catch (error) {
      console.error('Sync failed:', error);
    }
  }
}

// Register background sync (in the page; guard for browser support)
navigator.serviceWorker.ready.then((registration) => {
  if ('sync' in registration) {
    return registration.sync.register('sync-ai-requests');
  }
});
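
Background Sync is not available in every browser (Safari and Firefox lack it at the time of writing), so a fallback is worth having. A minimal sketch: the page asks the service worker to flush the queue when the online event fires. The FLUSH_QUEUE message type is a convention I am assuming here, handled by the syncQueuedRequests function above:

// In the page: flush the queue manually when connectivity returns
window.addEventListener('online', () => {
  navigator.serviceWorker.controller?.postMessage({ type: 'FLUSH_QUEUE' });
});

// In the service worker: honor the flush request
self.addEventListener('message', (event) => {
  if (event.data && event.data.type === 'FLUSH_QUEUE') {
    event.waitUntil(syncQueuedRequests());
  }
});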

4. Edge AI Integration

4.1 Running Models in Service Workers

Service workers can run lightweight AI models for offline inference:

// In the service worker: load TensorFlow.js (classic service workers
// pull in dependencies with importScripts)
importScripts('https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js');

let model = null;

self.addEventListener('message', async (event) => {
  if (event.data.type === 'LOAD_MODEL') {
    // Load a lightweight model
    model = await tf.loadLayersModel('/models/lightweight-model.json');
    event.source.postMessage({ type: 'MODEL_LOADED' });
  }
  
  if (event.data.type === 'INFER' && model) {
    // Wrap the raw input in a tensor before predicting
    const input = tf.tensor(event.data.input);
    const prediction = model.predict(input);
    event.source.postMessage({ 
      type: 'PREDICTION', 
      result: Array.from(await prediction.data()),
    });
  }
});

// In the main app. Note: a service worker is not a dedicated Worker, so
// we message it through navigator.serviceWorker, not new Worker('/sw.js').
navigator.serviceWorker.addEventListener('message', (event) => {
  if (event.data.type === 'PREDICTION') {
    // Use prediction
    updateUI(event.data.result);
  }
});

navigator.serviceWorker.ready.then((registration) => {
  registration.active.postMessage({ type: 'LOAD_MODEL' });
});

4.2 Hybrid Approach

Use edge AI for simple tasks, cloud AI for complex ones:

async function processAIRequest(prompt) {
  // Check if offline
  if (!navigator.onLine) {
    // Use edge AI model
    return processWithEdgeAI(prompt);
  }
  
  // Check prompt complexity
  if (isSimplePrompt(prompt)) {
    // Use edge AI (faster, free)
    return processWithEdgeAI(prompt);
  } else {
    // Use cloud AI (more capable)
    return processWithCloudAI(prompt);
  }
}

function isSimplePrompt(prompt) {
  // Simple heuristics
  return prompt.length < 100 && 
         !prompt.includes('analyze') && 
         !prompt.includes('generate');
}
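
The two helpers above are deliberately abstract. Here is one way they might look, assuming the edge path talks to the service worker model from Section 4.1 and the cloud path hits the /api/chat endpoint; encodePrompt is a hypothetical preprocessing step for the on-device model:

function processWithEdgeAI(prompt) {
  // Ask the service worker model for a prediction and resolve with the
  // first PREDICTION message that comes back.
  return new Promise((resolve) => {
    const onMessage = (event) => {
      if (event.data.type === 'PREDICTION') {
        navigator.serviceWorker.removeEventListener('message', onMessage);
        resolve(event.data.result);
      }
    };
    navigator.serviceWorker.addEventListener('message', onMessage);
    navigator.serviceWorker.ready.then((registration) => {
      // encodePrompt is a hypothetical tokenizer for the on-device model
      registration.active.postMessage({ type: 'INFER', input: encodePrompt(prompt) });
    });
  });
}

async function processWithCloudAI(prompt) {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  return res.json();
}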

Figure 3: Offline-First Patterns and Background Sync

5. Push Notifications for AI Completions

5.1 Requesting Permission

Request notification permission and subscribe to push:

async function requestNotificationPermission() {
  const permission = await Notification.requestPermission();
  if (permission === 'granted') {
    const registration = await navigator.serviceWorker.ready;
    const subscription = await registration.pushManager.subscribe({
      userVisibleOnly: true,
      applicationServerKey: urlBase64ToUint8Array(VAPID_PUBLIC_KEY),
    });
    
    // Send subscription to server
    await fetch('/api/push/subscribe', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(subscription),
    });
  }
}
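
// Standard helper: convert the base64url-encoded VAPID public key into
// the Uint8Array that pushManager.subscribe() expects.
function urlBase64ToUint8Array(base64String) {
  const padding = '='.repeat((4 - (base64String.length % 4)) % 4);
  const base64 = (base64String + padding)
    .replace(/-/g, '+')
    .replace(/_/g, '/');
  const rawData = atob(base64);
  return Uint8Array.from(rawData, (char) => char.charCodeAt(0));
}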

// In service worker
self.addEventListener('push', (event) => {
  const data = event.data.json();
  const options = {
    body: data.message,
    icon: '/icon-192.png',
    badge: '/badge-72.png',
    data: data.url,
  };
  
  event.waitUntil(
    self.registration.showNotification(data.title, options)
  );
});

self.addEventListener('notificationclick', (event) => {
  event.notification.close();
  event.waitUntil(
    clients.openWindow(event.notification.data || '/')
  );
});
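
The other half of push is the server: something has to send the notification when a long-running AI task finishes. A minimal Node sketch using the web-push library; notifyTaskComplete and the environment variable names are assumptions for illustration:

const webpush = require('web-push');

webpush.setVapidDetails(
  'mailto:admin@example.com',
  process.env.VAPID_PUBLIC_KEY,
  process.env.VAPID_PRIVATE_KEY
);

// Called when an AI task finishes; `subscription` is the object the
// client posted to /api/push/subscribe.
async function notifyTaskComplete(subscription, resultUrl) {
  await webpush.sendNotification(
    subscription,
    JSON.stringify({
      title: 'AI Response Ready',
      message: 'Your request has finished processing',
      url: resultUrl,
    })
  );
}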

6. Complete PWA Implementation

Here’s a complete, production-ready PWA setup for AI applications:

// manifest.json
{
  "name": "AI Chat PWA",
  "short_name": "AI Chat",
  "description": "Offline-capable AI chat application",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#667eea",
  "icons": [
    {
      "src": "/icon-192.png",
      "sizes": "192x192",
      "type": "image/png"
    },
    {
      "src": "/icon-512.png",
      "sizes": "512x512",
      "type": "image/png"
    }
  ]
}
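
One detail the manifest needs to take effect: it must be referenced from your HTML, typically with <link rel="manifest" href="/manifest.json"> in the page head.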

// service-worker.js (complete)
const CACHE_NAME = 'ai-app-v1';
const API_CACHE = 'ai-api-v1';

// Install: Cache static assets
self.addEventListener('install', (event) => {
  event.waitUntil(
    caches.open(CACHE_NAME).then((cache) => {
      return cache.addAll([
        '/',
        '/index.html',
        '/app.js',
        '/styles.css',
      ]);
    })
  );
  self.skipWaiting();
});

// Activate: Clean old caches, then take control of open pages
self.addEventListener('activate', (event) => {
  event.waitUntil(
    caches.keys().then((cacheNames) => {
      return Promise.all(
        cacheNames
          .filter((name) => name !== CACHE_NAME && name !== API_CACHE)
          .map((name) => caches.delete(name))
      ).then(() => self.clients.claim());
    })
  );
});

// Fetch: Network-first for API, cache-first for assets
// (the Cache API only stores GET, so POST AI requests skip caching and
// rely on the background-sync queue instead)
self.addEventListener('fetch', (event) => {
  const { request } = event;
  
  if (request.method !== 'GET') {
    return; // let non-GET requests (e.g., POST chat calls) hit the network
  }
  
  // API requests: Network-first
  if (request.url.includes('/api/')) {
    event.respondWith(
      fetch(request)
        .then((response) => {
          if (response.ok) {
            const clone = response.clone();
            caches.open(API_CACHE).then((cache) => {
              cache.put(request, clone);
            });
          }
          return response;
        })
        .catch(() => {
          return caches.match(request).then((cached) => {
            return cached || new Response(
              JSON.stringify({ error: 'Offline' }),
              { status: 503, headers: { 'Content-Type': 'application/json' } }
            );
          });
        })
    );
  } else {
    // Static assets: Cache-first
    event.respondWith(
      caches.match(request).then((cached) => {
        return cached || fetch(request).then((response) => {
          if (!response.ok) return response; // don't cache errors
          return caches.open(CACHE_NAME).then((cache) => {
            cache.put(request, response.clone());
            return response;
          });
        });
      })
    );
  }
});

// Background sync (same tag as registered in Section 3.2;
// syncQueuedRequests is defined there)
self.addEventListener('sync', (event) => {
  if (event.tag === 'sync-ai-requests') {
    event.waitUntil(syncQueuedRequests());
  }
});

// Push notifications
self.addEventListener('push', (event) => {
  const data = event.data.json();
  event.waitUntil(
    self.registration.showNotification(data.title, {
      body: data.body,
      icon: '/icon-192.png',
    })
  );
});

7. Best Practices: Lessons from Production

After building multiple PWA AI applications, here are the practices I follow:

  1. Cache strategically: Don’t cache everything—cache what matters
  2. Use network-first for AI: Always try network first, cache as fallback
  3. Queue requests intelligently: Only queue what makes sense offline
  4. Implement background sync: Sync queued requests automatically
  5. Show offline indicators: Users need to know when they’re offline (see the sketch after this list)
  6. Use IndexedDB for queues: More reliable than localStorage
  7. Test offline scenarios: Test with network throttling and offline mode
  8. Update service workers carefully: Don’t break existing users
  9. Monitor cache sizes: Don’t exceed storage quotas (the sketch below shows how to check usage)
  10. Provide offline fallbacks: Show cached data or helpful messages
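
Two of these practices are quick to wire up: an offline indicator driven by the online/offline events, and cache-size monitoring via the StorageManager API. A minimal sketch; the banner element and the eviction policy are illustrative:

// Offline indicator: toggle a banner on connectivity changes
function updateOnlineStatus() {
  const banner = document.getElementById('offline-banner'); // assumed element
  banner.hidden = navigator.onLine;
}
window.addEventListener('online', updateOnlineStatus);
window.addEventListener('offline', updateOnlineStatus);
updateOnlineStatus();

// Cache size monitoring: check usage against the browser's quota
async function checkStorage() {
  if (navigator.storage && navigator.storage.estimate) {
    const { usage, quota } = await navigator.storage.estimate();
    console.log(`Using ${(usage / 1024 / 1024).toFixed(1)} MB of ${(quota / 1024 / 1024).toFixed(1)} MB`);
    if (usage / quota > 0.8) {
      // Getting close to quota: evict old AI response caches
      await caches.delete('ai-responses-v1');
    }
  }
}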

8. Common Mistakes to Avoid

I’ve made these mistakes so you don’t have to:

  • Caching everything: Don’t cache large AI models or responses unnecessarily
  • Ignoring cache limits: Browsers have storage quotas—respect them
  • Not handling service worker updates: Updates can break functionality
  • Forgetting offline indicators: Users need feedback about connectivity
  • Not testing offline: Offline behavior is different—test it
  • Queueing everything: Some requests don’t make sense offline
  • Ignoring push notification permissions: Request permission at the right time
  • Not cleaning old caches: Old caches waste storage

9. Conclusion

Progressive Web Apps transform AI applications from network-dependent to resilient, offline-capable experiences. With service workers, caching, and background sync, you can build AI applications that work anywhere, anytime.

The key is strategic caching, intelligent queuing, and graceful offline handling. Get these right, and your AI application will feel native, fast, and reliable—even when connectivity is poor.

🎯 Key Takeaway

PWAs make AI applications resilient. With service workers, you can cache responses, queue requests, and sync in the background. The result: AI applications that work offline, load faster, and feel native. Strategic caching and intelligent queuing are the keys to great offline AI experiences.

