Progressive Web Apps (PWAs) for AI: Offline-First LLM Applications

Expert Guide to Building Offline-Capable AI Applications with Service Workers

I’ve built AI applications that work offline, and I can tell you: it’s not just about caching—it’s about rethinking how AI applications work. When users lose connectivity, they shouldn’t lose their work. When they’re on slow networks, they shouldn’t wait forever. PWAs make AI applications resilient, fast, and reliable.

In this guide, I’ll show you how to build offline-first AI applications using Progressive Web App technologies. You’ll learn service worker patterns, caching strategies, and how to handle AI workloads when connectivity is unreliable.

What You’ll Learn

  • Service worker patterns for AI applications
  • Caching strategies for LLM responses and models
  • Offline-first architecture patterns
  • Background sync for queued AI requests
  • Edge AI integration with service workers
  • Push notifications for AI completions
  • Real-world examples from production PWAs
  • Common pitfalls and how to avoid them

Introduction: Why PWAs for AI?

Traditional AI applications are completely dependent on network connectivity. Lose connection, lose functionality. But users expect more. They want AI applications that work on planes, in tunnels, and on unreliable networks.

Progressive Web Apps solve this. With service workers, caching, and background sync, you can build AI applications that work offline, sync when online, and feel native. I’ve built several offline-capable AI apps, and the user feedback has been overwhelmingly positive.

Key benefits of PWAs for AI:

  • Offline functionality: Continue working without connectivity
  • Faster load times: Cached assets load instantly
  • Better UX: Native app-like experience
  • Background sync: Queue requests, sync when online
  • Push notifications: Notify users when AI tasks complete

Figure 1: PWA Architecture for AI Applications

1. Understanding Service Workers for AI

1.1 What Service Workers Do

Service workers are JavaScript files that run in the background, separate from your main application. They can intercept network requests, cache responses, and work offline.

For AI applications, service workers enable:

  • Caching of AI responses for offline access
  • Queueing requests when offline
  • Background sync when connectivity returns
  • Push notifications for AI completions

1.2 Service Worker Lifecycle

Understanding the lifecycle is crucial:

// Service worker registration
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js')
    .then((registration) => {
      console.log('SW registered:', registration);
    })
    .catch((error) => {
      console.error('SW registration failed:', error);
    });
}

// Service worker file (sw.js)
self.addEventListener('install', (event) => {
  // Cache essential assets
  event.waitUntil(
    caches.open('ai-app-v1').then((cache) => {
      return cache.addAll([
        '/',
        '/index.html',
        '/app.js',
        '/styles.css',
      ]);
    })
  );
  self.skipWaiting(); // Activate immediately
});

self.addEventListener('activate', (event) => {
  // Clean up old caches, then take control of open clients.
  // Note: clients.claim() must run inside waitUntil to be effective.
  event.waitUntil(
    caches.keys().then((cacheNames) => {
      return Promise.all(
        cacheNames
          .filter((name) => name !== 'ai-app-v1')
          .map((name) => caches.delete(name))
      ).then(() => self.clients.claim());
    })
  );
});

2. Caching Strategies for AI Applications

2.1 Cache-First for Static Assets

For static assets (HTML, CSS, JS), use cache-first:

self.addEventListener('fetch', (event) => {
  if (event.request.destination === 'script' || 
      event.request.destination === 'style') {
    event.respondWith(
      caches.match(event.request).then((response) => {
        return response || fetch(event.request).then((fetchResponse) => {
          // Don't cache error responses
          if (!fetchResponse.ok) return fetchResponse;
          return caches.open('ai-app-v1').then((cache) => {
            cache.put(event.request, fetchResponse.clone());
            return fetchResponse;
          });
        });
      })
    );
  }
});

2.2 Network-First for AI Responses

For AI responses, try network first, fall back to cache:

// Note: the Cache API only stores GET request/response pairs, so this
// strategy applies to GET endpoints. POST chat requests go through the
// IndexedDB queue in Section 3 instead.
self.addEventListener('fetch', (event) => {
  if (event.request.url.includes('/api/chat') &&
      event.request.method === 'GET') {
    event.respondWith(
      fetch(event.request)
        .then((response) => {
          // Cache successful responses
          if (response.ok) {
            const clone = response.clone();
            caches.open('ai-responses-v1').then((cache) => {
              cache.put(event.request, clone);
            });
          }
          return response;
        })
        .catch(() => {
          // Fall back to cache if the network fails
          return caches.match(event.request).then((cachedResponse) => {
            if (cachedResponse) {
              return cachedResponse;
            }
            // Return an explicit offline response
            return new Response(
              JSON.stringify({ error: 'Offline', message: 'Cached response unavailable' }),
              { status: 503, headers: { 'Content-Type': 'application/json' } }
            );
          });
        })
    );
  }
});
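
Because the Cache API only stores GET pairs, offline access to past chat completions (which usually arrive over POST) needs its own store. Here is a minimal sketch using IndexedDB via the idb library; the database name, store schema, and prompt-as-key choice are illustrative assumptions:

import { openDB } from 'idb';

// Open (and create on first use) a store for chat completions, keyed by prompt
const responseDB = openDB('ai-response-cache', 1, {
  upgrade(db) {
    db.createObjectStore('responses', { keyPath: 'prompt' });
  },
});

// Save a completion so it can be replayed offline
async function cacheChatResponse(prompt, body) {
  const db = await responseDB;
  await db.put('responses', { prompt, body, cachedAt: Date.now() });
}

// Look up a previously cached completion for this prompt
async function getCachedChatResponse(prompt) {
  const db = await responseDB;
  return db.get('responses', prompt);
}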

2.3 Stale-While-Revalidate for Conversation History

For conversation history, show cached data immediately, then update:

self.addEventListener('fetch', (event) => {
  if (event.request.url.includes('/api/conversations')) {
    event.respondWith(
      caches.open('conversations-v1').then((cache) => {
        return cache.match(event.request).then((cachedResponse) => {
          const fetchPromise = fetch(event.request)
            .then((networkResponse) => {
              if (networkResponse.ok) {
                cache.put(event.request, networkResponse.clone());
              }
              return networkResponse;
            })
            .catch(() => cachedResponse); // avoid unhandled rejections when offline
          // Return cached immediately, update in background
          return cachedResponse || fetchPromise;
        });
      })
    );
  }
});

Figure 2: Caching Strategies for AI Applications

3. Offline-First Architecture

3.1 Queue Requests When Offline

Queue AI requests when offline, sync when online:

// In your main app (uses the idb library: npm install idb)
import { openDB } from 'idb';

async function sendAIRequest(prompt) {
  if (!navigator.onLine) {
    // Queue the request, then ask the service worker to sync it later
    await queueRequest({ prompt, timestamp: Date.now() });
    const registration = await navigator.serviceWorker.ready;
    if ('sync' in registration) {
      await registration.sync.register('sync-ai-requests');
    }
    return { queued: true, message: 'Request queued for when online' };
  }
  
  // Normal request
  return fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  }).then((res) => res.json());
}

// IndexedDB for the queue
const db = await openDB('ai-queue', 1, {
  upgrade(db) {
    db.createObjectStore('requests', { keyPath: 'id', autoIncrement: true });
  },
});

async function queueRequest(request) {
  const tx = db.transaction('requests', 'readwrite');
  await tx.store.add(request);
  await tx.done;
}

3.2 Background Sync

Sync queued requests when connectivity returns:

// In the service worker (idb must be bundled or pulled in via importScripts)
self.addEventListener('sync', (event) => {
  if (event.tag === 'sync-ai-requests') {
    event.waitUntil(syncQueuedRequests());
  }
});

async function syncQueuedRequests() {
  const db = await openDB('ai-queue');
  const requests = await db.getAll('requests');
  
  for (const request of requests) {
    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt: request.prompt }),
      });
      
      if (response.ok) {
        // Remove the request from the queue
        await db.delete('requests', request.id);
        
        // Notify the user
        self.registration.showNotification('AI Response Ready', {
          body: 'Your queued request has been processed',
        });
      }
    } catch (error) {
      console.error('Sync failed:', error);
    }
  }
}

// Register background sync (in the page; guard for browser support)
navigator.serviceWorker.ready.then((registration) => {
  if ('sync' in registration) {
    return registration.sync.register('sync-ai-requests');
  }
});
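
Background Sync is not available in every browser (Safari and Firefox lack it at the time of writing), so a fallback is worth having. A minimal sketch: the page asks the service worker to flush the queue when the online event fires. The FLUSH_QUEUE message type is a convention I am assuming here, handled by the syncQueuedRequests function above:

// In the page: flush the queue manually when connectivity returns
window.addEventListener('online', () => {
  navigator.serviceWorker.controller?.postMessage({ type: 'FLUSH_QUEUE' });
});

// In the service worker: honor the flush request
self.addEventListener('message', (event) => {
  if (event.data && event.data.type === 'FLUSH_QUEUE') {
    event.waitUntil(syncQueuedRequests());
  }
});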

4. Edge AI Integration

4.1 Running Models in Service Workers

Service workers can run lightweight AI models for offline inference:

// In the service worker: load TensorFlow.js (classic service workers
// pull in dependencies with importScripts)
importScripts('https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js');

let model = null;

self.addEventListener('message', async (event) => {
  if (event.data.type === 'LOAD_MODEL') {
    // Load a lightweight model
    model = await tf.loadLayersModel('/models/lightweight-model.json');
    event.source.postMessage({ type: 'MODEL_LOADED' });
  }
  
  if (event.data.type === 'INFER' && model) {
    // Wrap the raw input in a tensor before predicting
    const input = tf.tensor(event.data.input);
    const prediction = model.predict(input);
    event.source.postMessage({ 
      type: 'PREDICTION', 
      result: Array.from(await prediction.data()),
    });
  }
});

// In the main app. Note: a service worker is not a dedicated Worker, so
// we message it through navigator.serviceWorker, not new Worker('/sw.js').
navigator.serviceWorker.addEventListener('message', (event) => {
  if (event.data.type === 'PREDICTION') {
    // Use prediction
    updateUI(event.data.result);
  }
});

navigator.serviceWorker.ready.then((registration) => {
  registration.active.postMessage({ type: 'LOAD_MODEL' });
});

4.2 Hybrid Approach

Use edge AI for simple tasks, cloud AI for complex ones:

async function processAIRequest(prompt) {
  // Check if offline
  if (!navigator.onLine) {
    // Use edge AI model
    return processWithEdgeAI(prompt);
  }
  
  // Check prompt complexity
  if (isSimplePrompt(prompt)) {
    // Use edge AI (faster, free)
    return processWithEdgeAI(prompt);
  } else {
    // Use cloud AI (more capable)
    return processWithCloudAI(prompt);
  }
}

function isSimplePrompt(prompt) {
  // Simple heuristics
  return prompt.length < 100 && 
         !prompt.includes('analyze') && 
         !prompt.includes('generate');
}
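
The two helpers above are deliberately abstract. Here is one way they might look, assuming the edge path talks to the service worker model from Section 4.1 and the cloud path hits the /api/chat endpoint; encodePrompt is a hypothetical preprocessing step for the on-device model:

function processWithEdgeAI(prompt) {
  // Ask the service worker model for a prediction and resolve with the
  // first PREDICTION message that comes back.
  return new Promise((resolve) => {
    const onMessage = (event) => {
      if (event.data.type === 'PREDICTION') {
        navigator.serviceWorker.removeEventListener('message', onMessage);
        resolve(event.data.result);
      }
    };
    navigator.serviceWorker.addEventListener('message', onMessage);
    navigator.serviceWorker.ready.then((registration) => {
      // encodePrompt is a hypothetical tokenizer for the on-device model
      registration.active.postMessage({ type: 'INFER', input: encodePrompt(prompt) });
    });
  });
}

async function processWithCloudAI(prompt) {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  return res.json();
}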

Figure 3: Offline-First Patterns and Background Sync

5. Push Notifications for AI Completions

5.1 Requesting Permission

Request notification permission and subscribe to push:

async function requestNotificationPermission() {
  const permission = await Notification.requestPermission();
  if (permission === 'granted') {
    const registration = await navigator.serviceWorker.ready;
    const subscription = await registration.pushManager.subscribe({
      userVisibleOnly: true,
      applicationServerKey: urlBase64ToUint8Array(VAPID_PUBLIC_KEY),
    });
    
    // Send subscription to server
    await fetch('/api/push/subscribe', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(subscription),
    });
  }
}
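
// Standard helper: convert the base64url-encoded VAPID public key into
// the Uint8Array that pushManager.subscribe() expects.
function urlBase64ToUint8Array(base64String) {
  const padding = '='.repeat((4 - (base64String.length % 4)) % 4);
  const base64 = (base64String + padding)
    .replace(/-/g, '+')
    .replace(/_/g, '/');
  const rawData = atob(base64);
  return Uint8Array.from(rawData, (char) => char.charCodeAt(0));
}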

// In service worker
self.addEventListener('push', (event) => {
  const data = event.data.json();
  const options = {
    body: data.message,
    icon: '/icon-192.png',
    badge: '/badge-72.png',
    data: data.url,
  };
  
  event.waitUntil(
    self.registration.showNotification(data.title, options)
  );
});

self.addEventListener('notificationclick', (event) => {
  event.notification.close();
  event.waitUntil(
    clients.openWindow(event.notification.data || '/')
  );
});
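
The other half of push is the server: something has to send the notification when a long-running AI task finishes. A minimal Node sketch using the web-push library; notifyTaskComplete and the environment variable names are assumptions for illustration:

const webpush = require('web-push');

webpush.setVapidDetails(
  'mailto:admin@example.com',
  process.env.VAPID_PUBLIC_KEY,
  process.env.VAPID_PRIVATE_KEY
);

// Called when an AI task finishes; `subscription` is the object the
// client posted to /api/push/subscribe.
async function notifyTaskComplete(subscription, resultUrl) {
  await webpush.sendNotification(
    subscription,
    JSON.stringify({
      title: 'AI Response Ready',
      message: 'Your request has finished processing',
      url: resultUrl,
    })
  );
}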

6. Complete PWA Implementation

Here’s a complete, production-ready PWA setup for AI applications:

// manifest.json
{
  "name": "AI Chat PWA",
  "short_name": "AI Chat",
  "description": "Offline-capable AI chat application",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#667eea",
  "icons": [
    {
      "src": "/icon-192.png",
      "sizes": "192x192",
      "type": "image/png"
    },
    {
      "src": "/icon-512.png",
      "sizes": "512x512",
      "type": "image/png"
    }
  ]
}
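
One detail the manifest needs to take effect: it must be referenced from your HTML, typically with <link rel="manifest" href="/manifest.json"> in the page head.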

// service-worker.js (complete)
const CACHE_NAME = 'ai-app-v1';
const API_CACHE = 'ai-api-v1';

// Install: Cache static assets
self.addEventListener('install', (event) => {
  event.waitUntil(
    caches.open(CACHE_NAME).then((cache) => {
      return cache.addAll([
        '/',
        '/index.html',
        '/app.js',
        '/styles.css',
      ]);
    })
  );
  self.skipWaiting();
});

// Activate: Clean old caches, then take control of open pages
self.addEventListener('activate', (event) => {
  event.waitUntil(
    caches.keys().then((cacheNames) => {
      return Promise.all(
        cacheNames
          .filter((name) => name !== CACHE_NAME && name !== API_CACHE)
          .map((name) => caches.delete(name))
      ).then(() => self.clients.claim());
    })
  );
});

// Fetch: Network-first for API, cache-first for assets
// (the Cache API only stores GET, so POST AI requests skip caching and
// rely on the background-sync queue instead)
self.addEventListener('fetch', (event) => {
  const { request } = event;
  
  if (request.method !== 'GET') {
    return; // let non-GET requests (e.g., POST chat calls) hit the network
  }
  
  // API requests: Network-first
  if (request.url.includes('/api/')) {
    event.respondWith(
      fetch(request)
        .then((response) => {
          if (response.ok) {
            const clone = response.clone();
            caches.open(API_CACHE).then((cache) => {
              cache.put(request, clone);
            });
          }
          return response;
        })
        .catch(() => {
          return caches.match(request).then((cached) => {
            return cached || new Response(
              JSON.stringify({ error: 'Offline' }),
              { status: 503, headers: { 'Content-Type': 'application/json' } }
            );
          });
        })
    );
  } else {
    // Static assets: Cache-first
    event.respondWith(
      caches.match(request).then((cached) => {
        return cached || fetch(request).then((response) => {
          if (!response.ok) return response; // don't cache errors
          return caches.open(CACHE_NAME).then((cache) => {
            cache.put(request, response.clone());
            return response;
          });
        });
      })
    );
  }
});

// Background sync (same tag as registered in Section 3.2;
// syncQueuedRequests is defined there)
self.addEventListener('sync', (event) => {
  if (event.tag === 'sync-ai-requests') {
    event.waitUntil(syncQueuedRequests());
  }
});

// Push notifications
self.addEventListener('push', (event) => {
  const data = event.data.json();
  event.waitUntil(
    self.registration.showNotification(data.title, {
      body: data.body,
      icon: '/icon-192.png',
    })
  );
});

7. Best Practices: Lessons from Production

After building multiple PWA AI applications, here are the practices I follow:

  1. Cache strategically: Don’t cache everything—cache what matters
  2. Use network-first for AI: Always try network first, cache as fallback
  3. Queue requests intelligently: Only queue what makes sense offline
  4. Implement background sync: Sync queued requests automatically
  5. Show offline indicators: Users need to know when they’re offline (see the sketch after this list)
  6. Use IndexedDB for queues: More reliable than localStorage
  7. Test offline scenarios: Test with network throttling and offline mode
  8. Update service workers carefully: Don’t break existing users
  9. Monitor cache sizes: Don’t exceed storage quotas (the sketch below shows how to check usage)
  10. Provide offline fallbacks: Show cached data or helpful messages
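
Two of these practices are quick to wire up: an offline indicator driven by the online/offline events, and cache-size monitoring via the StorageManager API. A minimal sketch; the banner element and the eviction policy are illustrative:

// Offline indicator: toggle a banner on connectivity changes
function updateOnlineStatus() {
  const banner = document.getElementById('offline-banner'); // assumed element
  banner.hidden = navigator.onLine;
}
window.addEventListener('online', updateOnlineStatus);
window.addEventListener('offline', updateOnlineStatus);
updateOnlineStatus();

// Cache size monitoring: check usage against the browser's quota
async function checkStorage() {
  if (navigator.storage && navigator.storage.estimate) {
    const { usage, quota } = await navigator.storage.estimate();
    console.log(`Using ${(usage / 1024 / 1024).toFixed(1)} MB of ${(quota / 1024 / 1024).toFixed(1)} MB`);
    if (usage / quota > 0.8) {
      // Getting close to quota: evict old AI response caches
      await caches.delete('ai-responses-v1');
    }
  }
}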

8. Common Mistakes to Avoid

I’ve made these mistakes so you don’t have to:

  • Caching everything: Don’t cache large AI models or responses unnecessarily
  • Ignoring cache limits: Browsers have storage quotas—respect them
  • Not handling service worker updates: Updates can break functionality
  • Forgetting offline indicators: Users need feedback about connectivity
  • Not testing offline: Offline behavior is different—test it
  • Queueing everything: Some requests don’t make sense offline
  • Ignoring push notification permissions: Request permission at the right time
  • Not cleaning old caches: Old caches waste storage

9. Conclusion

Progressive Web Apps transform AI applications from network-dependent to resilient, offline-capable experiences. With service workers, caching, and background sync, you can build AI applications that work anywhere, anytime.

The key is strategic caching, intelligent queuing, and graceful offline handling. Get these right, and your AI application will feel native, fast, and reliable—even when connectivity is poor.

🎯 Key Takeaway

PWAs make AI applications resilient. With service workers, you can cache responses, queue requests, and sync in the background. The result: AI applications that work offline, load faster, and feel native. Strategic caching and intelligent queuing are the keys to great offline AI experiences.

