Function Calling Deep Dive: Building LLM-Powered Tools and Agents

Introduction: Function calling transforms LLMs from text generators into action-taking agents. Instead of just describing what to do, the model can actually do it—query databases, call APIs, execute code, and interact with external systems. OpenAI’s function calling (now called “tools”) and similar features from Anthropic and others let you define available functions, and the model […]

Read more →

Structured Output from LLMs: JSON Mode, Function Calling, and Instructor

Introduction: Getting LLMs to return structured data instead of free-form text is essential for building reliable applications. Whether you need JSON for API responses, typed objects for downstream processing, or specific formats for data extraction, structured output techniques ensure consistency and parseability. This guide covers the major approaches: JSON mode, function calling, the Instructor library, […]

Read more →

Streaming LLM Responses: Building Real-Time AI Applications

Introduction: Waiting 10-30 seconds for an LLM response feels like an eternity. Streaming changes everything—users see tokens appear in real-time, creating the illusion of instant response even when generation takes just as long. Beyond UX, streaming enables early termination (stop generating when you have enough), progressive processing (start working with partial responses), and better error […]

Read more →

Structured Output from LLMs: JSON Mode, Function Calling, and Pydantic Patterns

Introduction: Getting reliable, structured data from LLMs is one of the most practical challenges in building AI applications. Whether you’re extracting entities from text, generating API parameters, or building data pipelines, you need JSON that actually parses and validates against your schema. This guide covers the evolution of structured output techniques—from prompt engineering hacks to […]

Read more →

GPT-4 Turbo and the OpenAI Assistants API: Building Production Conversational AI Systems

Introduction: OpenAI’s DevDay 2023 marked a pivotal moment in AI development with the announcement of GPT-4 Turbo and the Assistants API. These releases fundamentally changed how developers build AI-powered applications, offering 128K context windows, native JSON mode, improved function calling, and persistent conversation threads. After integrating these capabilities into production systems, I’ve found that the […]

Read more →

Document Processing with LLMs: Parsing, Chunking, and Extraction for Enterprise Applications

Introduction: Processing documents with LLMs unlocks powerful capabilities: extracting structured data from unstructured text, summarizing lengthy reports, answering questions about document content, and transforming documents between formats. However, effective document processing requires more than just sending text to an LLM—it demands careful parsing, intelligent chunking, and strategic prompting. This guide covers practical document processing patterns: […]

Read more →