Technology Engineering – Page 25 – C4: Container, Code, Cloud & Context

LLM Output Parsing: Extracting Structured Data from Free-Form Text

Posted on September 10, 2021 by Nithin Mohan TK · 15 min read

Introduction: LLMs generate text, but applications need structured data—JSON objects, lists, specific formats. The gap between free-form text and usable data structures is where output parsing comes in. Naive approaches using regex or string splitting break constantly as models vary their output format. Robust parsing requires multiple strategies: format instructions that guide the model, extraction […]

Prompt Compression: Fitting More Context into Your Token Budget

Posted on August 5, 2021 by Nithin Mohan TK · 11 min read

Introduction: Context windows are precious real estate. Every token you spend on context is a token you can’t use for output or additional information. Long prompts hit token limits, increase latency, and cost more money. Prompt compression techniques help you fit more information into less space without losing the signal that matters. This guide covers […]

Multi-Modal LLM Integration: Building Applications with Vision Capabilities

Posted on July 1, 2021 by Nithin Mohan TK · 13 min read

Introduction: Modern LLMs understand more than text. GPT-4V, Claude 3, and Gemini can process images alongside text, enabling applications that reason across modalities. Building multi-modal applications requires handling image encoding, managing mixed-content prompts, and designing interactions that leverage visual understanding. This guide covers practical patterns for integrating vision capabilities: encoding images for API calls, building […]

LLM Evaluation Metrics: Measuring Quality Beyond Human Intuition

Posted on June 1, 2021 by Nithin Mohan TK · 14 min read

Introduction: How do you know if your LLM application is working well? Subjective assessment doesn’t scale, and traditional NLP metrics often miss what matters for generative AI. Effective evaluation requires multiple approaches: reference-based metrics that compare against gold standards, semantic similarity that measures meaning preservation, and LLM-as-judge techniques that leverage AI to assess AI. This […]

Conversation History Management: Building Memory for Multi-Turn AI Applications

Posted on May 1, 2021 by Nithin Mohan TK · 13 min read

Introduction: Chatbots and conversational AI need memory. Without conversation history, every message exists in isolation—the model can’t reference what was said before, follow up on previous topics, or maintain coherent multi-turn dialogues. But history management is tricky: context windows are limited, old messages may be irrelevant, and naive approaches quickly hit token limits. This guide […]

Semantic Caching: Reducing LLM Costs with Meaning-Based Query Matching

Posted on April 1, 2021 by Nithin Mohan TK · 13 min read

Introduction: LLM API calls are expensive and slow. When users ask similar questions, you’re paying for the same computation repeatedly. Traditional caching doesn’t help because queries are rarely identical: “What’s the weather?” and “Tell me the weather” are different strings but should return the same cached response. Semantic caching solves this by matching queries based on […]
