IoT is not all about Cloud

Recently, I took part in discussions on several tech forums and found that many people have misconceptions about IoT and Cloud. Some think that anything like blinking an LED with a Raspberry Pi or Arduino counts as IoT. I thought I would share some of my views on these terms. Internet of Things (IoT) – refers to […]


LLM Latency Optimization: Techniques for Sub-Second Response Times

Introduction: LLM latency is the silent killer of user experience. Even the most accurate model becomes frustrating when users wait seconds for each response. The challenge is that LLM inference is inherently slow: autoregressive generation produces tokens one at a time, and each token depends on all previous tokens. This guide covers practical techniques for reducing perceived and actual latency: streaming responses […]
