Blog
Long-form thinking on AI engineering, system design, and building products people actually use.
LLM Observability: Building Eval Pipelines That Actually Catch Problems
Logging prompts and responses is not observability. Here is how to build eval pipelines that surface hallucinations, semantic drift, and cost spikes before your users do.
Django vs FastAPI for AI Backends: A Decision Framework
After shipping AI products with both, here is the honest breakdown — when Django's batteries-included approach wins, when FastAPI's async-first design is the right call, and how to hybridize.
Vector Database Showdown: Pinecone vs Weaviate vs Chroma in 2025
Benchmarked all three in production RAG workloads. The winner depends entirely on your query patterns, budget, and ops maturity — not the benchmark charts.
Multi-Agent Systems in Production: LangGraph Patterns That Actually Work
State machines for LLMs are powerful but surprisingly tricky to operationalize. Graph patterns, error-recovery designs, and human-in-the-loop integrations that held up under real load.
Building RAG Pipelines at Scale: Lessons from Production
What nobody tells you about retrieval-augmented generation when you move from prototype to production: chunking strategies, re-ranking, eval loops, and the surprising cost of naive embeddings.
AI-Native Product Architecture: Beyond the ChatGPT Wrapper
A framework for building products where AI is the core, not a feature bolted on. LLM routing, fallback chains, observability, and cost control at scale.
Engineering Leadership in Remote-First Indian Startups
Nine years of hard-won lessons on building high-performing distributed teams — hiring for ownership, async-first culture, and why velocity is a lagging indicator.
LangChain vs LlamaIndex in 2025: A Pragmatic Comparison
After building production systems with both, here is where each framework genuinely shines, and where it will slow you down.
Kubernetes for ML Workloads: A Practical Playbook
GPU node pools, spot instance strategies, model serving with vLLM, and the autoscaling configuration that cut our inference costs by 65%.
The System Design Interview: What Interviewers Actually Want
After conducting 200+ mock interviews, I have noticed the same patterns. This is what separates strong candidates from exceptional ones.