Writing

Blog

Long-form thinking on AI engineering, system design, and building products people actually use.

LLM · Observability · Production
Jan 15, 2026·10 min read

LLM Observability: Building Eval Pipelines That Actually Catch Problems

Logging prompts and responses is not observability. Here is how to build eval pipelines that surface hallucinations, semantic drift, and cost spikes before your users do.

Django · FastAPI · Architecture
Oct 22, 2025·8 min read

Django vs FastAPI for AI Backends: A Decision Framework

After shipping AI products with both, here is the honest breakdown — when Django's batteries-included approach wins, when FastAPI's async-first design is the right call, and how to hybridize.

Vector DB · RAG · Production
Aug 8, 2025·9 min read

Vector Database Showdown: Pinecone vs Weaviate vs Chroma in 2025

Benchmarked all three in production RAG workloads. The winner depends entirely on your query patterns, budget, and ops maturity — not the benchmark charts.

Multi-Agent · LangGraph · LLM
May 19, 2025·12 min read

Multi-Agent Systems in Production: LangGraph Patterns That Actually Work

State machines for LLMs are powerful and surprisingly tricky to operationalize. Graph patterns, error-recovery designs, and human-in-the-loop integrations that held up under real load.

RAG · LLM · Production
Mar 28, 2025·11 min read

Building RAG Pipelines at Scale: Lessons from Production

What nobody tells you about retrieval-augmented generation when you move from prototype to production: chunking strategies, re-ranking, eval loops, and the surprising cost of naive embeddings.

Architecture · LLM · Product
Feb 14, 2025·9 min read

AI-Native Product Architecture: Beyond the ChatGPT Wrapper

A framework for building products where AI is the core, not a feature bolted on. LLM routing, fallback chains, observability, and cost control at scale.

Leadership · Culture · Startups
Jan 5, 2025·8 min read

Engineering Leadership in Remote-First Indian Startups

Nine years of hard-won lessons on building high-performing distributed teams — hiring for ownership, async-first culture, and why velocity is a lagging indicator.

LangChain · LlamaIndex · RAG
Dec 12, 2024·7 min read

LangChain vs LlamaIndex in 2025: A Pragmatic Comparison

After building production systems with both, here is where each framework genuinely shines — and where they will slow you down.

Kubernetes · MLOps · Infrastructure
Nov 3, 2024·13 min read

Kubernetes for ML Workloads: A Practical Playbook

GPU node pools, spot instance strategies, model serving with vLLM, and the autoscaling configuration that cut our inference costs by 65%.

Interviews · System Design · Career
Oct 15, 2024·10 min read

The System Design Interview: What Interviewers Actually Want

After conducting 200+ mock interviews, I keep seeing the same patterns. Here is what separates strong candidates from exceptional ones.
