Vectors & Verbs by Siddhesh Inamdar

Exploring the intersection of human creativity and machine intelligence.

Probabilistic Report Cards: LLM Evaluation Metrics

From N-Grams to LLM-as-a-Judge: A deep dive into the evolution of evaluation metrics.

January 21, 2026 · 8 min

USB-C of AI Space: The Model Context Protocol

MCP is the open standard for connecting AI models to data and tools. Discover how Anthropic’s new protocol solves the $N \times M$ integration problem, creating a plug-and-play ecosystem for AI agents.

January 15, 2026 · 7 min

Electronic Executives: RAG, ReAct and MCP

A deep dive into the cognitive architectures of modern AI agents, exploring Retrieval-Augmented Generation (RAG), the ReAct reasoning pattern, and the Model Context Protocol (MCP).

January 15, 2026 · 6 min

The Need For Speed: KV Cache and memory optimization at Inference

An introduction to KV Caching and its role in optimizing Transformer inference.

January 13, 2026 · 7 min

Crafty Patchwork: Parameter-Efficient Fine-Tuning

An introduction to Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA, QLoRA, and more.

January 12, 2026 · 7 min

Mission Impossible: Fitting Trillion-Parameter Giants into 80GB GPUs

An introduction to optimizations for Large Language Models, covering GPU utilization, precision control, and memory management.

January 11, 2026 · 8 min

Anatomy of Trillion-Parameter Switchboards: Understanding Feedforward Blocks

Exploring the hidden layers of trillion-parameter switchboards: Feedforward Neural Networks and Activation Functions.

January 10, 2026 · 5 min

Attention Is All You Need, but exactly which one?: MHA, GQA and MLA

We are about to touch the holy grail of modern AI. From the original 2017 paper to DeepSeek’s MLA, how has the definition of ‘Attention’ transformed?

January 10, 2026 · 7 min

The Geometry of Meaning: Sine, ALiBi, RoPE, and HoPE

From Sinusoidal to RoPE and HoPE: How Transformers learn to process word order and sequence length.

January 9, 2026 · 7 min

The Global Accountant and the Subword Surgeon: Decoding GloVe and FastText

How global co-occurrence matrices and subword units allow GloVe and FastText, respectively, to capture nuances that Word2Vec missed.

January 8, 2026 · 5 min