Probabilistic Report Cards: LLM Evaluation Metrics

From N-Grams to LLM-as-a-Judge: A deep dive into the evolution of evaluation metrics.

January 21, 2026 · 8 min

Electronic Executives: RAG, ReAct and MCP

A deep dive into the cognitive architectures of modern AI agents, exploring Retrieval-Augmented Generation (RAG), the ReAct reasoning pattern, and the Model Context Protocol (MCP).

January 15, 2026 · 6 min

The Need for Speed: KV Cache and Memory Optimization at Inference

An introduction to KV Caching and its role in optimizing Transformer inference.

January 13, 2026 · 7 min

Crafty Patchwork: Parameter-Efficient Fine-Tuning

An introduction to Parameter-Efficient Fine-Tuning (PEFT) techniques such as LoRA and QLoRA.

January 12, 2026 · 7 min

Attention Is All You Need, But Exactly Which One? MHA, GQA, and MLA

Touching the holy grail of modern AI: from the original 2017 paper to DeepSeek’s MLA, how has the definition of ‘Attention’ transformed?

January 10, 2026 · 7 min

Semantic Alchemy: Cracking Word2Vec with CBOW and Skip-Gram

Understanding the mathematics behind Word2Vec’s CBOW and Skip-Gram architectures, and how they map language into vector space.

January 7, 2026 · 12 min

The DNA of Language: A Deep Dive into LLM Tokenization Concepts

A comprehensive guide to tokenization strategies: BPE, WordPiece, Unigram, and SentencePiece.

January 5, 2026 · 14 min