Change logs
2026-06
- Published: Extra: Model Distillation - pouring big-model behavior into smaller models
- Published: 13: AI Native Product Design - making probabilistic systems feel reliable
- Published: 12: LLM Engineering - KV cache, inference cost, and deployment systems
- Published: 11: Agents - from chatbots to task-execution systems
- Published: 10: Tool Use - from saying things to doing things
- Published: "Understanding LLMs from First Principles" 06: Scaling Laws and Emergence, 07: Inference and Generation, 08: The Nature of Hallucination, and 09: RAG
- Updated: gave the "Understanding LLMs from First Principles" series a full polish pass: tightened mechanism accuracy and technical details, standardized and completed section illustrations, and added series entrance cards to the home page
2026-05
- Published: 01: The First Principle of LLMs: Token Prediction
- Updated: 01: The First Principle of LLMs: Token Prediction - added 5 section illustrations
- Published: 02: Token and Embedding: How Language Becomes Numbers
- Updated: 02: Token and Embedding: How Language Becomes Numbers - added 5 section illustrations
- Published: 03: Transformer and Attention: How Models "See" Context
- Updated: 03: Transformer and Attention: How Models "See" Context - added 5 explanatory diagrams
- Published: 04: Language as Compression of the World: Why Prediction Can Become Intelligence
- Updated: 04: Language as Compression of the World: Why Prediction Can Become Intelligence - added 5 explanatory diagrams
- Published: 05: Pretraining, Fine-tuning, and Alignment: From Continuation Machine to Assistant - bundled with 6 explanatory diagrams
- Published: The Math Behind LLM Pricing 01: How Inference Actually Works
- Published: The Math Behind LLM Pricing 02: Writing Inference as Equations - bundled with an interactive T_compute / T_memory simulator
- Published: The Math Behind LLM Pricing 03: From Inference Latency to Inference Cost
- Published: The Math Behind LLM Pricing 04: Cracking Open the KV Cache
- Updated: The Math Behind LLM Pricing 03: From Inference Latency to Inference Cost - added a Cost / Token interactive simulator
- Published: The Math Behind LLM Pricing 05: From One GPU to a Cluster - Parallelism and Interconnect
- Updated: Hello, World! - added a "Recently Published" section listing the latest posts
2026-01
- Published: Vibe Coding
2025-12
- Published: Intent Recognition
2025-06
- Site upgraded to Nextra-4
- Updated: User Value
- Published: Case Study: Yuque's Consumer Product Line
2024-04
- Updated: User Value
- Updated: RAG Intro
2024-03
- Published: RAG (Retrieval-Augmented Generation) Practice Sharing
- Published: Annotation Reply
2024-02
- i18n Support
- Published: Software as a Service
- Published: User Value
- Published: The 7 Questions of Product Design
- Structure Adjusted
- Updated site's domain: https://insights.kaho.io
- Published: Dictionary
- Published: Japan Journey Gallery
2024-01
- Site Up! Hello, world!
- Published: 🇯🇵 Japan Journey