Hello, World!
This is a casual blog about AI, product design, and travel.
- Built with Nextra
- Deployed via Vercel
- Hosted on GitHub
- On PC, site development is done with VSCode, and Markdown content is edited with Typora
- On iPad, Working Copy is used to connect with the GitHub repository for content management, and Tiao is used for Markdown editing
This setup is essentially free (I bought a domain for a better experience), stable, and fast to access both domestically and internationally. It's enjoyable to read on both PC and mobile. Git provides multi-device synchronization and version management, and writing on iPad is a pleasure. Overall, I've found a solution I'm quite satisfied with.
This site will be updated from time to time with insights from a product manager's work, tool/product experience sharing, travel stories, and more. I hope it can be helpful or inspiring to you ❤️
Series Entrances
Recently Published
- 12: LLM Engineering: KV Cache, Inference Cost, and Deployment Systems
A first-principles explanation of prefill, decode, KV cache, long context, batching, inference cost, observability, and deployment architecture for scalable LLM products.
- 11: Agents: From Chatbots to Task-Execution Systems
A first-principles explanation of agents through goals, state, planning, tool loops, memory, failure recovery, stop conditions, and engineering boundaries.
- 10: Tool Use: From Saying Things to Doing Things
A first-principles explanation of tool schemas, structured calls, execution loops, permission boundaries, read/write risk, and how Tool Use differs from RAG.
- 09: RAG: Attaching Traceable Knowledge to Models
A first-principles explanation of RAG, grep-style tool search, retrieval, reranking, context assembly, citations, and evidence-grounded generation.
- 08: The Nature of Hallucination: Why Models Confidently Make Things Up
Why LLMs hallucinate, from first principles: next-token prediction, lossy compression, knowledge boundaries, and the systems that keep fabrication in check.
- 07: Inference and Generation: Temperature, Context Window, and Token-by-Token Output
A first-principles explanation of LLM inference, sampling, temperature, top-p, context windows, stop conditions, and why models generate one token at a time.







