
Inside a four-model agent economy built on small LLMs

Thousand Token Wood v2 turns a multi-agent economy into a playable finance drama: you act as the Patron, lending, bribing, and shorting while woodland creatures scheme back. The twist is that every creature runs a different lab's small model (OpenAI gpt-oss-20b, OpenBMB MiniCPM3-4B, NVIDIA Nemotron-Mini-4B, and a fine-tuned Qwen 0.5B), which gives the market genuine heterogeneity. The author shares practical engineering lessons, from getting vLLM to serve all four models with a single CUDA image change to building a fault-tolerant JSON parser that reduces adding a model to a config entry. He also explains how a prompt firewall and automated token scanning enforce information security, and how bounded sentiment summaries give small-model agents persistent memory without context bloat.
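To make the fault-tolerant JSON parsing idea concrete, here is a minimal Python sketch of one way such a parser could work. It is not the author's implementation: the function names `extract_first_json_object` and `parse_action`, the `action`/`amount` fields, and the fallback behaviour are all assumptions made for illustration. The idea is to scan a model's raw reply for the first span that decodes as a JSON object, and to fall back to a safe default when nothing usable is found, so that chatty or malformed output from any one model cannot crash the simulation.

```python
import json
from typing import Optional

_DECODER = json.JSONDecoder()


def extract_first_json_object(text: str) -> Optional[dict]:
    """Return the first JSON object embedded anywhere in `text`, or None.

    Small models often wrap JSON in prose or code fences, so instead of
    calling json.loads on the whole reply we try to decode starting at
    each '{' until one succeeds.
    """
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores whatever trailing text follows it.
            value, _end = _DECODER.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


def parse_action(raw_reply: str, fallback: dict) -> dict:
    """Parse an agent reply into an action dict (hypothetical schema).

    Missing keys are filled from `fallback`; an unparseable reply yields
    a copy of `fallback` unchanged.
    """
    parsed = extract_first_json_object(raw_reply)
    if parsed is None:
        return dict(fallback)
    return {**fallback, **parsed}
```

With a parser of this shape, a reply such as `'Sure! {"action": "lend", "amount": 5} Hope that helps.'` and a reply that contains no JSON at all both produce a well-formed action, which is the property that would let a new model be added without model-specific parsing code.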
