I’ve been compressing four different claims into the word “done,” and the compression is where the lies hide. That’s this week’s realization.
A requirement can be declared (the artifact exists), wired (its links resolve to real code and tests), correctly-meant (the test proves what was actually asked, not something conveniently weaker), and demonstrated (the verifier passed on this commit, with mutation testing as the backstop against tests that assert nothing).
The first two compute as exact static facts. The fourth needed work: the tool now ingests test results, because a test’s existence is not evidence that it passes, plus mutation evidence. The third is where I had to be honest with myself. Judging meaning isn’t reliably mechanizable today, and the numbers in the literature are bad. So the tool only routes suspicious links to a human. It flags. It never decides.
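To keep myself from re-compressing the four claims, here is a minimal sketch of how they could sit side by side as separate fields. This is not the tool's actual data model: RequirementStatus, Meaning, and every field and method name below are hypothetical, and "mutation_killed" is my simplification of whatever mutation evidence is really ingested. The one thing the sketch tries to get right is the asymmetry: the machine path can only flag, and only a human path can confirm.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Meaning(Enum):
    UNREVIEWED = "unreviewed"
    FLAGGED = "flagged"      # the tool routed this link to a human
    CONFIRMED = "confirmed"  # only a human reviewer can set this


@dataclass
class RequirementStatus:
    """Hypothetical record keeping the four claims apart."""
    declared: bool = False                 # the artifact exists
    wired: bool = False                    # links resolve to real code and tests
    meaning: Meaning = Meaning.UNREVIEWED  # does the test prove what was asked?
    verified_commit: Optional[str] = None  # commit the verifier last passed on
    mutation_killed: bool = False          # assumed flag: linked tests caught a mutant
    confirmed_by: Optional[str] = None

    def flag_for_review(self) -> None:
        # Machine path: may flag, never confirm, never undo a human decision.
        if self.meaning is not Meaning.CONFIRMED:
            self.meaning = Meaning.FLAGGED

    def confirm_meaning(self, reviewer: str, is_human: bool) -> None:
        # Human path: the only way to reach CONFIRMED.
        if not is_human:
            raise PermissionError("only a human reviewer may confirm meaning")
        self.meaning = Meaning.CONFIRMED
        self.confirmed_by = reviewer

    def demonstrated(self, head_commit: str) -> bool:
        # A pass on an older commit does not count for this one.
        return self.verified_commit == head_commit and self.mutation_killed

    def done(self, head_commit: str) -> bool:
        return (
            self.declared
            and self.wired
            and self.meaning is Meaning.CONFIRMED
            and self.demonstrated(head_commit)
        )
```

With this shape, a requirement that is declared, wired, and demonstrated still reports done as False until a human has confirmed the meaning, which is the point.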
Steering for agent fleets landed too. “lattice next” points at the weakest link, leases keep concurrent agents on disjoint slices, and an append-only ledger records who moved what, with what evidence. Autonomy is a dial now. But no setting lets a machine judge meaning.
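A rough sketch of the steering pieces, for my own reference. It is an in-memory toy, not how the tool does it: LeaseTable, Ledger, LedgerEntry, and next_weakest are hypothetical names, and I am assuming "weakest" can be reduced to an integer score per requirement (say, how many of the four claims currently hold). Real leases would presumably need expiry and a shared store, and none of that is modeled here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class LedgerEntry:
    agent: str        # who moved it
    requirement: str  # what was moved
    action: str       # e.g. "wired", "demonstrated"
    evidence: str     # pointer to the evidence, e.g. a test path or commit


class Ledger:
    """Append-only: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: List[LedgerEntry] = []

    def append(self, entry: LedgerEntry) -> None:
        self._entries.append(entry)

    def entries(self) -> Tuple[LedgerEntry, ...]:
        # Return an immutable snapshot so callers cannot rewrite history.
        return tuple(self._entries)


class LeaseTable:
    """At most one agent holds a given requirement at a time."""

    def __init__(self) -> None:
        self._holder: Dict[str, str] = {}

    def acquire(self, agent: str, requirement: str) -> bool:
        holder = self._holder.get(requirement)
        if holder is not None and holder != agent:
            return False  # another agent owns this slice
        self._holder[requirement] = agent
        return True

    def release(self, agent: str, requirement: str) -> None:
        if self._holder.get(requirement) == agent:
            del self._holder[requirement]


def next_weakest(scores: Dict[str, int], leases: LeaseTable, agent: str) -> Optional[str]:
    """Lease the lowest-scoring requirement this agent can get, or None."""
    # Ties break by name so the choice is deterministic.
    for req in sorted(scores, key=lambda r: (scores[r], r)):
        if leases.acquire(agent, req):
            return req
    return None
```

Two agents calling next_weakest against the same table end up on different requirements, and anything they then do would be recorded with Ledger.append rather than by mutating shared state.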
Future me: the meaning ceiling is the honest core of this thing. Don’t let anyone talk you into automating it.