Shape of thought

The scarce thing in discovery is not a stronger generator or a sharper judge, but the structure that brings the right move within reach. The shape of the library in the head.

For a while I thought the obstacle was that the model is frozen, its parameters fixed the moment training ends, the way a mind that cannot learn cannot find anything new. But discovery does not happen in the synapses either. Mathematics grew for twenty-three centuries on hardware that did not change, in the layer of notation and proof outside any single head. So grant that the freezing is not the barrier, because what grows lives outside the model anyway. Then generation is not the problem either. A model generates freely, and given enough time a generator produces anything at all, the way the old room of typewriters eventually types out every book. Make the typist not random but superhumanly able, faster than we can picture, fluent in everything ever written. It still discovers nothing, and why it discovers nothing is the whole of what I want to work out.

The fast typist is the proof that sheer capability is not the scarce thing. It has unlimited capability, in the sense that matters here, and all it produces is noise, only faster. Speed against an empty target is just more noise per second. So whatever is missing, it is not power, and a more powerful generator does not supply it. This is worth sitting with, because almost all of the public argument about these systems is an argument about power, about scale and how much the model can do, and the typist says plainly that you could have all the power there is and discover nothing with it.

The natural place to look next is judgment. If generation is cheap and most of it is noise, then what is scarce must be the thing that tells the good from the noise, the verifier, the judge that scores an output and keeps the ones worth keeping. This is not a straw position. It is very close to where the most serious current work sits, in reward models and graded rewards and the long search for signals a system can be trained against. And it is partly right. A discovery is not just a new thing, it is a new thing that turns out to hold, and holding is a verdict, something outside the move that rules on it. So judgment is real, and it is part of the answer, and for a while I took it for most of the answer.

It is not, and seeing why is the turn the rest of this rests on.

A perfect judge is useless on a space it cannot search. Put the best verifier you can imagine at the end of the line, one that never errs on whether a candidate is good, and then ask where the candidates come from. They come from the generator, sampling a space so large that the few worth judging never arrive. The judge sits with an empty in-tray, flawless and idle. The proposals it would have approved are out there, in a space whose sheer size is the actual difficulty, and nothing in the verifier reaches into that space to bring the right one close. So the scarcity was never at the end of the line, where the judging happens. It was at the start of the line, in what gets proposed at all. The proposal space is effectively unbounded, almost all of it noise, and the question that decides whether anything is ever found is which vanishingly small part of it you even look at. You can watch the field circle this without quite naming it. The current effort trains models against rewards that can be checked, and the unresolved argument about that work is exactly whether the checkable reward expands what a model can reach or only makes it sample more efficiently from what it could already reach. If the verifier were the scarce thing, adding it would plainly widen the reach. That it might only re-weight the reachable is the quiet admission that the reach was set somewhere else.

The clearest proof that this is the real difficulty, and not the judging, is the system everyone points to and, I think, mostly misreads. When a program first played Go better than the best people, the lesson most took was that the machine could now evaluate positions better than human intuition. But an evaluator, on its own, is helpless in Go, because the space holds on the order of ten to the one hundred seventieth power legal positions, far more than the atoms in the visible universe, and a judge that scores any single position perfectly still cannot score them all. What made the thing work was not the evaluation. It was a network that, given a board, narrowed the moves to the few worth considering, cutting the breadth of the search from the couple hundred moves a position allows to a handful, and a tree search that could then look hard down those few. The evaluator pruned the depth, the proposer pruned the breadth, and the proposer was the part that turned an unsearchable space into a searchable one. The famous move, the one in the second game against Lee Sedol that the commentators first took for a mistake, was a move the system’s own model of human play rated at about one in ten thousand, and the point is not that the machine judged it good. The point is that the machine’s search could reach it at all, could keep, inside the small set it took seriously, a move that human priors had all but ruled out, and then back it. Reaching it was the achievement. And the part of the story that tends to fall away is that this proposer was cut to Go, trained on Go, shaped to the one board, and it transfers to nothing else. There is no general version of it. The method that makes a space searchable has to be cut to the shape of the space.
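
The arithmetic behind pruning the breadth is easy to check. What follows is only a rough sketch, assuming a search that looks a fixed number of moves ahead and weighs the same number of candidate moves at every position. The figures of 200 and 5 moves and the depth of 10 are illustrative assumptions of mine, not numbers taken from any real system, and `lines_to_examine` is a name invented for the example.

```python
def lines_to_examine(breadth: int, depth: int) -> int:
    """Count the distinct move sequences in a uniform tree: breadth ** depth."""
    return breadth ** depth


# Illustrative assumptions, not measurements from any real system.
FULL_BREADTH = 200    # "a couple hundred" candidate moves per position
PRUNED_BREADTH = 5    # "a handful" kept by a proposer
DEPTH = 10            # moves of lookahead

full = lines_to_examine(FULL_BREADTH, DEPTH)
pruned = lines_to_examine(PRUNED_BREADTH, DEPTH)

print(f"unpruned: {full:.3e} sequences")
print(f"pruned:   {pruned:,} sequences")
print(f"ratio:    {full // pruned:.3e}")
```

Under those assumptions the unpruned tree holds about ten to the twenty-third sequences and the pruned one just under ten million. No improvement in how well a single position is scored closes a gap of that size, which is the sense in which the narrowing, not the judging, is what makes the search possible.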

So the question becomes what that narrowing is made of, the thing that decides which tiny part of an unsearchably large space is worth looking at. In Go it was a trained network. For thought in general it is something less obvious and, once you see it, hard to unsee. It is the structure of what you already know. A flat heap of facts gives no reach at all, because from any point every next step is as likely as any other and the space stays unsearchably large in every direction. But knowledge held in the right structure is not a heap. It fixes what counts as a move from where you stand, so that only a few next moves are worth trying and the rest fall away, and the search has somewhere to go. The arrangement of what you know is a prior over what to try next. How the knowledge is organized is not a matter of tidiness, and not a matter of how fast you can retrieve from it. It is the thing that decides what is reachable, which is to say what can be discovered from where you happen to stand.

What sets that prior is the relation that actually binds a domain, the thing that makes one piece lead to the next, and that relation changes from one domain to the next. In mathematics one result implies another, and the move worth trying is the one your current results let you prove. Implication runs in a single direction, each step resting on the one below it, and the structure that fits a relation like that is a tower. Taste has no implication anywhere in it. Nothing about having loved one spare and unhurried record obliges you to love the next, and what stands in for implication is bare position, one thing sitting beside another, so the move worth trying is whatever lies nearest to what you already love. That relation runs both ways and points nowhere in particular, and what it asks for is not a tower but a space of distances, with no up, no down, and nothing to climb. Other domains are bound other ways again. The events of a history run on causation laid out in time, where the move is to the thing the present makes likely, and the relation, like implication, runs forward, though through the world rather than through proof. A literature runs on reference, each work resting on the ones it draws from, a web of directed and uneven links where the move worth trying is back down into whatever the thing in front of you is standing on. The relations are as different as that, and a structure earns its keep only when it is built from the one its domain actually has.

You can feel why by handing a domain the wrong relation. Build taste as if it ran on implication, a tower of rules where loving this obliges you to love that, and the search hunts for what follows, and because nothing follows it invents rules that were never there while the thing that actually mattered, the one sitting quietly beside what you have, never makes the list. Build mathematics as if it ran on resemblance, where the next thing is whatever looks like the last, and the search turns up look-alikes that share a surface and prove nothing, while the result that genuinely follows sits far outside it, because it resembled none of what you held. The wrong relation does not slow the search down. It aims the search away from the move that mattered, which is the same as making that move undiscoverable, and no capability rescues it, because capability was never the scarce thing. The failure is quiet, too. The system still answers. It just answers along the wrong relation, felt only as a slow inability to get anywhere new.

Which puts the human discoverer in a different light than the one we usually use. We tend to credit the discoverer with sharper judgment or a faster mind, and to picture the work as the moment of selection, the recognizing of the good idea when it comes. But the rarer thing the discoverer has is not the judgment at the end. It is the structure in the head that put the good idea within a move in the first place, the years of a field loaded in until its true moves come first and its false leads fall away, so that the move that looks like a leap from outside was, from inside that structure, the short and almost obvious step into the next room. That structure, the one that makes the right proposal salient in a space where it should have been invisible, is most of what we have been calling taste. Taste is not a refined act of judgment. It is the map that brings the good move within reach. It is what the trained mathematician has that the fast typist does not, and it is not judgment and it is not power. It is the shape of the library in the head.

What I cannot settle is whether there is one such shape for thought or only many, each cut to its domain and transferring to nothing, the way the Go proposer transferred to nothing. If it is the second, the real work of discovery is not a stronger generator or a sharper judge but the finding of the right structure for a domain, the arrangement that puts its good moves within reach, and that work is slow and particular and mostly unbuilt, and whoever manages it for a domain holds the thing that makes the domain searchable at all. Under that sits a harder question, one I would rather not have reached. If taste is a structure loaded from a field rather than authored, then the discoverer who feels himself reaching freely into the unknown is walking the short local steps of a map he did not draw and cannot see. The rare move, the one that lands where the map’s own moves could not have led, is then either an illusion or the only real discovery there is, and from inside the map the two look exactly the same.