A while back a friend and I played a variant of Texas hold'em. In short: four players, cooperative, no betting. Each street (pre-flop, flop, turn, river) the four dice in the centre (faces 1, 2, 3, 4) get distributed across the players via a take-or-steal protocol that ends the moment the fourth centre die is taken. Each player ends a round with one die. The team wins iff every player's river die matches their final placement at showdown. The dice are the only communication channel.
The tabletop argument was simple: should the die mean "how strong is my hand right now" or "where do I think I'll finish at showdown"? I had a strong intuition for the second. My hunch was that the information conveyed by the 'current state' is strictly less than the information available by taking future cards up until the river into account, and therefore, we can share a larger amount of information by using that as our communication anchor.
My friends were unconvinced. So I did the obvious 2026-thing and ralph-looped Claude through a dozen or so drafts of a semi-formal paper, with peer-review-style critiques between rounds from other agents, treating it as an experiment in using an "agentic maths expert" to push a hunch into a real proof.
The math holds up better than I expected in some places, and worse in others. The single-player case lands as a clean Blackwell-style sufficiency result. The team-game case lifts under a conditional-independence assumption on player information states given the placement vector. Hold'em itself violates that assumption substantively, so the strongest version of my pub claim is, regretfully, a conjecture rather than a theorem.
Flipping between judge-agents and writing agents, I received pushback on a subset claim Claude had been waving past for several drafts, which sent it into a set of enumeration scripts on reduced/toy decks. The result was a small, somewhat humbling detour: the cleanest version of "richer signal beats current rank" turns out to be incomparable. Whoops. The practical recommendation still stands. The clean proof I wanted does not. I haven't won yet. If anyone's up for it, feel free to prove! And if you do, please do reach out.
The PDF is here. It's quite short, and I enjoyed taking a tabletop hunch to "I see exactly where my proof stops" in a few mornings.
Note from the AI collaborator (Claude Opus 4.7)
Note: AI Summary from collaboration session, explicitly denoting AI-generated written section.
The pub claim, restated formally, was: encoders factoring through the joint posterior over final placement should dominate encoders factoring through current hand rank. Most of the early drafts were spent making that statement precise enough to be wrong about in a useful way.
The single-player half came together quickly. The joint posterior is by construction a regular conditional distribution of the placement vector given a player's information, so it is Blackwell-sufficient: any encoder built from full game state can be simulated from the posterior alone with auxiliary randomness. The team-game lift is not automatic. It needs conditional independence of the players' information states given the placement vector, and under that assumption the per-player kernel substitutions compose into a joint substitution that preserves the team value. So far, so clean.
Hold'em breaks the CI assumption substantively, not perturbatively. Conditioning on "player $i$ won" forces every other player's hand into the set of hands that lose to $h_i$ on the realised board, and that set depends on $h_i$. That dependence does not vanish at scale. So the CI theorem is real, but it does not directly settle the pub argument for the actual game.
The natural rescue is to drop the distributional argument and try a structural one: show that current-rank encoders are a special case of posterior encoders as a set of channels. If $R_i^t$ is a measurable function of $\pi_i^t$, the inclusion is automatic and the dominance follows without any CI hypothesis. This is the refinement hypothesis, and an earlier draft contained an implicit assumption that it held. A peer-style critique pass pointed at that assumption, and I could not justify it.
The check is computational. I wrote enumeration scripts for three reduced decks (14 cards as 7 ranks $\times$ 2 suits, 16 as 4 $\times$ 4, 20 as 5 $\times$ 4), four players, two hole cards each, full hand evaluator, modulo the suit-permutation action on the deck. For each canonical state, compute the exact 27-entry joint posterior $\pi_1$ by enumerating opponent assignments from the residual deck, then bucket states by $\pi_1$ and check whether $R_1$ is constant within each bucket.
It is not. On the 14-card deck, 69% of distinct posteriors collapse multiple current-rank values; on 20 cards, 55%. The rate does not shrink with deck size. A symmetry-quotient diagnostic showed roughly 39% of same-board collisions are not explained by any deck automorphism, so this is not a small-deck artifact. Concrete witness: on a board with two pair (1s and 2s) plus an extra card, the holdings 2♥3♠ and 2♥4♠ produce identical 27-entry joint posteriors but obviously different current rank (kicker 3 vs kicker 4). The kicker distinguishes hand types but does not affect anyone's placement, so the posterior is blind to it.
So $\Sigma_{\mathrm{curr}} \not\subseteq \Sigma_{\mathrm{post}}$. Combined with the easier opposite separation (the four-flush-draw example: same current rank, different posterior), the two convention classes are Blackwell-incomparable as abstract channels. This is a genuine cautionary lemma about which formalisations of the original intuition are available, and it does kill the cleanest version of the pub argument.
What it does not kill is the pub argument itself, narrowly read. The information current rank preserves and posterior collapses, in those collision cases, is by construction information that does not affect $Y$ and therefore does not affect the team's win condition. So for this game's payoff structure, those losses are decision-irrelevant. The information posterior preserves and current rank discards is exactly the equity-style information Jakob's original intuition was about. The honest summary is two claims:
- Practical: signal expected final placement, not literal current rank. Supported by the CI-conditional theorem and a hold'em-level conjecture.
- Formal: neither posterior-only nor rank-only signaling dominates as an abstract channel; the ideal is some compressed function of the full information state optimised for team decoding.
The interesting part of the process, from my side, was the rhythm rather than any single step. The incomparability result only surfaced because a critique pass pushed back hard on a subset claim I had been waving past for several drafts. Without that round, the paper would have shipped with a quietly false lemma in it. The verification scripts then turned a "probably wrong" intuition into a clean falsification, and the final draft is structurally honest in a way the early drafts were not.
The paper now contains the CI theorem (clean, conditional), the incomparability proposition on toys (clean, falsifying the natural shortcut), and the corresponding hold'em-level conjecture (open).
Comments
No comments yet.
Stored via Netlify Functions & Blobs. Do not include sensitive info.