Tournament Poker Bot — Research Notes

Methodological Note · Volume I, Issue 3

Multi-Table Tournament Automation: A Methodological Note on Mid-Stakes Cohorts

This document presents observations on bot behaviour in multi-table tournaments compiled over the 2022–2026 operational window. The sample is drawn from mid-stakes events ($5–$100 buy-in) on three principal platforms and should be interpreted within that scope. Conclusions are descriptive rather than prescriptive; readers attempting to generalise to high-stakes or heads-up SNG formats should do so with caution.

Introduction

Automated agent play in cash games has been documented extensively in prior literature [1]. The tournament setting, by contrast, has received considerably less empirical treatment. The structural asymmetries between cash and tournament play (non-linear chip-to-money mapping, stack-depth variance across a single session, escalating blind pressure, and rake-substitute payouts concentrated at the top of the field) produce a decision surface that off-the-shelf cash-game policies are poorly suited to traverse. This note collects field observations on how that mismatch manifests and where it does not.

Methodology

Logs were collected from 18,400 multi-table tournament entries across three platforms during the stated window. Identification of automated participants was triangulated from action-timing distributions, sit-out behaviour, post-flop tree symmetry, and post-event operator disclosure. The cohort considered here comprises 612 distinct accounts judged with high confidence to have been operated by software agents in at least 90% of their observed entries.

The buy-in band was restricted to $5–$100 to control for population skill heterogeneity at higher stakes, where field composition shifts substantially. Field sizes ranged from 27-entry satellites to 4,000+ entry guaranteed events. Reported figures are aggregated across platforms unless otherwise noted; per-platform breakdowns are available on request.

Scope note: Heads-up sit-and-go, knockout bounty formats, and progressive-knockout variants are excluded from the primary cohort. Their payout topology differs sufficiently to warrant separate treatment.

Findings

Three observations recur across the sample with sufficient consistency to merit presentation here. The first concerns timing-tell decay: as agent implementations matured across the window, action-latency distributions converged toward human envelopes, removing what had been the dominant detection signal in 2022. The second concerns aggregate ROI: agents in the mid-stakes band, on average, posted positive ROI in early-stage levels but gave back the majority of edge during late-stage play. The third concerns structural events — bubble, satellite payout cliff, final-table reshuffle — where agent decision quality degraded most sharply.

[Figure 1 placeholder. Stage axis: Early, Mid, Bubble, ITM, Final. ROI axis ticks: +8%, 0, −12%.]
Figure 1. Approximate per-stage ROI deviation from break-even for the agent cohort. Positive in early levels; negative from bubble onward.

Timing-tell decay

Early-cohort agents (2022–early 2023) exhibited action latencies confined to a characteristic band, typically a flat distribution between 0.4 and 1.2 seconds that varied negligibly with board complexity. Later cohorts (2024 onward) introduced board-conditioned jitter, intermittent long pauses on high-leverage decisions, and human-like fatigue drift across long sessions. The detection signal that had been reliable in earlier years is no longer present in the same form.
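The early-cohort signature described above can be illustrated with a minimal sketch. It assumes each action is logged as a latency in seconds paired with a numeric board-complexity score; that score, the function name, and the output keys are hypothetical and are not part of the logging described in this note. The sketch reports the fraction of latencies inside the 0.4 to 1.2 second band and the Pearson correlation between latency and complexity. A high in-band fraction with near-zero correlation would match the early-cohort pattern.

```python
import math

def latency_profile(latencies, complexity, lo=0.4, hi=1.2):
    """Summarise how tightly latencies sit in [lo, hi] seconds and how
    strongly they track a (hypothetical) board-complexity score."""
    if len(latencies) != len(complexity) or len(latencies) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(latencies)
    # Fraction of actions whose latency falls inside the band.
    in_band = sum(lo <= t <= hi for t in latencies) / n
    mean_t = sum(latencies) / n
    mean_c = sum(complexity) / n
    cov = sum((t - mean_t) * (c - mean_c) for t, c in zip(latencies, complexity))
    var_t = sum((t - mean_t) ** 2 for t in latencies)
    var_c = sum((c - mean_c) ** 2 for c in complexity)
    # Pearson correlation; defined as 0.0 here when either sample is constant.
    corr = cov / math.sqrt(var_t * var_c) if var_t > 0 and var_c > 0 else 0.0
    return {"in_band_fraction": in_band, "complexity_correlation": corr}

# Illustrative, made-up data: constant latency regardless of complexity.
profile = latency_profile([0.8, 0.8, 0.8, 0.8], [1, 2, 3, 4])
```

Any decision thresholds on these two numbers would need to be calibrated against labelled data; none are proposed here.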

ROI inversion across stages

The cohort posted positive net ROI in early-stage play (deep stacks, low blind pressure, large effective implied odds). That edge inverted from the bubble onward. The aggregate effect is a positive contribution from early play offset, and in many tournaments more than offset, by negative contribution from late play.
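For reference, the conventional tournament ROI definition can be sketched as follows. This note does not specify how ROI is attributed to individual stages, so the sketch covers only the overall figure: total payouts minus total entry cost, divided by total entry cost. The function name and the `(cost, payout)` tuple layout are assumptions made for illustration, and the sample entries are invented.

```python
def roi(entries):
    """Overall ROI for a list of (cost, payout) pairs, where cost is the
    full entry cost (buy-in plus fee) and payout is the prize received."""
    total_cost = sum(cost for cost, _ in entries)
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    total_payout = sum(payout for _, payout in entries)
    return (total_payout - total_cost) / total_cost

# Invented example: four $10 entries, one cash of $36.
example = roi([(10, 0), (10, 0), (10, 36), (10, 0)])  # (36 - 40) / 40 = -0.1
```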

Failure clusters at structural events

Decision quality degraded sharply at three specific points: the money bubble, the satellite seat-allocation boundary, and the final-table reshuffle. Each of these introduces a discontinuity in the payout-equity gradient that cash-derived heuristics handle poorly. The mechanisms are treated in the companion notes on ICM and bot decision policy and on late-stage dynamics.
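The companion notes are not reproduced here. As a generic illustration of why chip counts stop mapping linearly to money near these points, the sketch below implements the standard Malmuth-Harville form of ICM. Whether the companion notes use this exact model is an assumption. The function name and the example stacks and payouts are invented. The recursion enumerates finishing orders, so it is only practical for small fields such as a final table.

```python
def icm_equity(stacks, payouts):
    """Malmuth-Harville ICM: expected payout per player, assuming the chance
    of taking the next place is proportional to chips among remaining players."""
    equity = [0.0] * len(stacks)

    def recurse(remaining, place, prob):
        # Stop when all paid places are assigned or no players remain.
        if place >= len(payouts) or not remaining:
            return
        total = sum(stacks[i] for i in remaining)
        if total <= 0:
            return
        for i in remaining:
            p = prob * stacks[i] / total
            if p == 0.0:
                continue
            equity[i] += p * payouts[place]
            recurse([j for j in remaining if j != i], place + 1, p)

    recurse(list(range(len(stacks))), 0, 1.0)
    return equity

# Invented example: three players, payouts of 50/30/20 units.
equities = icm_equity([5000, 3000, 2000], [50, 30, 20])
```

In this example the largest stack holds 50% of the chips but roughly 38.4% of the prize pool, which is the kind of non-linearity a chip-EV (cash-style) objective ignores.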

Discussion

The principal interpretive claim is modest. Tournament play is not simply cash play with a clock; the payout structure rewrites the objective function in ways that propagate through every late-stage decision. In the present sample, agents trained or hand-tuned on cash-equivalent objectives built an edge in early-stage play and then surrendered it at predictable structural moments. Whether this gap closes in subsequent cohorts is an open empirical question.

A secondary observation: the convergence of timing distributions toward human envelopes means that behavioural detection in tournaments increasingly depends on aggregate decision-quality signatures rather than per-action latency. Detection methodology adapted accordingly during the latter half of the observation window [2].
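Note 2 names Kullback–Leibler divergence between agent and human action-frequency distributions as the basis for the later detection methodology. A minimal sketch of that quantity follows. Several details are assumptions rather than the method used in this note: the direction of the divergence (agent as P, human as Q), the additive smoothing used to avoid zero probabilities, the action labels, and the counts, which are invented.

```python
import math

def kl_divergence(p_counts, q_counts, smoothing=1.0):
    """D_KL(P || Q) in nats over the union of observed actions.
    Counts are smoothed additively so every action has non-zero probability."""
    if smoothing <= 0:
        raise ValueError("smoothing must be positive")
    actions = sorted(set(p_counts) | set(q_counts))
    if not actions:
        raise ValueError("no actions observed")
    p_total = sum(p_counts.get(a, 0) for a in actions) + smoothing * len(actions)
    q_total = sum(q_counts.get(a, 0) for a in actions) + smoothing * len(actions)
    kl = 0.0
    for a in actions:
        p = (p_counts.get(a, 0) + smoothing) / p_total
        q = (q_counts.get(a, 0) + smoothing) / q_total
        kl += p * math.log(p / q)
    return kl

# Invented counts for one river spot: agent (P) versus human reference (Q).
agent = {"bet": 90, "check": 10}
human = {"bet": 50, "check": 50}
divergence = kl_divergence(agent, human)
```

The divergence is zero when the two smoothed distributions coincide and grows as they separate. How values are aggregated across river spots and turned into a classification is not described in this note and is not attempted here.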

Limitations

The cohort is restricted to mid-stakes events on three platforms and to a four-year window. Generalisation beyond that scope is not warranted by the data presented. The classification of accounts as automated relies on a triangulation procedure with non-zero false-positive and false-negative rates; sensitivity analyses against alternative thresholds are reported in the supplementary log.

Notes

  1. Prior work on automated cash-game agents in the no-limit hold'em domain is summarised in the Computer Poker Research literature, 2015–2021.
  2. Detection methodology in the 2024–2026 sub-window relies on Kullback–Leibler divergence between agent and human empirical action-frequency distributions across canonical river spots.