Quantifying Variance is a new biweekly column in which we’ll take a look at some of the math underlying poker, with the goal of understanding just how probable or improbable various occurrences actually are, and how to tell the difference between what is random and what is not.
Variance is an unavoidable part of any game or sport, and overall a good thing: there is, after all, little point to a competitive activity when the winner is known in advance. Variance stems from two sources: from the game itself – let’s call that ludic variance – and from the players – we’ll call that agential variance.
To understand the difference, consider a zero-chance, perfect-information game like chess. There is clearly no ludic variance in such a game – the same sequence of moves will always produce the same result. On the other hand, the unreliable performance of the human mind means that in practice, the weaker of two closely ranked players will still win some of the time. Any given player may commit a horrible blunder at one moment and enjoy a stroke of brilliance the next, and these fluctuations in effective skill are unpredictable: this is what I mean by agential variance.
There’s a fundamental difference between the two forms of variance in that ludic variance is always short-term: although streaks can happen, the game is reset each time the cards are shuffled, and missing one flush draw or all-in race doesn’t make you any more or less likely to hit the next one. Conversely, agential variance can be short- or long-term. Sometimes we just make a silly mistake and get over it, or go with our gut and make a brilliant call. But a series of good or bad results – whether due to ludic or short-term agential variance – can affect one’s state of mind and subsequent play, leading to long-term agential variance which compounds the effects of the other two types.
Poker has all of these forms of variance, and in great abundance. There’s the deal of the cards, of course, but especially in No-Limit and Pot-Limit games, close decisions in large pots can often outweigh any profits or losses leading up to that point. On top of that, there’s the emotional aspect of the game: that is, whether one is on one’s A-game or tilting at any given moment.
Starting with heads-up
Teasing apart the different forms of variance can be difficult, especially in a game like poker where they all factor into almost every hand one way or another. In fact, it’s often hard enough just to separate the actual skill component of the game from the variance, let alone distinguish which forms of variance are at play. In trying to quantify these things, then, we want to look for the simplest possible thing to analyze as a starting point.
Most formats of poker offer a spectrum of payouts, whether that’s a continuous spectrum like in a cash game, or the more granular payout structure of a tournament. The exception is winner-take-all tournaments, and the most common form of those is the heads-up sit-and-go, in which one player doubles up (minus the rake) and the other player loses his full buy-in.
This binary payout structure is extremely convenient for us, as it means that there are only two types of result to consider, and thus only two types of streaks we’re looking for: winning streaks and losing streaks.
Streaks in coin flips
Stripping things all the way down, we can start by looking at a series of games played by two identical AI bots. Since the bots’ play is identical and they don’t experience the effects of human emotion, the results of this series of matches should be indistinguishable from a series of random coin flips. Since the bots are winning and losing with equal likelihood, we would expect that their winning and losing streaks should also be equal on average.
Furthermore, because the odds of any given streak decline exponentially with the length of the streak – the odds of winning or losing N consecutive coin flips are 1 in 2^N – we would expect the relationship between the length of the longest expected streak and the number of games played to be logarithmic. Sure enough, every tenfold increase in sample size increases the length of the longest expected winning (or losing) streak by about 3.3 (the base-2 logarithm of 10 is roughly 3.32), as shown in the graph below.
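For readers who would like to check the coin-flip baseline themselves, here is a minimal simulation sketch. It is not the code used to produce this column's figures; the function names, the number of trials and the fixed seed are all illustrative assumptions. It plays many series of 50/50 games and averages the longest winning streak seen in each series.

```python
import random

def longest_streak(results, target):
    """Length of the longest unbroken run of `target` in a sequence of results."""
    best = current = 0
    for r in results:
        current = current + 1 if r == target else 0
        best = max(best, current)
    return best

def average_longest_streak(n_games, p_win=0.5, trials=300, seed=0):
    """Monte Carlo estimate of the mean longest winning streak over n_games.

    `trials` and `seed` are arbitrary choices made for this sketch.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # True means a win, False a loss; each game is independent.
        games = [rng.random() < p_win for _ in range(n_games)]
        total += longest_streak(games, True)
    return total / trials

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, round(average_longest_streak(n), 1))
```

Since wins and losses are equally likely here, the same averages apply to losing streaks. Each tenfold jump in the number of games should add a little over three to the average, up to simulation noise.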
A weighted coin
So, what happens when we have a winning player instead? We’re still ignoring emotional effects, but let’s say we have two different AI bots, one of which is better than the other, such that it wins, say, 55% of the time. This is similar to having a weighted coin. Obviously, we would expect the winning streaks to get longer and the losing streaks to get shorter, while the overall long-term trend remains logarithmic.
On the graph here, the light and dark blue lines represent average winning streaks at a 55% and 60% win rate respectively, while the red lines represent the corresponding losing streaks. The pale grey is the baseline we established for 50/50 flips.
The effect is quite dramatic: at a 60% win rate, for instance, we would expect to see a streak of ten wins happen within the first few hundred games, but it would probably take well over ten thousand games to see a similarly long losing streak.
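The weighted-coin case can also be estimated without simulation. The sketch below uses a standard large-sample approximation for the expected longest run of an outcome with per-game probability p over n games: log base 1/p of n(1 - p), plus a small correction involving the Euler-Mascheroni constant. This is an assumption on my part about a convenient formula, not necessarily how the graph was produced, and the function name is purely illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def expected_longest_streak(n_games, p):
    """Approximate mean longest run of an outcome that occurs with probability p.

    Large-sample approximation; it becomes unreliable for very small n_games.
    """
    if not 0 < p < 1 or n_games < 1:
        raise ValueError("need 0 < p < 1 and at least one game")
    rate = math.log(1 / p)  # how quickly long streaks become unlikely
    return math.log(n_games * (1 - p)) / rate + EULER_GAMMA / rate - 0.5

for win_rate in (0.50, 0.55, 0.60):
    for n in (100, 1000, 10000, 100000):
        wins = expected_longest_streak(n, win_rate)
        losses = expected_longest_streak(n, 1 - win_rate)  # losing streaks use 1 - p
        print(f"{win_rate:.0%} over {n} games: {wins:.1f} W / {losses:.1f} L")
```

The losing-streak figure comes from the same formula with the loss probability plugged in, which is why a better win rate stretches one kind of streak while shrinking the other.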
But what about tilt?
Now that we know what to expect in terms of streaks when emotion is not a factor, it should be easy enough to tell how consistent or not a given player’s performance is. If their longest streaks are in keeping with statistical expectation, then we would say that they have their emotions well under control and are playing consistently. On the other hand, if either their longest winning or losing streak is significantly longer than expectation, and especially if both are, then that would tend to suggest that there is some form of tilt at play or, as we’re calling it, long-term agential variance.
Next week, we’ll take a look at modelling tilt more exactly, and see if we can find evidence for it “in the wild.” In the meantime, here is a table which gives the longest expected winning and losing streak for various win rates and sample sizes. If you’re a heads-up player and have access to your own stats, you can take a look and see just how much of a problem tilt is or isn’t for you. For instance, if you’ve played 1000 games and won 560 (56%), but have a winning streak of 13 and a losing streak of 11, chances are your emotions are affecting your play.
Win rate | 100 games | 1000 games | 10,000 games | 100,000 games |
50% | 6 W/6 L | 9.5 W/9.5 L | 12.5 W/12.5 L | 16 W/16 L |
52% | 6.5 W/5.5 L | 10 W/9 L | 13.5 W/12 L | 17 W/15 L |
54% | 7 W/5.5 L | 10.5 W/8.5 L | 14 W/11.5 L | 18 W/14.5 L |
56% | 7 W/5 L | 11 W/8 L | 15 W/10.5 L | 19 W/13.5 L |
58% | 7.5 W/5 L | 11.5 W/7.5 L | 16 W/10 L | 20 W/13 L |
60% | 8 W/4.5 L | 12.5 W/7 L | 17 W/9.5 L | 21.5 W/12 L |
Notes: For win rates below 50%, the expected streaks are the same, but with wins and losses reversed. All values have been rounded off to the nearest half: in the case of a half number like 12.5, a streak of either 12 or 13 would be normal. In the case of a whole number, a streak one above or below the number given would likewise be well within expectation.
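If you keep your results in a file, a check along these lines can be automated. The sketch below assumes your history is a string of 'W' and 'L' characters in the order the games were played, which is a hypothetical format. It compares your longest streaks against a standard large-sample approximation rather than the table itself, and its default `slack` of one game simply borrows the rule of thumb from the notes above. Treat a flag as a prompt to look closer, not as a statistical test.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def longest_streak(results, target):
    """Length of the longest unbroken run of `target` in `results`."""
    best = current = 0
    for r in results:
        current = current + 1 if r == target else 0
        best = max(best, current)
    return best

def expected_longest_streak(n_games, p):
    """Large-sample approximation of the mean longest run (an assumed formula)."""
    rate = math.log(1 / p)
    return math.log(n_games * (1 - p)) / rate + EULER_GAMMA / rate - 0.5

def streak_report(results, slack=1.0):
    """Compare observed longest streaks with expectation.

    Returns {"W": (observed, expected, flagged), "L": (...)}.
    `slack` is how far above expectation a streak may run before it is flagged.
    """
    n = len(results)
    wins = results.count("W")
    if wins == 0 or wins == n:
        raise ValueError("need at least one win and one loss")
    p = wins / n  # observed win rate stands in for the true win rate
    report = {}
    for label, prob in (("W", p), ("L", 1 - p)):
        observed = longest_streak(results, label)
        expected = expected_longest_streak(n, prob)
        report[label] = (observed, round(expected, 1), observed > expected + slack)
    return report

# A made-up history matching the example above: 1000 games, 560 wins,
# with a longest winning streak of 13 and a longest losing streak of 11.
history = "W" * 13 + "L" * 11 + "WWL" * 118 + "WL" * 311
print(streak_report(history))
```

For that history the expected values should land close to the 56% row of the table, and both streaks are flagged as longer than expected.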
Alex Weldon (@benefactumgames) is a freelance writer, game designer and semipro poker player from Montreal, Quebec, Canada.