# Statistical Significance In Sports Betting

What does statistical significance mean from a punting perspective?

If you’ve been involved with the prediction of future events, whether that’s the stock market, gambling or even in science, you’ve probably heard the term ‘statistically significant’ bandied around.

You’ve probably even come across a statement along the lines of “those results really don’t mean anything because the sample size isn’t statistically significant.”

So what exactly does that statement mean and how can we use it to improve our punting?

## Statistically Significant

In general terms, statistically significant means that there is ‘probably’ a real relationship between two things. More specifically, it means the pattern in the data is too strong to be comfortably put down to chance, at a chosen level of confidence.

For example, you could test the relationship between smoking and lung cancer. If the relationship was statistically significant, it would suggest that it is ‘probably true’, with a degree of confidence.

Academics like to use a 95% confidence level, for instance. Strictly speaking, this doesn’t mean there is a 95% chance that the relationship is real. It means that if there were no relationship between smoking and lung cancer at all, a result at least this strong would turn up by chance less than 5% of the time. They also look at p-values and have other assessments, but that’s a topic for another day.

## Why Does This Matter?

Let’s say that I felt I could pick which AFL team was going to win on the weekend by determining how far they travelled the week before to get to the game. My hypothesis was that a team that didn’t have to travel very far would have a better chance of winning than one that had to travel interstate.

If I run my hypothesis on one round of footy and I manage to pick 8 out of 9 winners, have I just got lucky or have I in fact found the holy grail of sports betting?

Most of us would suggest that it was nothing more than luck, and you’d be correct in thinking that way. There simply aren’t enough data points to come to a statistically significant conclusion that the hypothesis is true. One round is a tiny sample, and a punter with no edge at all will still fluke a round like that every so often.
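To put a rough number on "luck", here is a minimal sketch of how often a tipper with no skill would pick at least 8 of 9 winners. It assumes every game is a pure coin flip (which, if anything, undersells how often favourites get up) and a 23-round season. Both are assumptions made for illustration, and the function name is made up.

```python
from math import comb


def prob_at_least(wins, games, p_win=0.5):
    """Chance of picking at least `wins` winners out of `games` by luck alone,
    assuming each pick is right with probability `p_win` (coin flip by default)."""
    return sum(
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(wins, games + 1)
    )


# One round of 9 games, as in the example above.
one_round = prob_at_least(8, 9)

# Assumption for illustration: 23 rounds in a season.
rounds_per_season = 23
at_least_once = 1 - (1 - one_round) ** rounds_per_season

print(f"8+ of 9 in one given round: {one_round:.1%}")
print(f"8+ of 9 at least once in {rounds_per_season} rounds: {at_least_once:.1%}")
```

Under those assumptions the single-round figure is 10/512, a little under 2%, but over a whole season a no-skill tipper has a bit better than a one-in-three chance of landing at least one round that good. So one hot weekend tells you very little.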

## Where Do We Get An Edge?

If we are betting on the line in AFL then theoretically there is a 50/50 chance of us being right. In sports betting, though, picking more winners than losers isn’t enough on its own. We also have to overcome the bookie’s margin that is built into the odds.

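For example, if we want to place a line bet then we might be getting odds of 1.90. So we’re in fact receiving \$0.90 profit on a \$1.00 bet for an event that has a 50/50 chance of occurring. Win \$0.90 half the time and lose \$1.00 the other half, and on average we drop 5 cents for every dollar bet, which is clearly a losing proposition.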

So if we had a betting system, we would need to be winning about 52.6% of the time (1 divided by 1.90) just to break even. For us to be long-term winners, our system would probably have to be right 55% of the time at those types of odds. If we were right 60% of the time then we would have a pretty good system on our hands.
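The break-even arithmetic can be sketched in a few lines. This assumes decimal odds and flat \$1.00 stakes, and the function names are invented for illustration.

```python
def break_even_rate(decimal_odds):
    """Win rate at which a flat-stake punter neither wins nor loses."""
    return 1.0 / decimal_odds


def profit_per_dollar(win_rate, decimal_odds):
    """Average profit per $1.00 staked: collect `decimal_odds` when we win,
    and we are always out the $1.00 stake."""
    return win_rate * decimal_odds - 1.0


odds = 1.90
print(f"Break-even win rate at {odds}: {break_even_rate(odds):.2%}")
for rate in (0.50, 0.53, 0.55, 0.60):
    print(f"Win rate {rate:.0%}: {profit_per_dollar(rate, odds):+.3f} per $1 bet")
```

At a 50% win rate the average comes out at a loss of 5 cents per dollar, and it only turns positive once the win rate clears 1/1.90.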

## Statistical Significance & Sample Size

Again, generally speaking, the smaller the edge, the larger the sample size you’re going to need to prove statistically that there is in fact an edge at all.

Conversely, if the edge is large in terms of the percentage of winners, then you only require a smaller sample to determine how effective your betting system is.
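One way to see how quickly the required sample grows as the edge shrinks is the textbook normal-approximation formula for a one-sided test of a single proportion. The sketch below is only that, a sketch: the 5% significance level and 80% power are assumptions picked for illustration, and `bets_needed` is a made-up name.

```python
from math import ceil, sqrt
from statistics import NormalDist


def bets_needed(true_rate, baseline, alpha=0.05, power=0.80):
    """Approximate number of bets needed to show that a system winning at
    `true_rate` really beats `baseline` (one-sided test, normal approximation).
    `alpha` and `power` are illustrative assumptions, not fixed rules."""
    if true_rate <= baseline:
        raise ValueError("true_rate must be above baseline")
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_power = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(baseline * (1 - baseline))
        + z_power * sqrt(true_rate * (1 - true_rate))
    )
    return ceil((numerator / (true_rate - baseline)) ** 2)


break_even = 1 / 1.90  # break-even win rate at odds of 1.90
for rate in (0.60, 0.55):
    print(
        f"{rate:.0%} system: {bets_needed(rate, 0.50)} bets to beat a coin flip, "
        f"{bets_needed(rate, break_even)} bets to beat break-even"
    )
```

Because the edge sits in the denominator and gets squared, halving the edge roughly quadruples the number of bets required.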

Going back to our AFL line example, suppose our system of using distance travelled to pick winners was actually effective and came in at 60%. Using napkin math, I would want to see roughly a season’s worth of bets, say around 200, before I thought there was a degree of statistical significance.

If our system was more along the lines of being right 55% of the time, then we would in fact need multiple seasons and potentially 1000-2000 data points or more before I believed it might be statistically significant.

With anything less than that, there is still a real chance that the results are nothing more than random noise.
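To check a particular record against random noise, one option is an exact one-sided binomial tail: the chance that a system with no real edge would do at least this well. The sketch below assumes independent bets all placed at odds of 1.90, and `binom_tail` is a made-up helper. It works in log space so that a few thousand bets don’t overflow.

```python
from math import exp, lgamma, log


def binom_tail(wins, bets, baseline):
    """P(at least `wins` winners in `bets` bets) if the true win rate is
    `baseline` (must be strictly between 0 and 1)."""

    def log_pmf(k):
        # log of C(bets, k) * baseline^k * (1 - baseline)^(bets - k)
        return (
            lgamma(bets + 1) - lgamma(k + 1) - lgamma(bets - k + 1)
            + k * log(baseline)
            + (bets - k) * log(1 - baseline)
        )

    return sum(exp(log_pmf(k)) for k in range(wins, bets + 1))


break_even = 1 / 1.90  # assumption: every bet is at odds of 1.90

# Records matching the win rates discussed above.
for wins, bets in ((120, 200), (550, 1000), (1100, 2000)):
    p = binom_tail(wins, bets, break_even)
    print(f"{wins}/{bets} ({wins / bets:.0%}): chance a break-even system does this well = {p:.3f}")
```

A small number here means luck alone is an unlikely explanation. A number above the usual 0.05 cut-off means the record could still comfortably be noise.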

Many average punters are stunned by how large some sample sizes need to be to be considered statistically significant.