I have a database containing information on every game (including every goal) in the NLL this year. I’ve been looking at goal totals, assist totals, and weird combinations like “which quarter has seen the most power play goals scored?” (Answer: the 4th overall, though Minnesota and Philly like the 2nd better). I then started to look at the attendance numbers to figure out which teams fans in other cities most want to see.
Background (you can skip this section if you’re not interested in where this database came from):
As part of my work writing the Money Ballers columns on ILIndoor.com, I wrote a program (a Python script, if you care) to download and read the game sheets for each game from NLL.com and calculate the Money Ballers points for each player. This was a lot faster and easier than the error-prone manual way I did it for the first week or two. I then realized that I could modify my program slightly to record a bunch of information about each game in a database. I’ve been working for Sybase, writing software for a SQL database engine, for almost fifteen years, so I kind of know what I’m doing there.
I changed my script to store information on each game (home/away teams, winner/loser, final score, attendance) and each goal (scorer, first and second assist, what quarter, what time within the quarter, and whether it was power play, shorthanded, game-tying, go-ahead, game-winning, empty net, or penalty shot). I will hopefully be adding other stats like shots on goal, loose balls, penalties, etc. in the future. This is all stored in a Sybase SQL Anywhere database, and I have stored procedures written in SQL to serve up web pages displaying and aggregating these stats in various ways. And no, these web pages are not available on the internet, only on my laptop. Sorry.
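For the curious, here is a rough Python sketch of the kind of records described above. This is not my actual schema (that lives in SQL Anywhere tables); the class names, field names, and the sample values at the bottom are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Game:
    # Field names are hypothetical; the real data lives in SQL tables.
    home_team: str
    away_team: str
    home_score: int
    away_score: int
    attendance: int

    @property
    def winner(self) -> str:
        # NLL games do not end in ties, so one score is always higher.
        return self.home_team if self.home_score > self.away_score else self.away_team


@dataclass
class Goal:
    scorer: str
    first_assist: Optional[str]   # None if the goal was unassisted
    second_assist: Optional[str]
    quarter: int
    time_in_quarter: str          # e.g. "7:42", kept as text in this sketch
    power_play: bool = False
    shorthanded: bool = False
    game_tying: bool = False
    go_ahead: bool = False
    game_winning: bool = False
    empty_net: bool = False
    penalty_shot: bool = False


# Placeholder values only, not a real game.
sample_game = Game("Team A", "Team B", home_score=10, away_score=12, attendance=5000)
sample_goal = Goal("Player X", "Player Y", None, quarter=4, time_in_quarter="7:42", power_play=True)
print(sample_game.winner)
```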
My page on attendance originally contained the following numbers for each team:
- Total attendance (Colorado is the only team over 100,000 this season)
- Average attendance (Colorado leading, 12,614)
- Home average (Buffalo leading, 15,318)
- # home games
- Away average (Colorado leading, 10,602)
- # away games
The away average seemed like the best way to figure out who the biggest draw was, but I quickly realized that this number was heavily skewed by the cities that a particular team had visited. For example, say you wanted to decide which would increase attendance more for any given team: hosting Toronto or hosting Rochester. If each has played two away games but Toronto has played in Colorado and Buffalo, while Rochester has played in Washington and Minnesota, then comparing their away averages is meaningless, because Colorado and Buffalo get far higher attendance numbers than Washington or Minnesota regardless of who’s visiting. In that scenario, the fact that Toronto’s away average is in the 15,000 area while Rochester’s is maybe 6,000 doesn’t tell us anything. We need a way to factor out the differences in average attendance from arena to arena.
To do this, I took the attendance for a particular game and subtracted the average home attendance for the home team. For example, Calgary’s average home attendance so far this season is 8,122. When Colorado played in Calgary, the attendance for that game was 9,341. This means that, all other things being equal, Colorado increased the attendance by 1,219. I call this difference the draw, so Colorado’s draw for that game was 1,219. If the attendance for a particular game is less than the average for that arena, then the draw for that game is negative. Doing this for all of Colorado’s away games and taking the average gives you the average draw. For Colorado this number is 920. This means that regardless of what arena you are talking about, having Colorado in town increases the attendance by about 920 people.
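The draw calculation can be sketched in a few lines of Python. My real version is a SQL stored procedure, so treat this as an illustration only: the function names and the shape of the inputs are assumptions, and the only numbers used are the Calgary ones quoted above.

```python
from statistics import mean


def game_draw(attendance, home_average):
    """Draw for one game: attendance minus the home team's average home attendance."""
    return attendance - home_average


def average_draw(away_games, home_averages):
    """Average draw for one visiting team.

    away_games: list of (home_team, attendance) pairs, one per away game.
    home_averages: dict mapping home team -> average home attendance.
    Both structures are hypothetical stand-ins for the database tables.
    """
    return mean(game_draw(att, home_averages[home]) for home, att in away_games)


# The one game quoted in the text: Colorado visiting Calgary.
home_averages = {"Calgary": 8122}
print(game_draw(9341, home_averages["Calgary"]))  # 1219
```

To get a team’s average draw, you would pass every one of its away games to `average_draw`, along with the home averages for every arena it visited.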
This scheme factors out the huge differences in attendance averages at the various arenas, making the numbers more easily comparable. But they’re still not going to be 100% accurate, for a number of reasons. First, we’re talking about a very small sample size: only a handful of games per team. Second, attendance can be affected by many other factors, including which night of the week the game is, the time of the game, the weather in the city at the time, how the team is doing (if the team is losing, some people may not bother going to the game regardless of who the visitor is), how the visiting team is doing (more people might have come out to see Washington last year than this year) and even how other sports teams in the city are doing (a Mammoth game in Denver on Super Bowl weekend will not draw as well if the Broncos are playing as it would if they are not).
There is no way to adjust the numbers to account for all of these factors, but here are the numbers for the draw for each team.
|Team||Away Games||Average Draw|
We can see that the Buffalo Bandits are the biggest draw in the league, bumping attendance by an average of 961 people per game. Colorado is next at 920. The Roughnecks and the Rock don’t make much of a difference, while having the Swarm in town reduces attendance by 935 per game. Again, because of the small sample sizes it’s hard to draw meaningful conclusions, so perhaps we’ll wait until the end of the season and see what the numbers look like then. I plan on expanding my database to include games and stats from previous seasons, though it looks like I won’t be able to get the detailed goal information since the older game sheets don’t contain it. The attendance is there, so I may be able to get data for the past 2-3 (or even more!) seasons.
This article is the first in a short series about attendance. Next time: why my conclusion above is wrong and how to fix it.