Monday, May 23 at 9:47 PM ET
In fantasy baseball yesterday, my lineup went 6-for-40 (.150 avg) with nine strikeouts, just 13 total bases and one caught stealing (offensive Ks and net stolen bases are categories). Hiroki Kuroda was also my only starting pitcher going yesterday (ouch - 5 2/3 innings, four earned runs, just three strikeouts and the loss). Despite sitting among the leaders for much of the first few weeks, after yesterday's debacle, I'm currently in last place. However, my team does not have anyone in the lineup with a career OPS less than .800 (.859 is the team average) and only one pitcher on my current roster has a career WHIP above 1.3 (team average is 1.24 - all of my pitchers have a career K:BB greater than 2.0). Last year at this time, I was in last place with a similar roster. I finished third out of 14 teams.
For this week's blog, I had intended to publish an explanation of the "log5 normalization" techniques that are used in each of our simulation engines today, but, after an exceptionally, unprecedentedly (i.e. unprecedented for the history of this company in any sport) difficult week for the baseball picks, I should address that now and table the intro course to simulation and normalization.
Our picks may have actually had a worse day than my fantasy team yesterday. In fantasy, it's fairly easy to look at the roster and trust that the players will get better and healthier and the team will improve. It's much more difficult to do that with picks because of the uncertainty with and the necessity to trust and commit to information each day. Plus, there is a far more tangible impact on bankroll that everyone feels. Whether it's fantasy sports or sports wagering, it's always frustrating to lose. And while I trust that everything will right itself - that the first six-plus weeks of excellent information were not a mirage - there are a few specific things that we are addressing (contrary to the belief of some, especially since we are new to daily baseball, I am constantly grinding through as much research and testing as I can to provide the best product possible).
I wish we could just eliminate Interleague play from our records, but then of course, anyone who played our money-line picks this weekend probably wishes the same. Unfortunately, it does not work that way. Yesterday, our playable money-line plays went 3-9. All 12 of those picks were on National League teams. Yes, it's perfectly feasible that this could be a small sample-size fluke (we've had days of almost all home team picks, just to pick almost all road teams the next day), but I think we all know enough about the game to realize that doesn't feel right. In total, 27 of our 36 playable money-line Interleague plays (75%) were on the NL team. Either we know something the market does not know or it's the other way around. Based on our performance, the latter seems more likely.
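For a rough sense of how lopsided that NL lean is, here is a minimal sketch that treats each playable pick as an independent coin flip between the two leagues. That 50/50, independent setup is purely an assumption for illustration (our picks are driven by the engine and the lines, not by coin flips), and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Probability of at least k successes in n independent trials.

    Assumption: every trial has the same success probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Figures from the post: 27 of 36 weekend plays and 12 of 12 plays yesterday
# were on NL teams.
nl_share = 27 / 36
p_weekend = binomial_upper_tail(36, 27)
p_yesterday = binomial_upper_tail(12, 12)

print(f"NL share of weekend plays: {nl_share:.0%}")
print(f"Chance of 27+ NL picks in 36 coin flips: {p_weekend:.4f}")
print(f"Chance of 12 NL picks in 12 coin flips: {p_yesterday:.6f}")
```

Under those assumptions both probabilities come out well below 1%, which is consistent with reading the lean as something systematic in the numbers rather than a fluke.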
Whenever players and teams that do not usually face each other meet, there will always be a little more uncertainty and error in our numbers. What you have come to expect of us and what we have come to expect of ourselves is that we handle these situations better than the competition/market/public/books. I have a hard time saying that is the case in this situation. We were clearly surprised by the general dominance of the AL (don't let the 21-21 overall record fool you, the AL easily out-scored the NL) because it happened in a very unexpected way. Most think of the AL as the better hitting league (not just because of the DH, but obviously that inflates overall numbers), yet it was the AL pitchers that far outshone their opponents.
On the season (even including this weekend), the AL has only out-scored the NL by .041 runs per nine innings (a difference of less than 1% - though part of this, as I speculated last year, has to be unseasonably cold and wet weather as well). Last year, the difference for the entire year was .122 runs per nine innings, which is a 3% difference. In 2009, the difference between scoring in the AL and NL was .28 runs per nine innings, a 6% difference. Scoring itself is down 11% in that short time span. It is down almost 15% from seven years ago, when the difference between AL and NL scoring was more than a third of a run per nine innings. New ballparks actually work against this theory - Minnesota and Washington are neutral, Yankee Stadium favors hitters and Citi Field and Busch Stadium favor pitchers.
As the power outage continues - home runs, average and walks are down, while strikeouts are up - the explanation appears to be more related to the improvements of pitchers (and their propensity to throw cut fastballs and changeups) than to hitters. The National League scores the same per game as the American League despite effectively playing with eight hitters and an out instead of nine hitters. So when the NL plays Interleague games, the expectation would be that it should do well (much better than in the past) because its offense should get better in AL parks, while the AL lineups get worse in NL parks. Save for what we saw happen to Brad Penny and Jason Berken on Friday, that largely was not the case over the weekend.
James Shields, Tyler Chatwood, Rick Porcello, Matt Harrison and others turned in career-best performances for AL teams. Should we have seen that coming to that degree? No. Career-best outings will rarely be the expectation. Have we re-evaluated the engine to better account for Interleague data? Definitely. Even scoring between the American and National Leagues, despite the AL's extra hitter, does not mean that the National League has better hitters. It appears as though - as this weekend and our new research have shown - the American League has slightly better hitters and notably better pitchers.
It's a fascinating turn of events. One that unfortunately caught us off-guard and one that I would not expect to hamper us again. The next round of Interleague play starts in mid-June. We spent a lot of time beforehand making sure we were ready, but we weren't necessarily looking at the right things. This time, we're more confident that we will be ready.
Beating the Books
Clearly the goal of this industry is to always be as far out ahead of the books/public as possible with our manipulation and leveraging of data. In reality, that does not mean that the distance between our abilities and theirs is always the same. In fact, it's cyclical. As we have discussed at length during football and even earlier this baseball season, it is the nature of our numbers to do well early, slow down as the market catches up and then improve again as the market makes up its mind on teams and players and our information adapts. Given the difference between our great early season success and our recent poor performance, I think it is fair to assume that our manipulation of the weighting of data throughout the season so far was not ideal. In the same time that the books/public were catching up, we may have helped them to a degree. In fact, we may have had a better representation of teams before the season than we did after a month.
If I had it to do all over again, I probably wouldn't allow current season performance to be weighted nearly as heavily as it was for the latter half of the first month and the first half of this month (on a related note, it's eerie how closely my fantasy baseball performance over the last two seasons and our pick performance have tracked each other - I don't think that's just a coincidence). We spent quite a bit of time developing, evaluating and re-evaluating strategies looking for optimal data input weighting. We had it where we wanted it two weeks ago when we had everything (money-lines, O/U, run-lines, normal+, "half-bets," etc.) healthily in the black, experienced a minor hiccup early last week when we tried to make some adjustments for totals given the nature of this pitching-dominant season, got everything to a point where I was as confident with our baseball product as ever (which I noted last week) and then we ran into Interleague play. Obviously, after any stretch like the one we are coming out of, our confidence will be shaken, but I think we need to resist the urge to change anything that is not related to Interleague play (which affected ML, O/U and RL plays because of the way in which we missed) and stick with what we were confident in a week ago... Hopefully, the picks and my fantasy team catch fire at the same time (I'm specifically talking to you, Hanley Ramirez).
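To make the weighting question concrete, here is a minimal sketch of one common way to blend current-season numbers with a preseason expectation, where the weight on current data grows with the number of games played. This is a generic shrinkage formula offered for illustration, not a description of what our engine actually does; the function names, the `prior_strength` parameter and every number in the example are hypothetical.

```python
def blend_weight(current_games: int, prior_strength: float) -> float:
    """Weight placed on current-season data.

    prior_strength (hypothetical parameter) is the number of games at which
    current-season data and the preseason expectation count equally.
    """
    return current_games / (current_games + prior_strength)


def blended_estimate(
    current: float, prior: float, current_games: int, prior_strength: float
) -> float:
    """Weighted average of the current-season rate and the preseason expectation."""
    w = blend_weight(current_games, prior_strength)
    return w * current + (1 - w) * prior


# Illustrative numbers only: a team expected to score 4.5 runs per game
# that has scored 3.9 per game through 45 games.
for strength in (50, 150):
    estimate = blended_estimate(3.9, 4.5, current_games=45, prior_strength=strength)
    print(f"prior_strength={strength}: estimate {estimate:.2f} runs per game")
```

A larger `prior_strength` keeps the estimate closer to the preseason expectation for longer, which is the direction I would lean if I had the first six weeks to do over.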
Even with the struggles of last week, these plays have still been profitable on the whole in baseball: Normal+ (all), Normal+ (ML), Normal+ (O/U), All Playable (O/U) and Half-bet+ (O/U). Fortunately, we have also still hit over 50% in every category. That's not the perfect benchmark in baseball (profitability is all that ultimately matters), yet 50%+, coupled with our strong early season success leads me to trust that this engine has a clue and, with the right data, is capable of big things.
There have been many questions about my reference to "half-bets" over the last couple of weeks. For now, clicking on "Calc" will identify the suggested play for the normal play. If the $50 default is not altered, anything greater than $25 would count as what we have noted as a "half-bet." The point of what I have tried to do there is to find 3-4 plays per day per type of play, without exceeding about ten plays a day, that are some of our most confident/strongest plays. We are working on ways to add this info to tables or info boxes and/or change the color-coding for this type of play so that it is easier to find. This likely will not be introduced, though, until our next sport (football) rolls out several changes.
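The "half-bet" rule described above can be sketched as a simple check. The $50 default and the $25 cutoff come straight from the site; treating the cutoff as exactly half of whatever normal play amount is set is my assumption for other amounts, and the names below are hypothetical.

```python
DEFAULT_NORMAL_PLAY = 50.0  # the site's default normal play amount, in dollars


def is_half_bet(suggested_play: float, normal_play: float = DEFAULT_NORMAL_PLAY) -> bool:
    """True when the suggested play is more than half of the normal play.

    Assumption: the cutoff scales with normal_play when the default is changed.
    """
    return suggested_play > normal_play / 2


print(is_half_bet(26))  # with the $50 default, anything above $25 qualifies
print(is_half_bet(25))  # exactly $25 does not exceed the cutoff
```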
We never remove past articles and we always publish our past content for all to see after the day has completed. However, in the process of some of the testing noted above, we accidentally overwrote article data from May 17 and May 18 and we can't really get that back in the same format. I have the data (not two of our better days). If you would like to see any of that information, please let me know. Sorry for any confusion or inconvenience.
Baseball Info Box
Speaking of baseball info, we added an information box (click on "More" in the ML table) that includes the starting pitchers used for that game.
As usual, if you have any of your own suggestions about how to improve the site, please do not hesitate to contact us at any time. We respond to every support contact as quickly as we can (usually within a few hours) and are very amenable to suggestions. I firmly believe that open communication with our customers and user feedback is the best way for us to grow and provide the types of products that will maximize the experience for all. Thank you in advance for your suggestions, comments and questions.