
Purdue vs. Michigan predictions...

Thank you a LOT! Not sure how sigma came to be 11. Interesting that a normal curve was used with 11 as sigma. I don't know how this came to be: "adjusted efficiency margin for each team (AdjEM), and the adjusted tempo for each team (AdjT). AdjEM measures the point differential expected if a team were to play the "average" team over 100 possessions. Adjusted tempo is the average number of possessions by a team over a 40 minute game. I then calculate the point differential expected when team A plays team B over 40 minutes." Perhaps if I go to Kenpom, it will tell me how they "adjust," since the tempo will be dictated by the better team that they are trying to determine. I always think this stuff is interesting. It also has an "average" team possessions figure, and I assume this is the average for all D1 teams...which with a shot clock would be MUCH more accurate than before the clock. Lot of questions, but I really do like this stuff. Winning by 20 points would no doubt be less than 2 sigma, since the loser is close to U but less than U and the winner is greater than U, and both are in the heart of the higher frequency. Thank you, but to understand it I need more time, and I'm curious how or what they use to adjust things. The shot clock does decrease the potential possession differences between teams and makes this more usable today than when coaches could coach
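For anyone following along, here is one way to read the quoted description as arithmetic. It is only a sketch: the exact formula is not given in this thread, so the averaging of the two tempos, the function name `expected_margin`, and the ratings plugged in below are all assumptions made for illustration.

```python
def expected_margin(adj_em_a, adj_em_b, adj_t_a, adj_t_b, home_edge=0.0):
    """Expected 40-minute point differential for team A over team B.

    Assumption: the game is played at the simple average of the two
    adjusted tempos. home_edge is an optional home-court bump.
    """
    possessions = (adj_t_a + adj_t_b) / 2.0
    # AdjEM is quoted per 100 possessions, so scale it to the game length.
    return (adj_em_a - adj_em_b) * possessions / 100.0 + home_edge


# Made-up ratings, not real team numbers.
margin = expected_margin(adj_em_a=30.0, adj_em_b=20.0, adj_t_a=68.0, adj_t_b=66.0)
print(round(margin, 2))
```

A 10-point AdjEM gap shrinks to fewer than 10 points on the scoreboard here because the assumed game has fewer than 100 possessions.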
I was an English Major -- I thought your "U" meant me!!!

For further edification, GOOGLE and Reddit are your friends.
 
At a quick glance, I'm guessing the point differential divided by 11 (std dev) would tell me how much area under the curve is added to the winner. If a team were predicted to win by 2.4, then the probability would be half the curve (50%) PLUS the area under the curve between the mean and 2.4/11, or .218 std devs, from the mean. The table gives .4168 and .4129 for z = -.21 and -.22, respectively.

https://www.google.com/imgres?imgur...ved=0ahUKEwjhwL-dkMzYAhUuRN8KHWceBhwQ9QEILTAA


Therefore .218 is roughly .4136? .5 minus .4136 = .0864, and .5 (the area on one side of the mean) plus .0864 (the area between 0 and the 2.4-point win) suggests the winner has a probability of .5864, or roughly a 59% chance.
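The table lookup above can be checked without a z-table. This is a minimal sketch, assuming (as the thread does) that the actual margin is normally distributed around the predicted margin with a std dev of 11; the function names are just for illustration.

```python
from statistics import NormalDist

SIGMA = 11.0  # std dev of the final margin, taken from the thread


def win_probability(predicted_margin, sigma=SIGMA):
    """P(actual margin > 0) when the margin ~ Normal(predicted_margin, sigma)."""
    return 1.0 - NormalDist(mu=predicted_margin, sigma=sigma).cdf(0.0)


def margin_for_probability(p, sigma=SIGMA):
    """Predicted margin that corresponds to a win probability p."""
    return NormalDist(mu=0.0, sigma=sigma).inv_cdf(p)


p = win_probability(2.4)
print(f"2.4-point favorite: {p:.1%}")
print(f"margin for a 90% favorite: {margin_for_probability(0.90):.1f} points")
```

Under that assumption a 2.4-point favorite comes out at roughly 59%, which agrees with the hand calculation, and a 90% favorite works out to a predicted margin of about 14 points.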

'Course, the precise calculation may have some vague data for input. Thanks, Do Dah.
 
Gotta admit the area under the curve is easy, but the assumptions for the point differential I need to read up on...

What do we do if Purdue actually wins by 10? The probability of winning by 10 was in reality 100%, but the data doesn't support that. Does that mean the adjustments were faulty...yes! ;)
 
I specialize in vague input data ... it's job security.
 
The probability percent is highly dependent on a std dev of 11. I don't know how that was derived, but I imagine that about 99.7% of all games fall within a 66-point band (±33 points, or 3 sigma, of the predicted margin), about 95.5% within a 44-point band (±22), and about 68.3% within a 22-point band (±11). I'm wondering if 11 is reverified every year after rule changes like the shot clock and stuff, and whether this year's data is based upon last year's spread data.
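Those percentages follow from the normal curve alone. A quick sketch, assuming the misses around the predicted margin really are normal with sigma 11 (that is the model's assumption, not something verified here):

```python
import math

SIGMA = 11.0  # assumed std dev of (actual margin - predicted margin)


def share_within(k):
    """Share of a normal distribution within k std devs of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within +/-{k * SIGMA:.0f} points: {share_within(k):.2%}")
```

Whether 11 is the right sigma for every season or conference is a separate question that this does not answer.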

This part is easy...it is how the assumptions for the adjustments are made that may be a bit confusing or troubling. Anyway, I really appreciate the info, as I now understand how Kenpom picks the probability. I'm not sure I can reconcile that data as accurately as I can calculate... ;)
 
Winner winner!
Can't figure out this Kenpom thing. It was a 100% win by less than 2.4 points, and I was told that the winner would win by 15 points if only a 90% winner.

Texas, if you are reading this...the ref I talked about last year that I thought Purdue did well with...the one that did football games as well...he made some correct but gutsy calls.
 
The "adjustment" in things like adjusted D and O efficiency and adjusted tempo is adjusting each game for your opponent. Scoring 1.25 PPP against the best D in the country is way more impressive than against the worst D in the country. That's why each game is adjusted on both offense and defense for the opponent faced. Playing an 80-possession game against the fastest team in the country is no big deal, but if you played Wisconsin and got 80 possessions, that'd be a huge deal.

Also, the probabilities are calculated off a best fit from previous data and are very similar to what you can back-calculate from Vegas point spreads and historical W/L outcomes.
 
I tend to agree with this. This will be our first road game in a month and we are coming off a not-so-stellar shooting game. I think Purdue ends up with 65 to 70 points at most. The question is, can Purdue hold Michigan to the 63.3 points that the Purdue defense is averaging?
I'm honestly not sure. My homeristic belief is that we will jump to a 10 to 12 point halftime lead (say 36-26), but Michigan will slowly chip away at it over the course of the second half and we win by about 4 with some clutch free-throw shooting down the stretch.
I didn't give an official prediction, but this was pretty close too.
 
My point is that you are using previous data from other teams that may not be at all reflective of reality. I need to really see the assumptions to understand all the errors. You see, the best teams' data may be from the worst teams played and the worst teams' data may be from really good teams played...and RPI, "IF" used, is directionally okay but lacks precision. Winning teams will control tempo more than losing teams. I have a lot of questions, such as: how was 11 determined to be the std dev? Was it from the previous year's data...several years...changed as rules changed? How was 3.5 for the home team arrived at? Is that number reasonable for all places?

I'm reminded of a duck hunter who shot at a duck, with the first shot 3 feet behind, and quickly shot again, but 3 feet in front. On average the duck was dead, but the average did not represent what happened. I like this stuff, but want to learn the "reasoning" behind it and the assumptions used. It may very well be the best thing out there, even with all the inherent errors possible.
 
The refs got both calls at the end of the game correct...

And just by chance the Las Vegas sportsbooks also win...if you took Michigan and the points, you won...

Crazy how they get them so right.

Crooked Ricky next at Minny......

Boiler Up!
 
From strictly a point differential standpoint, 76-78 (-2) is closer to 70-69 (+1) than 74-69 (+5). :D
 
Your questions don’t detract from the validity of the process and outcome. Feel free to research it yourself.
 
But it was a 100% win instead of a 60% win. What ball games have you seen with a 60% win? It may be calculated accurately mathematically, but there ARE assumptions behind those calculations. I know THAT...you may not? That is not to suggest there are better methods today...or later in the year...or next year...just that it is very precise based upon some vague data.

If you cannot tell me how things are derived, then you do NOT know...and are working on faith. I admit I do not know, but if I get time, my understanding will be based upon the assumptions. Do conference games have a std dev of 11? Do Big conference games have a different std dev than the Big East or Big 12? Are all home courts the same...3.5? You're working on faith...and I, through much experience in other areas of statistics, know that assumptions are in play. Shoot, create some model of variables with no repeated measurements...then use all those variables and first-order interactions in the model, and your R**2 will be 100% as a predictor, but only because there are no error terms. There is error, but I don't know enough about Kenpom to know the assumptions, and you don't care to learn. That is fine. It may be the best we have at this time, but it is NOT foolproof, and I wonder if it has ever been adjusted without any real changes to the variables...other than an assumption?
 
I just randomly chose ten of tonight's games and ran them through this equation. ALL TEN were within 1/2 point of the Vegas odds...
 
I have been amazed in the past at how close to the spread some games end, when they are nowhere near that spread until the end. When you say you ran them through that equation...do you mean the scores ended within 1/2 point of the predicted difference in scores? Because there is no 60% chance of winning supported by the result; it is either a win or not.

If we are only comparing the predicted scoring difference and it is close...the std dev's accuracy, whether 11 or not, is not in play, I don't think. I think it only comes into play when a percent probability is calculated. Again, I don't know the assumptions, but your random sample of 10 is a decent measure of the true average, although a "possible" sampling of those closer to the tail may be much more inaccurate.

If I allow you to roll two dice 10 times, you may not roll a 12 in those 10 rolls, since there is only a 1/36 chance of doing it each time. This topic is interesting, but like I said to the Michigan guy, I do not know the assumptions well enough to know what it is doing. If you never rolled a 12 in 10 tries, does that mean it doesn't exist, or is sampling error at play? One thing for sure...it sure makes more sense for it to be accurate today than years ago when the clock was not in effect.
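The dice example can be put in numbers. A small sketch, assuming two fair dice and independent rolls:

```python
from fractions import Fraction

P_TWELVE = Fraction(1, 36)  # both dice must show a six
ROLLS = 10

# Chance of never seeing a 12 in ten independent rolls of two dice.
p_none = float((1 - P_TWELVE) ** ROLLS)
p_at_least_one = 1.0 - p_none

print(f"no 12 in {ROLLS} rolls: {p_none:.1%}")
print(f"at least one 12: {p_at_least_one:.1%}")
```

So going ten rolls without a 12 happens about three times in four, which is why a sample of ten says very little about what goes on out in the tails.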
 