NBA Shockwaves, Why the Chiefs Still Rank No.1, and the Power of Data

December 01, 2025 / 01:00:01

This episode of Wharton Moneyball features hosts Eric Bradlow and Adi Wyner discussing sports analytics, with a focus on the NBA, NFL, and MLB. They analyze the Oklahoma City Thunder's 17-1 start to the NBA season, compare ways of building predictive models in sports, and highlight ongoing sports analytics research projects involving undergraduate students.

Eric and Adi begin by discussing the Oklahoma City Thunder's 17-1 record and whether the team will exceed the 68 wins currently forecast for it this season. They compare the Thunder's performance to historical records, considering statistical models and Bayesian updating methods to predict future outcomes.

The conversation shifts to the NFL, where they discuss the current power rankings and the Kansas City Chiefs' performance metrics, including yards per drive and the impact of luck on their record. They also touch on upcoming Thanksgiving games and the significance of these matchups.

In the second half, Adi shares insights about the research projects conducted by students at Wharton, including a project on expected goals in soccer and a study on optimal decision-making in rugby. They also discuss a project analyzing RBIs in baseball, emphasizing the importance of context in performance metrics.

The episode concludes with a discussion on the significance of these research projects for students' future careers in analytics and the broader implications for sports performance evaluation.

TL;DR

Eric Bradlow interviews Adi Wyner on sports analytics, focusing on the NBA's Thunder, NFL metrics, and student research projects in sports analytics.

Episode

1:00:01
00:00:00
Welcome, welcome everyone to Wharton
00:00:02
Moneyball, the show where sports,
00:00:04
statistics, and business all intersect.
00:00:06
Three of my favorite topics. This is
00:00:08
Eric Bradlow, professor of marketing and
00:00:10
statistics and data science here at the
00:00:11
Wharton School. Some combination of
00:00:14
myself, my co-host and friend today,
00:00:16
Adi Wyner, are here every week.
00:00:18
Sometimes it's us two, Cade Massey,
00:00:21
Shane Jensen, but this week it's Eric
00:00:23
and Adi here on the Wharton Podcast
00:00:25
Network. I think for those of you that
00:00:28
have been on our show or been with our
00:00:30
show for the last 11 plus years know
00:00:32
whenever it's me and Adi, I take this
00:00:34
opportunity to in some sense interview
00:00:37
Adi. So today there will not be a
00:00:39
guest. I will interview Adi for if
00:00:41
you'd like an hour or so during our
00:00:43
podcast. The first part of the show will
00:00:45
be our standard what caught your eye in
00:00:47
sports segment. Then the second half
00:00:49
you're all in for a real treat. As I
00:00:51
think many of you know, Adi, besides
00:00:53
being the co-faculty director of
00:00:55
WSABI, the Wharton Sports Analytics
00:00:57
and Business Initiative, which is part
00:00:58
of the bigger umbrella brand that I run,
00:01:00
the Wharton AI and Analytics Initiative,
00:01:03
also runs our faculty research with
00:01:06
our undergraduates, and he's going to
00:01:08
talk to us about the research they're
00:01:09
doing in sports analytics. So, the first
00:01:11
half will be what caught your eye in
00:01:12
sports. The second part will be what
00:01:15
Adi is doing with our brilliant Penn
00:01:17
undergraduates on sports analytics and
00:01:19
research. Adi, how you doing today?
00:01:22
>> I'm doing really well. Excited to have a
00:01:23
good conversation with you, Eric. I know
00:01:25
that it's going to be a tremendous
00:01:27
temptation for us to spend the entire
00:01:29
time talking about the Yankees, but we
00:01:30
won't do that um at least not
00:01:32
exclusively. And I look forward to
00:01:34
talking to you about the work that our
00:01:36
students are doing, which not
00:01:37
only includes uh undergrads, but also um
00:01:39
graduate students, our PhD students, as
00:01:41
well as some students who are masters in
00:01:43
data science, and we even have a PhD or
00:01:46
two from other departments. Well, I know
00:01:48
at one point I think Ryan Brill was AMCS
00:01:50
or some students were AMCS.
00:01:52
>> Ryan was AMCS. Um, he was in applied
00:01:55
math and now he's with the Utah Jazz. He
00:01:58
actually came and visited. He he talked
00:01:59
about one of his um uh he talked about
00:02:02
some of the the uh the difficult
00:02:03
problems in in general.
00:02:05
>> Well, we'll get to that. We'll get to
00:02:06
that in the uh second half of our show.
00:02:08
So, what I wanted to start with today
00:02:10
was the NBA.
00:02:13
So, something very interesting is
00:02:16
happening in the NBA.
00:02:18
So, and it's maybe the most
00:02:20
extraordinary start to a season that
00:02:22
I've ever seen. I would just like your
00:02:24
take on it from a statistical
00:02:25
perspective. A very specific question.
00:02:28
So, the defending champions, the
00:02:29
Oklahoma City Thunder,
00:02:32
they won the title last year, as you may
00:02:33
remember.
00:02:34
>> I do remember.
00:02:35
>> They are 17-1
00:02:38
to start the season.
00:02:41
Right now, the forecast number of wins
00:02:43
for them is 68.
00:02:46
Now, you might say, "Well, that's a huge
00:02:48
number, maybe, except they're 17-1." If
00:02:52
you just projected that out, I'm not
00:02:54
saying a linear projection. Multiply it
00:02:58
by three, they would be 51 and three in
00:03:01
their next 54 games. Now, we all don't
00:03:04
predict that, but for them to beat that
00:03:07
prediction of 68, they just need to go
00:03:10
better than 51 and 13. So, let's be
00:03:13
clear. They're on a 51 and 3 pace right
00:03:16
now.
00:03:19
51-13 gets them to 68 wins. So, why
00:03:22
don't we take it piece by piece?
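A minimal sketch of the arithmetic just described, assuming an 82-game regular season. The function name is illustrative and not from the show.

```python
SEASON_GAMES = 82  # assumed length of an NBA regular season


def remaining_record_needed(wins, losses, target_wins, season_games=SEASON_GAMES):
    """Return the (wins, losses) over the remaining games that reach target_wins."""
    games_left = season_games - wins - losses
    wins_needed = target_wins - wins
    return wins_needed, games_left - wins_needed


needed_w, needed_l = remaining_record_needed(17, 1, 68)
print(needed_w, needed_l)  # 51 13

# Win rate required the rest of the way, versus the 17-1 start.
rest_of_season_pct = needed_w / (needed_w + needed_l)
print(round(rest_of_season_pct, 3))  # 0.797
print(round(17 / 18, 3))  # 0.944
```

Tripling the 17-1 start would cover only 54 of the 64 remaining games, which is why the "51 and 3 pace" and the "51 and 13" forecast share the same win total.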
00:03:25
Would you go over 68 knowing, by the
00:03:29
way, you may remember the highest number
00:03:31
of all time is 73. That was the Golden
00:03:34
State Warriors of I think 2017 who lost
00:03:36
in the finals, you remember, to LeBron.
00:03:38
>> They were 73 and 9. Of course, the
00:03:40
Michael Jordan Bulls, I forget which of
00:03:42
the years was 72-10.
00:03:45
I think that might be maybe there's one
00:03:48
other at 70, but the next gap I know
00:03:50
there's a 69. So, let's just start with
00:03:53
that. I've got two other things to say
00:03:54
about the Thunder right now. How would
00:03:57
you help our listeners here on Wharton
00:03:59
Moneyball on the Wharton podcast network
00:04:01
think about how likely is it above 68
00:04:04
and 14 or possibly even the record of
00:04:06
all time?
00:04:07
>> Well, you know, there's two broad ways
00:04:09
to approach this. One is the way you did
00:04:12
already, which is just look at what are
00:04:14
the best seasons ever and essentially
00:04:17
ignore the what we any individual
00:04:20
information we have about OKC. They've
00:04:22
won 17 and lost one. Okay, we'll just
00:04:24
put that in the bank and then we'll use
00:04:26
that um and we'll ignore that and we'll
00:04:28
just say how likely is it that they're
00:04:29
going to be one of these top five teams
00:04:31
ever? And I guess 68 wins would put them
00:04:33
what seventh, eighth best team ever.
00:04:35
>> Yeah, probably exactly in that range
00:04:37
>> and considering that uh and that and
00:04:40
that essentially asks that question in
00:04:42
that way and which case we are really
00:04:44
not really thinking too much about the
00:04:46
individual team. We're just saying they
00:04:47
went 17-1. All the the all the all the
00:04:50
data suggests they're they're
00:04:52
potentially all-time great team. Where
00:04:53
would you put them? 68 seems reasonable.
00:04:56
Um I I don't think that's where it came
00:04:58
from. My guess is the estimate came from
00:05:01
some sort of Bayesian updating, which is a
00:05:03
a tool that we've done a lot in this in
00:05:05
this program. You essentially shrink or
00:05:08
regress to the mean um to estimate what
00:05:11
you might call their true win rate. So
00:05:13
there we don't believe they're really a
00:05:15
17 out of 18 team which is like a 93%
00:05:18
win rate. We think they're something
00:05:20
lower than that. And the question is by
00:05:22
how much do we regress down to the mean.
00:05:24
And so we're essentially predicting that
00:05:26
they go what, uh, what did you say, 51 and
00:05:29
>> 51 and 13.
00:05:31
>> So that's their like 75%
00:05:35
>> little 80%. So we're saying the rest of
00:05:37
the way. So they're 93% win up until
00:05:40
now. The best estimate of what their
00:05:42
true talent is is 80% and that leads to
00:05:45
68. That's probably what they did,
00:05:47
right? So would you would you be would
00:05:49
you reasonably treat this as you know
00:05:51
the classic beta binomial situation and
00:05:54
what I mean by that is we have a prior
00:05:57
for OKC. We have a prior for every team.
00:06:00
Now obviously this this part I can do
00:06:03
this it's not even math it's intuition.
00:06:05
If they're 17 and one, which is 93%.
00:06:08
Okay. And the prediction is that they're
00:06:10
going to go 80% the rest of the way.
00:06:14
>> The prior must be below 80%. Because if
00:06:19
the prior were above 80% and the
00:06:21
likelihood were above 80%, which we know
00:06:23
it's 93%, the posterior would have to be
00:06:27
a convex combination of those two, which
00:06:30
means that the prior for the Oklahoma
00:06:32
City Thunder might have been, let's even
00:06:33
say it was 75%. Which is not
00:06:36
unreasonable. Maybe they were projected
00:06:37
to be a 60, 61 win team, which is 75%.
00:06:42
They're 17-1. We're now up to a
00:06:44
prediction as you pointed out of about
00:06:46
80% and that's our prediction for the
00:06:48
rest. That sounds about right, doesn't
00:06:49
it?
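A minimal sketch of the beta-binomial update being described, assuming the 75% prior mean mentioned above. The prior strength (expressed here as a number of "prior games") is not given on the show, so the values tried below are hypothetical.

```python
def posterior_mean(prior_mean, prior_games, wins, losses):
    """Posterior mean win rate for a Beta prior combined with a binomial record."""
    alpha = prior_mean * prior_games + wins
    beta = (1.0 - prior_mean) * prior_games + losses
    return alpha / (alpha + beta)


# The posterior always lands between the prior mean (0.75) and the observed
# rate (17/18), which is the convex-combination point made above.
for prior_games in (10, 52, 200):  # hypothetical prior strengths
    estimate = posterior_mean(0.75, prior_games, wins=17, losses=1)
    print(prior_games, round(estimate, 3))
```

Under these assumptions a prior worth about 52 games yields the 80% rest-of-season estimate. A weaker prior lets the 17-1 start pull the estimate higher, and a stronger one keeps it near 75%.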
00:06:50
>> Yeah, it does. But I don't think so.
00:06:52
There's two ways to do this this Bayesian
00:06:54
this shrinkage. Do you shrink to a prior
00:06:56
that's specific to the team or do you
00:06:59
shrink to the league overall prior and
00:07:00
the difference is a prior shrunk to the
00:07:03
specific team has very has much smaller
00:07:05
variance. Right? So if we are if we're
00:07:08
talking about OKC given what we know
00:07:10
about everything what they did up until
00:07:11
this season, we probably have a 75%
00:07:14
centroid maybe uh or posterior or prior
00:07:17
mean and probably a pretty small
00:07:20
standard deviation. Um and therefore um
00:07:24
we ignore we shrink pretty heavily
00:07:25
despite the fact they got 17 and one. We
00:07:27
still shrink pretty heavily back to the
00:07:29
prior. Just put it up a little higher.
00:07:31
The other way that you could do it is
00:07:33
just ignore, pretend this is the only
00:07:35
thing you know about this team is that
00:07:37
it went 17 and 1. That's it. You know
00:07:39
nothing about the previous years. And
00:07:41
then you shrink to 50%. But your but
00:07:43
your prior variance would be huge, much
00:07:45
bigger. And so the current data would
00:07:48
end up getting much more weight. Um so
00:07:50
you have two choices. You either shrink
00:07:52
heavily to a very high mean or you
00:07:54
shrink.
00:07:56
>> I'm shrinking towards the OKC one, but I
00:07:59
do agree with you. Either one of those
00:08:01
conceptually depending on how much
00:08:03
sample size you put in the prior could
00:08:06
lead to the 80% number in this case.
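A rough sketch of the point that either prior can lead to the same 80% estimate. It solves for the prior strength each choice would need, assuming a Beta prior with mean 0.75 (team-specific) or 0.50 (league-wide). The function name and the "prior games" parameterization are illustrative assumptions.

```python
import math


def implied_prior(prior_mean, wins, losses, target):
    """Solve (m*n0 + wins) / (n0 + games) = target for the prior strength n0."""
    games = wins + losses
    prior_games = (wins - target * games) / (target - prior_mean)
    data_weight = games / (prior_games + games)  # share of weight on the 17-1 start
    # Standard deviation of a Beta prior with this mean and strength.
    prior_sd = math.sqrt(prior_mean * (1 - prior_mean) / (prior_games + 1))
    return prior_games, data_weight, prior_sd


team = implied_prior(0.75, 17, 1, 0.80)
league = implied_prior(0.50, 17, 1, 0.80)
for label, (n0, weight, sd) in (("team prior", team), ("league prior", league)):
    print(label, round(n0, 1), round(weight, 2), round(sd, 3))
```

Under these assumptions the team-specific prior has to be tight (worth roughly 52 games), so the current record gets about a quarter of the weight. The league-wide prior has to be loose (roughly 9 games), so the current record gets about two thirds of it.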
00:08:08
>> Either one. It's funny how it's funny.
00:08:10
It's interesting how to do this. It
00:08:11
>> is interesting.
00:08:12
>> So, um I probably would would have had a
00:08:15
a multivariant. I would have lumped
00:08:18
other teams into into that good team
00:08:20
prior instead of just prior and try to
00:08:23
borrow something else. Um but that's not
00:08:25
I don't want to try to do this myself.
00:08:27
We're trying to predict them whether or
00:08:28
not. So essentially, I guess the prior
00:08:30
one would get back get to an 80% going
00:08:32
out. Um I think that um 68 I think
00:08:37
they're going to go over. That's my
00:08:38
that's my point. I
00:08:39
>> think so too. So there's something else
00:08:41
about them that's interesting.
00:08:43
>> Right now their projection
00:08:47
is to win the league by 10 wins. That
00:08:52
means they'll have 10 more wins than any
00:08:54
other team. Now, that would be a
00:08:56
historic
00:08:57
>> historic mark. Well, can I ask you one
00:08:59
question? A particular question. I don't
00:09:00
know. Has their schedule been average,
00:09:03
difficult, easy so far? What do we know?
00:09:05
>> It's a good question. Um, somebody
00:09:07
knows.
00:09:08
>> Yeah.
00:09:08
>> This somebody doesn't know, but the
00:09:11
answer is I don't know. That's a fair
00:09:13
question. I mean, how I mean, we're
00:09:15
starting to get to enough that you and I
00:09:18
would agree. It's hard to believe it's
00:09:20
much below .500, if it's at all below .500
00:09:23
of the teams they played, right?
00:09:25
Because, you know, eventually it's
00:09:28
eventually going to average out to a
00:09:30
certain schedule.
00:09:31
>> Well, there's the problem is in the NBA
00:09:33
there aren't that many great teams.
00:09:34
>> No, no, that's the problem, right? So,
00:09:36
your point is there's no information.
00:09:38
Like, I hate to make this up, but they
00:09:41
could be equivalently like five and one.
00:09:44
Like, of course they beat those 12 teams
00:09:46
and then there's like six games. All
00:09:48
right. You went five and one in in tough
00:09:50
games. All right. Well, all right.
00:09:51
That's
00:09:52
>> Now, they played How many games going
00:09:54
ahead do they have against really good
00:09:56
teams?
00:09:57
>> I haven't I haven't looked at that. And
00:09:58
by the way,
00:10:00
>> another way to do the prediction, which
00:10:02
is we actually build a simulator game by
00:10:05
game. We play that out and then it takes
00:10:09
into account who they're playing, their
00:10:10
relative strength.
00:10:12
>> That's that's the way to do it. The the
00:10:14
difficulty in doing that simulation is
00:10:16
that you have to put in some sort of
00:10:18
talent metrics for for everyone. And uh
00:10:21
we've been talking about this on our
00:10:22
show with Cade on how to simulate going
00:10:25
forward. Do you either put the
00:10:27
uncertainty in the parameter estimates
00:10:30
before you start the sim or do you treat
00:10:32
the the the parameters as fixed and then
00:10:36
you just reestimate them as you go
00:10:37
through the sim.
00:10:40
If you are fully Bayesian that those
00:10:41
produce the same and are correct about
00:10:43
the priors that produce the same
00:10:45
answers. Um if you are not fully Bayesian
00:10:48
if you just and you just want to add
00:10:50
uncertainties to some say MLE then uh
00:10:53
then you'll get very different results.
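A simplified sketch of the two simulation styles being contrasted, reduced to one team with a single win probability (a real simulator would model each opponent, as described above). The Beta(56, 14) posterior and all function names are hypothetical. The third function is a plug-in point estimate with no parameter uncertainty, included only as a contrast.

```python
import random
import statistics

ALPHA, BETA = 56.0, 14.0  # hypothetical posterior for the win rate, mean 0.80
GAMES_LEFT = 64
N_SEASONS = 5000


def draw_first(rng):
    """Draw the win rate from the posterior once, then play every game with it."""
    p = rng.betavariate(ALPHA, BETA)
    return sum(rng.random() < p for _ in range(GAMES_LEFT))


def update_as_you_go(rng):
    """Play each game at the current posterior mean, then update on the result."""
    a, b, wins = ALPHA, BETA, 0
    for _ in range(GAMES_LEFT):
        if rng.random() < a / (a + b):
            a += 1.0
            wins += 1
        else:
            b += 1.0
    return wins


def plug_in(rng):
    """Treat the point estimate as exactly known for the whole season."""
    p = ALPHA / (ALPHA + BETA)
    return sum(rng.random() < p for _ in range(GAMES_LEFT))


rng = random.Random(2025)
results = {
    sim.__name__: [sim(rng) for _ in range(N_SEASONS)]
    for sim in (draw_first, update_as_you_go, plug_in)
}
for name, wins in results.items():
    print(name, round(statistics.mean(wins), 1), round(statistics.pstdev(wins), 1))
```

In this toy setting the first two approaches should give matching distributions of remaining wins, while the plug-in version has the same mean but a visibly narrower spread.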
00:10:55
>> Agree with that. Let me point out
00:10:56
something though that um people might do
00:10:58
though for practical reasons and then
00:11:00
we'll move on. This is more of a
00:11:01
technical comment here, but the first
00:11:03
one you mentioned where you fix the
00:11:05
parameters but then add the simulation
00:11:07
error later as you go through because
00:11:09
you up you rerun it kind of that one of
00:11:12
course will um take potentially less
00:11:16
computation because you've batch
00:11:18
processed the data and now all and
00:11:21
you've generated a posterior and now
00:11:23
you're just adding one observation at a
00:11:25
time which could there other ways all
00:11:27
I'm pointing out is there could be
00:11:29
computational advantage to do
00:11:31
sequential simulation but it also
00:11:35
depends on what your algorithm is. So if
00:11:36
you're if you have just some algorithm
00:11:38
that estimates team strengths
00:11:40
>> oh then that's fine and that can be done
00:11:41
fast and then you're then you have no
00:11:43
then there's no advantage. Just one
00:11:44
other thing about the Thunder before we
00:11:46
move off the NBA.
00:11:48
>> We talked about this last season in some
00:11:50
metrics they had the greatest season in
00:11:52
history last season. Forget number of
00:11:53
wins. Their point differential broke the
00:11:55
record. I think it was somewhere around
00:11:57
11.5.
00:11:59
So right now, Adi, they're at 16.9.
00:12:02
>> Ridiculous.
00:12:04
>> So
00:12:06
at some point, we're going to have like,
00:12:09
let's say they win the title again this
00:12:10
year. Let's say they win 70 plus games.
00:12:14
Let's say they have a point differential
00:12:15
of 13 plus. We're going to have to start
00:12:19
talking about this OKC team as being
00:12:22
it's not a dynasty yet. Two doesn't
00:12:24
necessarily get you to some definition.
00:12:27
We're going to have to start talking
00:12:28
about them as one of the greater teams.
00:12:31
I don't know the top 5% of teams all
00:12:33
time. I mean, if this happens,
00:12:36
is it? And but the trick is to me, yes,
00:12:38
of course we would. But my question is
00:12:40
is a basketball one. What is the
00:12:42
anomaly? is the anom because they don't
00:12:45
have is it the anomaly that uh Shai is
00:12:48
really a great player like he's a LeBron
00:12:50
level, you know, all-time Michael Jordan
00:12:53
level and we've just underestimated him
00:12:54
or is it a team that's extraordinarily
00:12:57
well constructed that has no weaknesses
00:12:59
that have that have found that defense
00:13:01
is extremely important and while it does
00:13:03
it's not flashy it's the way you make
00:13:05
big points
00:13:07
>> if you're asking me as a basketball fan
00:13:09
what I would tell you is they're an
00:13:11
extraordinarily well constructed team
00:13:14
that Shai Gilgeous-Alexander is the best
00:13:16
player or top two or three in the NBA
00:13:18
right now, but he's not Michael Jordan
00:13:21
level talent or
00:13:22
>> prime or even Steph Curry at his
00:13:24
prime or
00:13:24
>> right probably not. But he might be in
00:13:26
that next tier down and then they've got
00:13:29
a bunch of very very good other players
00:13:32
and so you take a great player and a
00:13:35
bunch of very very good other players
00:13:37
and you might get there and they seem
00:13:39
extremely well balanced. So either way,
00:13:41
I don't want to spend all of our time
00:13:43
talking about the NBA. I was just
00:13:44
noticing their record's incredible. I'm
00:13:47
glad we talked about some different ways
00:13:48
to do prediction, whether it's through
00:13:50
shrinking to their prior, shrinking to
00:13:52
the league prior, a simulation-based
00:13:54
prior, but also this point differential
00:13:58
is just incredible right now. And at
00:14:00
some point we have to say, I understand
00:14:01
it's only 18 games. Okay. So, next week,
00:14:03
Adi, if we're sitting here and it's 23
00:14:05
games and they've got a 16, 17 point
00:14:08
differential, eventually we're going to
00:14:10
start to have to say we'd be surprised
00:14:12
if they didn't break the overall record.
00:14:14
>> Yeah. I mean, not to say I I don't have
00:14:16
the the number exactly off hand, but I
00:14:18
believe the RMSE on predicting wins
00:14:21
given your point differential is three.
00:14:24
Meaning that I can predict your total
00:14:26
wins just using your point differential
00:14:28
to win at about plus or minus three
00:14:29
wins. But how well do you think that'll
00:14:31
work in the tails?
00:14:33
>> How do you think?
00:14:34
>> So the the data that I fit it to, right?
00:14:37
So which is every season up until this
00:14:39
year doesn't seem to have a nonlinearity
00:14:42
at the tails. It's it's not drifting.
00:14:45
>> Um
00:14:45
>> there has to be some because there is an
00:14:47
upper bound. You can only win so many
00:14:49
games,
00:14:50
>> right?
00:14:51
>> It's predicting. It just seems to I mean
00:14:53
it's not it's I'm not predicting. So I
00:14:56
just predict wins. I mean, you could
00:14:57
predict winning percentage and then do a
00:14:59
logistic uh and then predict a logistic.
00:15:01
That's those are two I now we're getting
00:15:03
really in the weeds here for our show. I
00:15:04
don't want to I don't want to get into
00:15:06
this. Um but just to say that if they
00:15:08
are if they can hold 16, they are going
00:15:11
to break that record easy.
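A hedged sketch of the two model shapes mentioned here: predicting wins directly with a straight line, or predicting winning percentage through a logistic curve. The slope of 2.7 wins per point of differential is an assumed illustrative value, not the fit described on the show, and the function names are hypothetical.

```python
import math

SEASON_GAMES = 82
WINS_PER_POINT = 2.7  # assumed illustrative slope, not the fitted value from the show


def linear_wins(point_diff):
    """Straight-line prediction of season wins from average point differential."""
    return SEASON_GAMES / 2 + WINS_PER_POINT * point_diff


def logistic_wins(point_diff):
    """Logistic prediction of winning percentage, scaled to season wins."""
    k = 4 * WINS_PER_POINT / SEASON_GAMES  # matches the linear slope at zero
    return SEASON_GAMES / (1 + math.exp(-k * point_diff))


for diff in (0.0, 11.5, 16.9):
    print(diff, round(linear_wins(diff), 1), round(logistic_wins(diff), 1))
```

With this assumed slope the straight line predicts more than 82 wins at a 16.9 differential, which is the upper-bound issue raised above. The logistic version stays below 82 by construction, at the cost of bending the relationship in the tails.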
00:15:12
>> I think it's easily. Yeah. All right.
00:15:14
So, let's move on. Let's talk a little
00:15:16
about the NFL. Now, of course, we could
00:15:18
talk about the Eagles game and all that,
00:15:19
but I don't want to talk about that for
00:15:20
just a second. So, I just downloaded the
00:15:23
uh ESPN
00:15:26
power rankings for the NFL. Okay.
00:15:30
What team do you think is number one?
00:15:32
>> Rams.
00:15:34
>> That is incorrect, sir.
00:15:35
>> Is it still the Eagles?
00:15:37
>> Nope.
00:15:40
>> Colts.
00:15:42
So, so far you said the Rams, they're
00:15:43
second. You said the Eagles, they are
00:15:46
sixth. The Colts are five.
00:15:51
Who am I missing here?
00:15:54
The the Patriots. [laughter]
00:15:57
>> They're not even They're like 15 or 16.
00:15:59
>> Yeah. I mean,
00:16:00
>> it's the Chiefs.
00:16:01
>> The Chiefs still got them up there.
00:16:03
>> I I This is the part I This You have to
00:16:06
expl So, I know this I know this is
00:16:09
current. Let me just be clear. I know
00:16:10
this is current because it has their
00:16:12
record at six and five. That's their
00:16:14
actual record. Now,
00:16:16
>> it has them at 7.1 points above average.
00:16:20
It has the Rams at six. So that means
00:16:23
they're a one-point favorite against the
00:16:24
Rams. The Packers are third, but let's
00:16:26
say they played the Eagles. Right now,
00:16:28
it has them as a 2.7. Let's say they
00:16:30
played the Patriots. It has them as a
00:16:33
6.6 point favorite on a neutral field.
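A small sketch of how these spreads follow from the ratings, assuming the neutral-field spread is simply the difference between two teams' points-above-average ratings. The Eagles and Patriots ratings are backed out from the quoted spreads rather than read from the rankings, and the function names are illustrative.

```python
CHIEFS, RAMS = 7.1, 6.0  # points above average, as quoted above


def neutral_spread(favorite_rating, underdog_rating):
    """Neutral-field point spread implied by two ratings."""
    return favorite_rating - underdog_rating


def implied_rating(reference_rating, spread_vs_reference):
    """Rating of an opponent, given how many points the reference team is favored by."""
    return reference_rating - spread_vs_reference


print(round(neutral_spread(CHIEFS, RAMS), 1))  # 1.1

eagles = implied_rating(CHIEFS, 2.7)
patriots = implied_rating(CHIEFS, 6.6)
print(round(eagles, 1), round(patriots, 1))  # 4.4 0.5
```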
00:16:39
Well, um what what are the underlying
00:16:41
stats? I heard one stat that I'll share
00:16:43
with you.
00:16:44
>> Okay.
00:16:44
>> Um that uh I was trying to make some
00:16:46
heads or tails of this. I saw this on uh
00:16:48
sports analytics Twitter. Um um someone
00:16:51
pointed out that the the KC, the Chiefs
00:16:56
have the highest uh um number of yards
00:16:59
per drive.
00:17:02
It's a so basically how many yards are
00:17:04
they averaging per drive? Um and they're
00:17:07
just about 40. And and they showed then
00:17:09
the teams at that level historically go
00:17:11
12 and you know 12 and four or 13 and
00:17:14
three. Just so you know, by the way, I
00:17:15
think you remember last year that Chiefs
00:17:16
were something like 11-0 in one score
00:17:19
games. Just so you know, they're at
00:17:21
least I I well, I guess it just changed
00:17:23
up until this last week where they beat
00:17:25
the Colts, they were 0 and 5. They're now
00:17:28
one and five in one score games.
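A quick sketch of how unusual these one-score records would be if every one-score game were a coin flip. The 50% win probability is an assumption made for illustration, not a claim from the show.

```python
from math import comb


def prob_at_most(wins, games, p=0.5):
    """Probability of winning `wins` or fewer of `games` independent games."""
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k) for k in range(wins + 1))


print(prob_at_most(0, 5))  # 0.03125, the chance of an 0-5 start
print(prob_at_most(1, 6))  # 0.109375, the chance of 1-5 or worse
print(0.5**11)  # 0.00048828125, the chance of going 11-0
```

Under the coin-flip assumption, neither last year's run nor this year's is what a team should expect to repeat, which is the sense in which the record may overstate or understate underlying quality.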
00:17:30
>> One score games. So then so this this
00:17:33
particular stat suggests that this
00:17:35
underlying metric which doesn't take
00:17:37
into account special teams and and uh
00:17:40
things like that and and position
00:17:42
starting positions. So, for example,
00:17:44
you'll have more you'll expect more
00:17:45
yards on your drive if you start deep in
00:17:47
your in your own territory. You have
00:17:48
more more room to go. Um, and uh I don't
00:17:53
know whether that number means anything.
00:17:54
It's like a peripheral that's correlates
00:17:56
with with team quality, but they were
00:17:58
number one in it. And and usually teams
00:18:00
that historically teams with that level
00:18:02
of success in terms of yardage have been
00:18:05
have been great teams. So they they're
00:18:07
essentially that one number suggests
00:18:09
that the underlying metric suggests that
00:18:10
the KC is much much better than they
00:18:14
than they appear based on their record.
00:18:15
They've lost a bunch of one-score games.
00:18:17
They've had some bad special team
00:18:19
turnovers. They've had some what you
00:18:21
might just jump into is and lump into
00:18:23
one big pile called bad luck. And bad
00:18:26
luck is not supposed to continue. You
00:18:28
should expect neutral luck. So, if their
00:18:30
underlying posh strength parameters are
00:18:33
excellent, you might and given their
00:18:34
historical performance and and of course
00:18:36
who their quarterback is and the fact
00:18:38
that everybody seems to be beating up on
00:18:40
everybody else, maybe they do land that
00:18:42
land that high. But I have to tell you,
00:18:44
I'm surprised. I am surprised to by the
00:18:46
way, the only other team that's six and
00:18:48
five in the top like 12 teams, not
00:18:52
surprisingly, and this might be the role
00:18:54
of the prior. Well, who would you guess
00:18:55
who's the other six and five team that
00:18:57
the priors are going to bring way up?
00:19:00
Who?
00:19:00
>> Bills.
00:19:02
>> That's a great question. It's not the
00:19:03
Bills. The Bills are seven and four, by
00:19:05
the way.
00:19:06
>> Four. Okay.
00:19:06
>> It's the Ravens.
00:19:07
>> The Ravens, right? There's the Ravens.
00:19:09
Another one. They they they also have
00:19:11
prior history. And I wonder what this is
00:19:13
ESPN. The ESPN might have a lot of
00:19:15
weight on
00:19:16
>> Well, that's why I was asking you
00:19:17
because I think most people would find
00:19:19
it shocking that the Chiefs are the top.
00:19:22
I think mo a lot of people might find it
00:19:23
surprising that essentially the Eagles
00:19:25
and Ravens are equal. The Ravens are
00:19:28
better than the 8 and 3 Seahawks, the
00:19:30
seven and four Niners, the 7-4 Bills. The
00:19:33
9 and 2 Broncos are like 12th.
00:19:38
>> The Patriots at 10 and two are like
00:19:41
15th.
00:19:42
>> Yeah. Yeah.
00:19:43
>> So either way, I thought it was just
00:19:45
interesting to talk about that.
00:19:46
>> They made a dip twice probably.
00:19:48
>> By the way, let me just say for all of
00:19:50
our sports, so as everyone knows, we
00:19:51
record Wharton Moneyball on Tuesday.
00:19:53
This is Tuesday, two days before
00:19:54
Thanksgiving.
00:19:56
There's an amazing schedule on
00:19:58
Thanksgiving, Adi. I don't know how
00:20:00
much you're going to get to watch. Um,
00:20:02
the first game is Lions and Packers. I
00:20:04
mean,
00:20:05
>> that's a great game.
00:20:06
>> That's a big game.
00:20:07
>> Lions traditionally play on
00:20:08
Thanksgiving. I guess there's more games
00:20:10
than one now. They used
00:20:10
>> the Lions always do, but there's they're
00:20:13
seven and four and the Packers are 7-3
00:20:14
and one. The second game all of a sudden
00:20:17
now became fascinating, Adi. Chiefs at
00:20:20
Cowboys.
00:20:21
>> Wow. With the Cowboys beating the Eagles
00:20:23
last year.
00:20:23
>> So, six and five. I know with six and
00:20:27
five Chiefs against five, five, and one
00:20:29
Cowboys. A week ago was like they're
00:20:31
both going to lose. It's going to be two
00:20:32
losing teams. Now all of a sudden these
00:20:34
teams are in it. And the night game is
00:20:37
an interesting one because Joe Burrow's
00:20:38
coming back. It's Bengals at Ravens. So
00:20:41
all of a sudden we actually have three
00:20:43
sort of interesting games on
00:20:44
Thanksgiving Day. Hey, one thing I'm
00:20:47
going to take advantage of because it's
00:20:48
you and me here. So I I want to talk to
00:20:50
you about the Hall of Fame, the baseball
00:20:51
hall of
00:20:52
course. How could we not, Eric? It's our
00:20:54
dream.
00:20:54
>> We're go. We're going to But okay, so I
00:20:58
want to remind everybody there's two
00:21:01
separate committees that are going to be
00:21:03
voting this year. Okay.
00:21:06
One is what's called the contemporary
00:21:09
era
00:21:11
committee which used to be called I
00:21:14
don't know if it was the old-timers
00:21:15
committee or anything but there's
00:21:16
>> Veterans Committee. Was it just
00:21:17
>> Veterans Committee? No. No. But they've
00:21:18
split it up. Remember, Adi? There's 1980
00:21:21
onwards which is the contemporary era
00:21:24
candidates and then there's pre-1980.
00:21:27
So here are the seven candidates in the
00:21:30
contemporary era committee. Okay.
00:21:35
Roger Clemens.
00:21:38
So we agree he's and by the way you need
00:21:40
75% I think of like 16 voters. So not
00:21:46
>> he's not going to get it. Okay. Carlos
00:21:48
Delgado.
00:21:50
>> Carlos Delgado. I don't I don't see him
00:21:53
getting it. I don't I wouldn't imagine
00:21:54
he deserves it either. How about you?
00:21:55
What do you think?
00:21:56
>> 473 home runs.
00:21:58
>> Yeah.
00:21:59
>> Uh three time silver slugger. I don't
00:22:02
know. Right on the border. Jeff Kent.
00:22:06
>> On the border also. You know, this is a
00:22:08
Yeah. I mean, Kent, uh Kent, what was
00:22:11
he? His was a middle infield position,
00:22:13
wasn't he? Kent.
00:22:14
>> Second baseman. I think he has the I
00:22:16
think he has the most home Yeah, he has
00:22:19
I it says here in my notes he has the
00:22:21
most home runs ever by any second
00:22:23
baseman.
00:22:24
>> I think he has a better shot than
00:22:26
Delgado.
00:22:27
>> Don Mattingly
00:22:29
by the numbers his career was too short.
00:22:31
I just can't I can't I can't sanction
00:22:33
it. I mean he he definitely was the best
00:22:36
hitter in baseball for about two to
00:22:37
three years. um and super competitive
00:22:40
hitting, you know, .340s, winning batting
00:22:42
title uh you know 30s one year I think
00:22:46
145 RBIs, a statistic we like
00:22:48
>> I remember that year
00:22:48
>> like disparaged but is impressive great
00:22:51
great fielding first baseman but just
00:22:53
didn't have the longevity I mean he just
00:22:56
just you can't have o only if if you're
00:22:58
going to be in only on peak you have to
00:23:01
have Griffey-level peak, Koufax-level peak
00:23:06
you can't
00:23:06
>> Judge-level peak
00:23:07
>> Judge-level peak, Trout-level peak. Look
00:23:09
at Trout. I mean,
00:23:10
>> yeah, Trout
00:23:11
>> go into the Hall of Fame because of 10
00:23:12
year first 10 years of his career.
00:23:14
>> That's a long peak. All right. How about
00:23:16
How about
00:23:17
>> Dale Murphy?
00:23:18
>> Also, first baseman. I don't see I I
00:23:22
don't see it happen.
00:23:22
>> All right. Gary Sheffield.
00:23:25
>> Sheff, Sheff was quite a hitter. I think
00:23:27
his career statistics might be a little
00:23:28
higher. I don't have them in front of
00:23:29
me, but
00:23:29
>> 509 home runs.
00:23:31
>> Yeah. And there's a majesty with the 500
00:23:33
homers, right?
00:23:34
>> 1,676
00:23:35
RBIs. I think he's got a good shot.
00:23:38
>> Okay. Fernando Valenzuela.
00:23:40
>> No.
00:23:42
>> Okay. Just No. Okay. Well, those are the
00:23:44
seven contemporary. So, you could see in
00:23:47
your mind, you wouldn't be shocked if
00:23:49
Jeff Kent got in.
00:23:50
>> No, I got you.
00:23:51
>> You would be shocked if Gary Sheffield
00:23:52
got in.
00:23:53
>> Nope.
00:23:54
>> Okay.
00:23:54
>> I'm not I'm not I to listen to our
00:23:56
listeners. I'm not staring at their
00:23:58
numbers and uh I just I'm just recalling
00:24:00
what I imagine about these players. And
00:24:02
I do think Sheff he had he had 500
00:24:04
homers. Um,
00:24:06
>> that that's that's a mark that almost
00:24:08
always gets you into the Hall of Fame.
00:24:10
>> Kent's leading home runs in a second
00:24:12
base position is really impressive.
00:24:14
>> Don't remember if there were uh PED,
00:24:17
performance-enhancing drugs rumors about
00:24:19
Sheffield. I think there were, which may
00:24:21
be wellkeeping,
00:24:22
>> maybe why he's where he is
00:24:24
>> because you would think someone with 500
00:24:25
home runs and almost 1,700 RBIs
00:24:28
>> is going to be a shoo-in, right?
00:24:29
>> Is in. I mean, that's in. It might be
00:24:31
third tier, but you're in. Let's talk
00:24:33
about the current ballot.
00:24:36
>> Ah, interesting.
00:24:37
>> Now, I don't think there's anyone coming
00:24:39
in on the ballot as first-timers.
00:24:41
>> Yeah. So, they're not in my list. I
00:24:43
looked at them and I just pruned them.
00:24:45
There's nobody. Nobody. Trust me,
00:24:47
nobody.
00:24:48
>> Mhm.
00:24:49
>> It's a great year for people who are hanging
00:24:51
on. So,
00:24:52
>> let's talk about a couple hang Let's
00:24:53
talk about a couple hangers-on and let's
00:24:55
see what you think. So, Carlos Beltran.
00:24:58
>> yeah, this is his year. He was he was
00:25:00
close last year with nobody. He's
00:25:03
getting in. Yeah, this is his year.
00:25:05
>> Andruw Jones, in his ninth year.
00:25:08
>> Uh, it's either this year or next year
00:25:10
he's getting in. So, I would go for this
00:25:11
year because I mean, uh, what's the
00:25:13
early reports? Do we have the early
00:25:15
reports yet? I know.
00:25:16
>> I haven't even looked. I know you like
00:25:17
to look at that. I have not.
00:25:18
>> You like to look at them? I I like to
00:25:20
look at I like to look at the Hall of
00:25:21
Fame tracker.
00:25:21
>> And that's probably it, right? Because
00:25:23
of how far the other people like The
00:25:24
next person on the list is Chase Utley.
00:25:26
He's
00:25:27
>> Chase Utley, I believe, will make it
00:25:28
eventually,
00:25:29
>> but he's in his third year at 39.8%. So
00:25:32
he's not jumping to 70.
00:25:33
>> He's not getting anywhere close quite
00:25:35
yet. Um so uh there aren't very many
00:25:38
ballots out yet right now. So um uh I I
00:25:42
right now there looks like almost
00:25:44
nothing out. So
00:25:45
>> okay. And then there's people I mean
00:25:47
some other reasonable name you know look
00:25:48
other names that aren't getting in
00:25:49
because of possibly
00:25:51
performance-enhancing drugs. Alex
00:25:52
Rodriguez, Manny Ramirez, obviously they
00:25:55
would be in based on their numbers. Andy
00:25:57
Pettitte's in his eighth year; won't make
00:25:59
it. Here's a guy, you know, I've never
00:26:01
quite understood. If you talk about peak
00:26:03
performance over... What about King Felix?
00:26:07
>> he was great um for a short period of
00:26:10
time.
00:26:10
>> Well, I mean, extraordinary
00:26:11
>> Question: is deGrom in the Hall of
00:26:13
Fame?
00:26:14
>> Yes, deGrom's... I think. I think
00:26:16
>> Well, let me see. I'm going to, you
00:26:18
know,
00:26:18
>> deGrom better than King Felix?
00:26:20
>> You know what? This is This is up my
00:26:22
alley because I've done this research.
00:26:23
So, I'm going to give you I I can So,
00:26:26
deGrom hasn't retired yet. So, um, I'm
00:26:29
going to I'm going to I'm going to pull
00:26:30
them up. I This is Yeah, as many of you
00:26:32
know, I I worked with Ryan Brill. We
00:26:34
created our Grid WAR metric, uh, which
00:26:36
ranks every starting
00:26:39
pitcher based on their quality.
00:26:41
Um, so I'm going to take a look to see
00:26:43
where deGrom falls. So the way our
00:26:47
metric works is it takes the, um, the
00:26:50
geometric average of the rank in your
00:26:53
peak and your... Based on what?
00:26:56
>> So your rank in... so the Grid WAR... so our,
00:27:00
it's a long metric, so you have an
00:27:02
annual Grid WAR metric and you have your
00:27:04
career Grid WAR metric, um, so that's
00:27:08
how we calculate it. So um and so the so
00:27:11
if you want to just look our metric puts
00:27:13
Greg Maddux at number one. Can you just
00:27:14
speak by the way can you just just for
00:27:16
one second can you say to people the
00:27:18
advantage of taking a geometric mean as
00:27:21
opposed to an arithmetic mean here like
00:27:23
just take this why don't you just take
00:27:24
the simple average
00:27:26
>> Yeah, you could do that. The advantage of
00:27:27
doing that is... it doesn't, if you have, um...
00:27:30
so the problem with that is that, let's
00:27:32
say, take Sandy Koufax. Sandy Koufax is one
00:27:34
in peak, um, peak, uh, rank. He's number one,
00:27:40
um, and, uh, he is 48 in career. So if you
00:27:43
just took the average, you're not
00:27:45
giving enough quality, enough kick
00:27:48
to that number one. And so, in other
00:27:50
words, essentially it's a
00:27:52
decaying term.
00:27:53
>> And the geometric mean puts more kick
00:27:55
into peak performance.
00:27:56
>> Wait. So the geometric mean...
00:27:58
right. So it basically says
00:28:00
that if you're really, really kind of
00:28:02
low, you're just low. And so
00:28:04
if you're 48 or 100, it's not that
00:28:06
different, because it's the square
00:28:09
root metric. Um, and it
00:28:11
really values a high value, and it
00:28:13
doesn't destroy you with an
00:28:15
outlier on one of them. So, it's a
00:28:17
little bit, uh, I think it has slightly
00:28:19
better properties. Um, and by the way,
00:28:21
it's extremely efficient.
00:28:24
Everybody's in until you get to a point
00:28:26
where everybody's out. So, it's, uh...
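A rough sketch of the arithmetic-versus-geometric point, using the Koufax ranks quoted above (1 in peak, 48 in career). The function names and the comparison pitcher ranked 20th on both lists are made up for illustration; this is not the actual Grid WAR code, just the two averages applied to a pair of ranks.

```python
import math

def arithmetic_mean_rank(peak_rank, career_rank):
    """Simple average of the two ranks (lower is better)."""
    return (peak_rank + career_rank) / 2

def geometric_mean_rank(peak_rank, career_rank):
    """Geometric mean of the two ranks: the square root of their product."""
    return math.sqrt(peak_rank * career_rank)

# Ranks quoted in the conversation for Koufax: 1 in peak, 48 in career.
koufax = (1, 48)
# Hypothetical pitcher who is 20th on both lists (illustration only).
steady = (20, 20)

for name, (peak, career) in [("Koufax", koufax), ("steady pitcher", steady)]:
    print(name,
          "arithmetic:", arithmetic_mean_rank(peak, career),
          "geometric:", round(geometric_mean_rank(peak, career), 2))
```

With the simple average, Koufax (24.5) lands behind the hypothetical pitcher (20). With the geometric mean (the square root of 48, about 6.9) he lands well ahead, which is the extra "kick" for a number-one peak. The same square root is why a career rank of 48 versus 100 moves the score only from about 6.9 to 10 when the peak rank is 1.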
00:28:29
>> you're telling me it has it has
00:28:30
literally a perfect set.
00:28:32
>> No, it's not perfect.
00:28:33
>> No, it has perfect. It has it has one
00:28:35
person, which we've talked about
00:28:37
repeatedly, who's not in it.
00:28:38
>> Not Kevin Brown. It's Kevin Brown. He's
00:28:40
the only exception. He's s
00:28:42
>> You're telling Oh, wait. You're telling
00:28:43
me I just want to be clear for our
00:28:44
listeners here on Wharton, uh, Wharton
00:28:46
Moneyball: if I rank-order
00:28:49
pitchers by your metric, there literally
00:28:52
is a line I can cut where everybody
00:28:55
except for Kevin Brown is in and
00:28:56
everyone below that number is
00:28:58
>> not everyone. No, no, no, no, not every.
00:29:00
There's a couple exceptions. Um, but it
00:29:02
pretty much it goes from almost
00:29:03
everybody above except for Kevin Brown
00:29:06
and almost everybody below is not, but
00:29:08
there are a couple exceptions. So, Phil
00:29:09
Niekro is, uh, is
00:29:11
>> and Catfish Hunter, they're not
00:29:14
exactly at the border. Curt Schilling is
00:29:16
not I have to exclude those guys, those
00:29:18
guys who didn't get in because of
00:29:20
reputation. But you go down and you get
00:29:22
every single person all the way down the
00:29:24
list, skipping over Kevin Brown.
00:29:27
You get Max Scherzer, CC Sabathia,
00:29:29
these guys are in now. Scherzer's in.
00:29:31
And then the one borderline
00:29:33
case, this is my favorite one, the
00:29:35
actual border one, the guy who sits on
00:29:36
the border who's not in is Dave Stieb.
00:29:39
>> Well, I know you've talked about Dave
00:29:41
Stieb so many times. Look, you've come
00:29:44
to the conclusion that Kevin Brown is
00:29:45
more deserving than Dave Stieb. I
00:29:47
>> He's more deserving. Yes.
00:29:48
>> Okay.
00:29:48
>> Yeah.
00:29:49
>> So, is there a a planet that we live on
00:29:52
where this veterans committee or
00:29:54
whatever they call current era committee
00:29:56
like Kevin Brown or Dave Stieb gets in?
00:29:59
>> Absolutely. It really depends on how
00:30:01
analytics-focused they are, right?
00:30:03
Because the case for Kevin Brown and
00:30:05
for Dave rests on having extraordinarily
00:30:09
good peak performance and nice long
00:30:11
careers. Their peak performances,
00:30:13
for both of them, are majorly undervalued,
00:30:16
and they're undervalued because of win
00:30:18
loss records.
00:30:20
Well, particularly with Kevin Brown,
00:30:21
these years he pitched for the
00:30:22
Padres. He was a .500, slightly better
00:30:26
than .500 pitcher, but was the dominant
00:30:28
pitcher. Let's make you make a
00:30:30
prediction now. So, Eric Bradlo and his
00:30:32
three sons, I'm at Cooperstown, like I am.
00:30:35
It's July whatever 24th or 25th next
00:30:37
year.
00:30:38
>> You're telling me I'm going to see
00:30:40
Carlos Beltran.
00:30:42
>> You're telling me you think I'm going to
00:30:43
see Andruw Jones
00:30:45
>> mentally. You're saying and I might see
00:30:48
Jeff Kent and I might see Gary
00:30:50
Sheffield.
00:30:51
>> That's right. That's about it.
00:30:53
>> Okay.
00:30:54
>> Well, I found King Felix. By the way,
00:30:56
King Felix is ranked number 42, uh, and
00:30:59
there's... there's a
00:31:02
whole lot of people ahead of him who
00:31:03
aren't in. So,
00:31:04
>> that were not in that are not in
00:31:06
>> were not in and and I don't think he's
00:31:08
going to and if you just look purely
00:31:09
based on the on the quality, I don't
00:31:11
think you're you're seeing it.
00:31:13
>> I see. I see. But uh but you know he had
00:31:16
a lot of he had a lot people sort of
00:31:17
loved him so you never know.
00:31:19
>> I got to admit if you tell me I'm going
00:31:20
to Cooperstown to see Beltran, Jones, Kent,
00:31:24
and Sheffield, I'm thinking, you know, the...
00:31:26
you know I have tiers of Hall of Fame.
00:31:28
The sum of those four is 12. I don't
00:31:30
care what you tell me. Like there's no
00:31:33
question.
00:31:33
>> No I'm just saying that doesn't excite
00:31:35
me that much because
00:31:36
>> No, it doesn't.
00:31:36
>> I mean it's not it's not that exciting.
00:31:38
Well Audi, we've talked about NBA. We've
00:31:41
talked about some NFL and rankings.
00:31:42
We've talked about some MLB. I'm glad to
00:31:44
hear again about your metric with Ryan
00:31:46
Brill. Uh, this has been the first half of
00:31:48
Wharton Moneyball here on the Wharton
00:31:50
podcast network. Stay with us after the
00:31:52
break and we're going to talk to Audi
00:31:53
about, uh, him and his students and the
00:31:55
research they're doing. So come and join
00:31:57
us after the break.
00:31:59
Welcome back to the second half of our
00:32:01
show here, Wharton Moneyball, the
00:32:02
Wharton podcast network. This is Eric
00:32:04
Bradlo, professor of marketing,
00:32:06
statistics, and data science. I'm here
00:32:07
with my colleague, co-author,
00:32:09
and friend Audi Winer, professor of
00:32:11
statistics and data science. Some
00:32:13
combination of the two of us, Cade Massey,
00:32:15
and Shane Jensen are here every week on
00:32:18
Wharton Moneyball. And as I mentioned at
00:32:19
the beginning of the show, one of the
00:32:21
advantages, although we love it when
00:32:23
everyone's here, one of the advantages
00:32:24
when it's just Audi and me is I
00:32:25
basically get to interview him. And so
00:32:27
we I just talked to him about some
00:32:28
statistical stuff having to do with the
00:32:30
MLB, NFL, NBA in the first half of the
00:32:33
show. Now I thought um let's pull back
00:32:36
the curtain on what Audi has spent, I don't
00:32:38
know at least the last 10 plus years
00:32:40
maybe 15 plus years building which I
00:32:42
consider the greatest
00:32:45
undergraduate, master's, MBA, PhD-level
00:32:49
research opportunity for people that
00:32:51
want to apply statistics machine
00:32:53
learning now AI data science more
00:32:56
broadly, uh, to sports research. Um, so Audi,
00:32:59
um, I have no guide for you except, you
00:33:02
know why don't you start by telling us
00:33:04
one of the projects that you're working
00:33:06
on now or recently that excited you and
00:33:08
we'll get to as many as we can in the
00:33:09
second half of the show.
00:33:10
>> So that's a great opening. Just to give
00:33:13
the listeners a little bit of
00:33:14
introduction: we have, uh, we have many,
00:33:16
many students who are doing research in
00:33:17
statistics, machine learning,
00:33:19
computer science, mathematics, and
00:33:21
their research is in some sort of sports
00:33:23
application, which is really, um, it's
00:33:27
exciting for them because
00:33:28
they're close to the edge, right? And
00:33:30
one of the ways we
00:33:31
introduce research in our seminar
00:33:33
is: think about
00:33:35
something that's caused
00:33:36
you to think. Think about a problem.
00:33:40
What are you
00:33:42
curious about? And then try to get
00:33:44
the data. So we have lots and lots of
00:33:46
projects going on. Some of them
00:33:48
have already been published.
00:33:50
Um, some of them have won prizes, across
00:33:52
different sports. So,
00:33:54
surprisingly, we don't have any
00:33:55
basketball, but we have rugby, a few
00:33:58
in football, we have tennis, um, we have,
00:34:02
um, obviously we have baseball, and we
00:34:04
have, um... Question, let me ask a
00:34:07
question related to that. So, one of the
00:34:09
things in the, you know, in our open
00:34:12
source publishing that we have now,
00:34:14
let's imagine there's one of our
00:34:15
listeners on Wharton Moneyball that
00:34:17
wanted to either replicate or uh extend
00:34:21
some of the work that you've done with
00:34:22
your students. Like, are these data sets
00:34:24
public? And number two, if you publish
00:34:27
the paper, do you also publish code and
00:34:29
data with it? And like, you know, which
00:34:31
some journals require now, some don't.
00:34:33
like how would one of our listeners if
00:34:34
they wanted to say I want to see how my
00:34:36
skills are. I want to replicate Winer
00:34:38
and Brill, or somebody else, how would
00:34:40
they do that?
00:34:40
>> So um depends on the sport. So
00:34:42
everything we've gotten from football is
00:34:45
either public through event data, uh, Next
00:34:48
Generation Stats, just public event data,
00:34:51
or from the NFL Big Data Bowl. So the
00:34:53
tracking data we've used that if it's
00:34:56
soccer there is public event data but
00:34:59
our we have two soccer projects which
00:35:01
I'll happily talk about in a moment that
00:35:03
comes from our partnership with PFF, so
00:35:05
Pro Football Focus FC, um, you know,
00:35:08
their soccer kind of arm. They have
00:35:11
collected their own tracking data using
00:35:13
video and they've let us have one season
00:35:16
and now we're about to sign on for a
00:35:18
second season. I'm not sure we'll be
00:35:20
able to make that data public, but we
00:35:21
will make code and everything related to
00:35:23
the analysis um public. So, actually
00:35:25
that's a good place for us to start
00:35:26
because we just submitted a paper. So,
00:35:29
this is probably one of our highest
00:35:30
level teams because it includes um
00:35:33
Jonathan Pipping, who's a second-year PhD
00:35:35
student. It includes Tianshu, who's a, um,
00:35:38
uh who's a master's in data science and
00:35:40
he's finishing his second year applying
00:35:42
to PhD programs right now. And it also
00:35:44
includes Paul Sabin who's our our senior
00:35:46
fellow. and they've worked on this uh
00:35:49
this this soccer tracking data to do
00:35:52
something which is uh which they call XG
00:35:54
plus. Um so most people have heard of XG
00:35:57
in soccer. XG is the expected goals
00:36:00
across a game and so that way you get
00:36:03
credit for goals that aren't goals. So
00:36:05
every shot has a certain
00:36:07
probability.
00:36:08
>> Is XG typically computed at the player
00:36:10
level or the team level? And well, it's
00:36:12
computed at the shot level, and you
00:36:14
can associate them with players if you
00:36:16
like, but this is reported at the team
00:36:18
level. So, it's usually reported at the
00:36:19
game level
00:36:20
>> and it's usually an underlying metric.
00:36:22
So, if your team... you
00:36:25
can lose the game in xG but win the
00:36:28
game in goals
00:36:29
>> and that happens a lot. And so ex
00:36:31
>> would you since I'm trying to play as
00:36:33
host here would you relate this somewhat
00:36:36
to what in the first half of the show
00:36:37
you talked about about the Kansas City
00:36:39
Chiefs like maybe like there's some
00:36:41
underlying metric. Here it's XG
00:36:44
plus; before it was, you know, average
00:36:45
length of drive. You know, maybe that's a
00:36:49
better... there's lots of things, as we all
00:36:50
know, that are better indicators
00:36:51
of strength than the actual
00:36:54
randomness that happens with outcomes
00:36:56
and wins and losses
00:36:58
>> Absolutely, it's exactly a parallel, and this is a
00:37:00
huge leap up in soccer evaluation
00:37:03
because there are very very few goals in
00:37:06
soccer, and a game can go 1-0 or 0-0 with
00:37:08
a shootout and what are you going to do
00:37:10
with that? But xG can accumulate,
00:37:13
and in fact an individual player can
00:37:14
have substantial xG. Now usually in
00:37:17
sports we compare actual goals to XG and
00:37:21
we use that to attribute some special
00:37:23
quality um to the player as if they
00:37:26
consistently do that. One of the things
00:37:27
that we've learned in soccer is that
00:37:30
with maybe well certainly with one
00:37:31
exception, maybe a couple others, very
00:37:34
very few people seem to outscore their
00:37:37
xG. Their xG is just whatever it is. You
00:37:41
create, by the way, you create your own
00:37:43
XG. So that you get responsibility for
00:37:45
that.
00:37:45
>> By the way, give our listeners a
00:37:48
sense of how that's computed? Is it I
00:37:51
would imagine you'll just tell me if I'm
00:37:52
wrong. I would imagine um where you're
00:37:55
shooting from is part of this, right?
00:37:57
>> The distance to the goal. Yep.
00:37:59
>> Yep. I would imagine possibly. I don't
00:38:00
know. Is the angle
00:38:02
>> angle? Yep.
00:38:03
>> Okay.
00:38:04
>> Absolutely. It's often represented
00:38:06
by how much of the goal is visible,
00:38:08
right? So you take your angle and
00:38:09
they can calculate how
00:38:12
much of the 360 would be the goal,
00:38:14
right?
00:38:14
>> The location of the defenders.
00:38:17
>> Absolutely. They have a metric for that.
00:38:18
I mean, it's complicated and it's and
00:38:20
these are often proprietary, although we
00:38:22
built our own, um, how you actually
00:38:24
take the defenders in to create that.
00:38:26
But basically, often what they do is,
00:38:28
essentially, how much of the area is
00:38:30
open, right? So
00:38:31
>> yeah, you can literally physically say
00:38:34
if each player has a certain radius and
00:38:36
you have a certain angle and distance,
00:38:38
there's a certain amount of openness and
00:38:39
there you go. You could
00:38:41
>> So XG metrics can get very complicated
00:38:43
because you could take a look at, you
00:38:45
know, what happened immediately before
00:38:46
the shot and integrate that in as well.
00:38:48
Like uh are you almost like shooting off
00:38:50
of the dribble? Are you shooting off of
00:38:52
an immediate pass? Um, the z-axis, the height
00:38:54
of the ball, could be in it. Um, but
00:38:57
usually they are not. Primary drivers are
00:39:01
defenders, angle, and distance, and
00:39:03
distance is
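To make the distance and angle inputs concrete, here is a minimal sketch. It is not the model discussed in the episode or any vendor's xG model: the coordinate convention, the function names, and the logistic coefficients are all assumptions chosen for illustration, and defender locations are left out entirely. The one fixed fact is the standard goal width of 7.32 m.

```python
import math

GOAL_WIDTH_M = 7.32  # standard goal width in meters

def distance_to_goal(x, y):
    """Distance from the shot location to the center of the goal line.

    Assumed convention: x is the distance out from the goal line (x > 0),
    y is the sideways offset from the center of the goal.
    """
    return math.hypot(x, y)

def visible_goal_angle(x, y):
    """Angle in radians between the two posts as seen from (x, y)."""
    half = GOAL_WIDTH_M / 2
    # Angle between the vectors to the posts at (0, +half) and (0, -half).
    return math.atan2(GOAL_WIDTH_M * x, x * x + y * y - half * half)

def toy_xg(x, y, intercept=-1.0, w_distance=-0.12, w_angle=1.5):
    """Toy logistic xG; the three coefficients are invented, not fitted."""
    z = (intercept
         + w_distance * distance_to_goal(x, y)
         + w_angle * visible_goal_angle(x, y))
    return 1.0 / (1.0 + math.exp(-z))

# A central shot from the penalty spot versus one from a wide position.
print(toy_xg(11.0, 0.0), toy_xg(11.0, 15.0))
```

A real model would be fit to shot data and would add a defender term, for example how much of that visible angle is blocked, which is the "how much of the area is open" idea described above.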
00:39:04
>> look all of our listeners are saying
00:39:05
Eric when are you going to ask him
00:39:06
what's the secret sauce? So what's XG
00:39:08
plus you guys? I mean it's got to be
00:39:10
better than just XG. You call it XG plus
00:39:13
otherwise.
00:39:14
>> All right. So, just to just to close the
00:39:16
book on XG, uh Messi is the only player
00:39:19
who consistently outscores his xG, and by a
00:39:22
lot. Um he's just you can see that just
00:39:25
immediately. And in fact, this past
00:39:27
NESSIS, or New England sports, um, uh,
00:39:30
conference at Harvard, he just stuck out
00:39:32
in someone's analysis and that that
00:39:34
everybody understands. Um okay, so what
00:39:36
is XG+? So what our
00:39:39
team did is they asked the simple
00:39:40
question XG is only calculated on shots
00:39:43
taken
00:39:45
and there are lots of times where a shot
00:39:47
can be taken but, for a variety of reasons,
00:39:50
isn't, and that isn't accounted
00:39:54
for in XG.
00:39:55
>> Very interesting. So what they did is
00:39:57
they calculated, at any instant, what
00:40:00
is the probability of the shot being
00:40:01
taken, and then if that shot
00:40:05
were taken at that point, that would
00:40:06
generate an xG. So you could actually
00:40:09
essentially integrate over
00:40:11
continuous time this quantity,
00:40:15
which is your... and in most
00:40:17
instances the probability of a shot
00:40:18
being taken is zero. So it doesn't
00:40:20
accumulate massively, but you'll end up
00:40:22
missing all this opportunity if you only
00:40:25
look at XG as opposed to
00:40:27
>> could I in XG plus could I penalize a
00:40:30
player who should have taken a shot but
00:40:33
chooses not?
00:40:33
>> Absolutely.
00:40:34
>> That's right. And so players who
00:40:37
have all this opportunity that they
00:40:39
don't take, they'll get penalized in XG
00:40:41
Plus. And players who take the
00:40:44
shots that they should take um you'll
00:40:47
not only see them in XG but they won't
00:40:49
get penalized.
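The accumulation described here can be sketched in discrete time. This is only an illustration of the idea as stated in the conversation (probability a shot is taken at each instant, times the xG if it were taken, summed over time); it is not the team's implementation, and the function names and the per-frame numbers are invented.

```python
def xg_plus(shot_prob, xg_if_shot):
    """Sum over frames of P(shot taken at frame) * xG if a shot were taken there."""
    if len(shot_prob) != len(xg_if_shot):
        raise ValueError("both sequences need one value per frame")
    return sum(p * g for p, g in zip(shot_prob, xg_if_shot))

def observed_xg(shot_taken, xg_if_shot):
    """Ordinary xG: count only the frames where a shot actually happened."""
    return sum(g for taken, g in zip(shot_taken, xg_if_shot) if taken)

# Four hypothetical frames from one possession (numbers are made up).
shot_prob = [0.0, 0.5, 0.0, 0.25]          # chance a shot is taken at each frame
xg_if_shot = [0.9, 0.2, 0.1, 0.4]          # xG if a shot were taken at each frame
shot_taken = [False, False, False, False]  # the player never shoots

print(observed_xg(shot_taken, xg_if_shot))  # no shot taken, so nothing is credited
print(xg_plus(shot_prob, xg_if_shot))       # the opportunity still accumulates
```

In this toy possession ordinary xG records nothing because no shot happened, while the summed quantity stays positive. Comparing the two is one way to read the "had the opportunity but didn't shoot" case mentioned above; how the actual metric scores that gap is not specified here.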
00:40:50
>> This is now the reason I love this
00:40:53
besides it's interesting now you could
00:40:56
have two different qualities of players.
00:40:57
It's almost like false negative false
00:40:59
positive. Some players are false
00:41:02
negatives. They should be shooting and
00:41:04
they don't. And some players are false
00:41:06
positive. They shouldn't be shooting and
00:41:07
they do. So you could actually decompose
00:41:10
someone's total XG plus I assume into
00:41:14
almost like false positives and
00:41:16
negatives.
00:41:16
>> You can do a lot with a metric and it
00:41:18
has a lot of the features that you'd
00:41:20
like to see in a metric which is it
00:41:21
predicts out of sample well, it
00:41:24
correlates with a lot, it passes the
00:41:25
sniff test. The great players show up. Uh,
00:41:28
it produces lots of
00:41:30
interesting results, and they're just
00:41:31
getting started with it. And, you
00:41:32
know, I didn't actually collaborate on
00:41:34
the research at all. This is one of the
00:41:35
few projects that I had basically
00:41:38
nothing to do with. They just did
00:41:40
it and presented. But that large measure
00:41:42
is
00:41:43
As you know, there are a
00:41:45
lot of people that listen to us that are
00:41:46
academics like you and me. A lot of
00:41:48
people in practice. Let me let me take
00:41:50
it from each perspective.
00:41:52
>> First, where do Jonathan, Paul... I forget
00:41:56
the third person's name on the
00:41:57
project.
00:41:59
>> Yeah. Where do they try to publish this?
00:42:02
>> All right. So that's interesting. Um so
00:42:04
there's a whole bunch
00:42:06
of sports analytics journals,
00:42:08
>> and there are also statistics journals, and
00:42:10
sometimes you can publish
00:42:12
in sort of operations research
00:42:14
journals or econ journals or math
00:42:15
journals. Um so there are um the obvious
00:42:19
candidate for a lot of research like
00:42:21
this would be the Journal of
00:42:22
Quantitative Analysis in
00:42:24
Sports, which is probably the most
00:42:26
prestigious sports-oriented statistics
00:42:28
journal. Um, there are
00:42:30
others. There's a journal of sports
00:42:31
statistics and there's there's a whole
00:42:33
bunch of sports analytics journals. Um
00:42:35
it all depends on how substantive
00:42:37
your statistical contribution is, um,
00:42:40
and what you're doing
00:42:42
with that. So most of our, um... and so you
00:42:45
could also, you can publish in
00:42:46
the Annals of Applied Statistics, in
00:42:48
JASA. Um, these are very, very,
00:42:51
uh, um, elite journals in statistics, if
00:42:54
you're making a methodological
00:42:55
contribution to statistics as well as
00:42:58
just doing a really important insight
00:42:59
into sports. One of the things that we
00:43:01
always do, in fact, if you want to read
00:43:02
about XG+ without having to wait for the
00:43:05
full publication of the manuscript, we
00:43:07
have what's called uh Wharton Research
00:43:08
Notes on our WSABI web page,
00:43:11
which has a popularization of the
00:43:13
content, and they explain what
00:43:15
they've done without getting into the
00:43:17
nitty-gritty of the maths of the method.
00:43:18
I also imagine I don't know I assume
00:43:20
they may post it like on SSRN so there
00:43:22
may actually
00:43:23
>> Yeah, so the arXiv is what we've been
00:43:25
using. That's usually the math
00:43:27
place for that. What do you
00:43:30
think, SSRN or arXiv?
00:43:31
>> I post stuff on SSRN, but either one. As
00:43:34
long you know I'm always happy to post
00:43:36
you know stuff through the review
00:43:38
process and post it there and you know
00:43:39
>> yeah so you do SSRN um so we have lots
00:43:42
of others um I um so actually
00:43:43
>> tell me about another project
00:43:45
>> yeah so why don't we talk about one that
00:43:46
I could do really quickly um one of our
00:43:48
one of our undergraduates, um, is
00:43:51
working on a rugby paper. So, uh,
00:43:55
this is really interesting. So rugby is
00:43:57
one of the most understudied, um,
00:44:00
sports. So his name is Kenny Watts and
00:44:02
he's working with Jonathan Pipping. Um,
00:44:04
and so you think rugby is like football,
00:44:06
right? There should be a lot of
00:44:07
analytics, right? Rugby, football:
00:44:10
football has analytics, rugby has
00:44:11
analytics. Well, guess what? There's
00:44:13
nothing. So if I asked you,
00:44:16
what do you think is the first and most
00:44:18
important analytics paper in statistics,
00:44:21
going back around 25 years, um, in football,
00:44:23
what do you think it was
00:44:25
>> wait what's the most in in football
00:44:28
>> not in football what question in
00:44:30
football was first analyzed with
00:44:32
analytics that was really influential
00:44:35
>> um the question
00:44:38
>> yeah the paper I know of
00:44:40
>> is
00:44:42
does the kicker suck or was it the
00:44:44
distance.
00:44:45
>> That's one. You know, that's not the one
00:44:47
I was thinking of. Um I
00:44:50
>> would you go for it on fourth and one?
00:44:52
>> That's the one, fourth down. Right. So, uh, in
00:44:54
fact, it's called Do Firms Maximize? and
00:44:56
it's by Romer, um, and it's written as
00:44:59
an economics paper, but it does two
00:45:01
things. It calculates an expected points
00:45:03
model for football: given down and
00:45:05
distance and yards to go, um, how
00:45:08
many points are expected on that drive?
00:45:10
and he uses that to decide whether or
00:45:12
not teams are going for it at the right
00:45:15
rate on fourth down and he comes to the
00:45:16
overwhelming conclusion that they don't
00:45:18
even remotely. So what Kenny and
00:45:20
Jonathan did was they said can we do
00:45:22
that for for rugby?
00:45:24
>> So explain to me, I don't even know
00:45:26
enough about the rules of rugby and
00:45:28
stuff like if you don't get a certain
00:45:30
amount or don't score you have to give
00:45:32
the ball to the other team. I always
00:45:33
thought as as long as you guys keep
00:45:34
possessing the ball and you're going
00:45:37
forward on the field you keep the ball.
00:45:39
>> All right. So the question that they
00:45:40
asked, which I never heard of because I
00:45:42
don't know rugby, was that if a if some
00:45:44
sort of penalty happens, and I guess
00:45:46
they happen a lot.
00:45:47
>> Yeah.
00:45:47
>> The offensive team, the
00:45:49
team that has the ball, that's subject to
00:45:50
this penalty,
00:45:52
>> they get to choose. They can try to kick
00:45:54
what amounts to a field goal,
00:45:57
>> which I think is worth three points.
00:45:58
>> It is.
00:45:59
>> Or they can kick it out of bounds and
00:46:02
then they get the ball
00:46:04
essentially about 20 yards downfield
00:46:06
wherever they kicked it out of bounds.
00:46:08
And it's kind of like going for it
00:46:10
because then they can then score a
00:46:11
touchdown which is worth six and then
00:46:13
and then they have an extra point or two
00:46:15
um in in that formulation.
00:46:17
>> Just be clear, your team has a penalty.
00:46:21
I get an opportunity to either kick it
00:46:23
for three or kick it down field and then
00:46:27
of course I may end up with zero. But of
00:46:29
course I also have the ball farther down
00:46:30
the field. So I'm more likely to
00:46:31
>> get six or seven.
00:46:33
>> That's the same thing.
00:46:35
>> It's very similar. Seems
00:46:36
>> exactly the same thing. And the question
00:46:38
became why had no one done this before?
00:46:40
And the answer is data hasn't been
00:46:42
available and the sport's far behind.
00:46:44
And so our team... Kenny, uh,
00:46:47
knows rugby. Um, and, uh, he said, let's
00:46:50
do it. And they built an expected points
00:46:51
model, um, very similar to the way Romer
00:46:55
built his first one, with linear
00:46:56
regression, taking every play as
00:47:00
having a set of covariates and, um,
00:47:02
the outcome, which are highly correlated,
00:47:04
and you have to deal with that, and
00:47:06
calculated an expected points model, and
00:47:08
then he basically calculated what is the
00:47:10
optimal decision. And it turns out that, just
00:47:13
as we've seen in almost every sport, they
00:47:15
are kicking too many field goals.
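The decision being compared can be written as a small expected-value check. This is a deliberately simplified sketch, not the students' model: the function name and the example inputs are hypothetical, the three-point value for the kick is the figure mentioned in the conversation, and the sketch ignores what happens after a missed kick or a restart.

```python
def best_penalty_option(p_kick_good, ep_after_touch, kick_points=3.0):
    """Compare kicking at goal with kicking to touch after a penalty.

    p_kick_good:    probability the kick at goal succeeds (assumed input)
    ep_after_touch: expected points from the field position gained by
                    kicking out of bounds, as an expected points model
                    would estimate it (assumed input)
    """
    ev_kick = p_kick_good * kick_points
    ev_touch = ep_after_touch
    if ev_kick >= ev_touch:
        return "kick at goal", ev_kick
    return "kick to touch", ev_touch

# Hypothetical situations; every number here is made up for illustration.
print(best_penalty_option(p_kick_good=0.6, ep_after_touch=2.4))
print(best_penalty_option(p_kick_good=0.9, ep_after_touch=2.0))
```

The expected points model is what supplies `ep_after_touch` for each field position. "Kicking too many field goals" then means teams often choose the kick in situations where this comparison favors going to touch.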
00:47:17
>> That's what I assumed you were going to
00:47:18
say.
00:47:19
>> Yep. There it is. So, that was just...
00:47:22
it's a wonderful example because it
00:47:24
combines, um, a really important question
00:47:26
that people want to know the answer to.
00:47:28
>> Um we're talking about undergraduates
00:47:29
here. So, they're trying to flesh out
00:47:32
techniques that are pretty standard at
00:47:33
this point and they apply them in a new
00:47:35
situation and they tell us something
00:47:36
that we almost expected to happen, but
00:47:39
now it actually quantifies it.
00:47:42
Can you give us a sense of like how many
00:47:45
points or how much win probability a
00:47:48
team is giving up? Like I always like
00:47:50
you know since I try to be the effect
00:47:51
size guy here.
00:47:53
>> So you know teams aren't optimal. Okay.
00:47:56
But are they like you've pointed this
00:47:57
out even with your work with Ryan Brill,
00:47:59
like, people don't go for it on fourth, not
00:48:02
necessarily always the right time but
00:48:04
sometimes it's not irrational depending
00:48:07
like you could come up with a risk
00:48:08
aversion story or you come up with an
00:48:10
uncertainty story that explains it. Is
00:48:13
it that you know sure they're kicking
00:48:15
too many field goals but they're
00:48:18
basically not losing much win
00:48:20
probability or maybe they are. I mean
00:48:21
how big an effect are we talking about?
00:48:23
that's got to be something on their
00:48:25
docket. One of the things, when I
00:48:26
deal with students is that I teach them
00:48:29
pretty aggressively that it's important
00:48:31
to finish and not do everything. And
00:48:35
>> oh, I thought you meant finish by a
00:48:36
different thing. So, let me I'm going to
00:48:37
just This is not a story.
00:48:39
>> What I mean is that they have
00:48:40
to... they need to finish. You know
00:48:44
this. It's never done. Right.
00:48:45
>> Right. It's never done.
00:48:46
>> Never done everything. No, but the story
00:48:48
I was going to point out is why, you
00:48:50
know, when I while I publish obviously
00:48:52
in both statistics and marketing, in
00:48:54
some ways marketing's just different,
00:48:56
not harder or easier, just different
00:48:58
because let's say you built this
00:49:00
statistical model for something. Let's
00:49:03
say kicking field goals in rugby or
00:49:05
kicking it. Then someone would you could
00:49:08
never publish that. You'd have to go to
00:49:11
the next step. And so what does this
00:49:13
mean for firm performance? What does
00:49:14
this mean for winning? like they will
00:49:17
never let you like I I thought you meant
00:49:19
finishing by and so tell me what this
00:49:22
means as opposed to you have to draw the
00:49:24
line somewhere and say you've built a
00:49:26
good predictive model you've answered an
00:49:28
interesting substantive question you're
00:49:30
not yet you know you could extend it and
00:49:32
say oh what does it affect win
00:49:33
probability but at some point the
00:49:35
project just ends if you're interested
00:49:36
in that answer it again
00:49:38
>> that's a great question because actually
00:49:39
the the getting an effect size I think
00:49:41
is something that needs to be done
00:49:42
before they submit that that's too
00:49:46
important a question. So they built a
00:49:49
model, they have a a conclusion, they're
00:49:51
not they're not aggressive enough, and
00:49:52
now you have to kind of turn it into how
00:49:54
many points you're giving up or
00:49:56
something and which probably then turns
00:49:58
into wins pretty easily. So that piece
00:50:00
is is I think is important. But you
00:50:02
know, these models are going to be
00:50:04
incomplete. They're models, right?
00:50:06
You're not going to be bringing every
00:50:07
factor in. And um you could be look
00:50:09
doing looking at win probability models
00:50:11
instead of expected points models. And
00:50:13
there's many ways to do these things.
00:50:15
How you going to be treating the the the
00:50:17
how you dealing with the independence?
00:50:18
Are you how you dealing with the
00:50:19
standard errors? There's so many deep
00:50:21
questions that you could ask which
00:50:23
eventually should get get answered but
00:50:25
can't do them all in the first round.
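One way to turn "they kick too often" into an effect size is sketched below: for each decision, take the expected points of the better option minus the expected points of the option the team actually chose, then average over decisions. The `Decision` record, the function name, and every number here are hypothetical, made up for illustration; they are not the students' model or data.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    ep_kick: float    # hypothetical model estimate: expected points if the team kicks
    ep_go: float      # hypothetical model estimate: expected points if the team goes for it
    chose_kick: bool  # what the team actually did


def points_given_up(decisions):
    """Average expected points lost per decision, relative to the better option."""
    if not decisions:
        return 0.0
    total = 0.0
    for d in decisions:
        best = max(d.ep_kick, d.ep_go)
        chosen = d.ep_kick if d.chose_kick else d.ep_go
        total += best - chosen
    return total / len(decisions)


# Made-up numbers, for illustration only.
sample = [
    Decision(ep_kick=2.0, ep_go=3.0, chose_kick=True),   # kicked when going was better
    Decision(ep_kick=2.5, ep_go=2.0, chose_kick=True),   # kicked, and kicking was better
    Decision(ep_kick=1.5, ep_go=2.5, chose_kick=False),  # went for it, and that was better
]
print(points_given_up(sample))
```

The same bookkeeping would work with win probability in place of expected points, which is the alternative mentioned in the conversation.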
00:50:26
>> By the way, Audi, I would have lost
00:50:27
about a million dollars. And let me tell
00:50:29
you why.
00:50:31
If you had told me that you and I were
00:50:33
going to talk about
00:50:36
related research, no. No. The first two
00:50:38
you would bring up would be soccer and
00:50:40
rugby. I'd be like, I'll give you a
00:50:44
million to one odds. So maybe you could
00:50:47
tell us about one that deals with the
00:50:49
students that I have a little more
00:50:50
knowledge. No, no, I'm not saying
00:50:52
there's anything
00:50:54
>> but
00:50:56
we have one in tennis which you I'll
00:50:57
just
00:50:58
>> I want to I love tennis. So get to
00:50:59
tennis. But before I do that, I want to
00:51:01
point out to our listeners this what
00:51:02
Audi's pointing out is important which
00:51:04
is I've always said I joke about it with
00:51:07
my friends. There's only one thing we as
00:51:09
academics do really well and that's we
00:51:13
recognize the isomorphism between
00:51:15
problems. So if you're not a rugby fan
00:51:18
and you're not a soccer fan, who cares?
00:51:20
What Audi's talking about is applicable
00:51:23
to so many other sports or so many other
00:51:25
problems. This is why sports is
00:51:27
wonderful as a testing ground for
00:51:29
statistical methods and learning because
00:51:32
it's, you know, you can change the name
00:51:35
from, you know, kicking field goals and
00:51:38
scoring touchdowns in one sport to
00:51:40
another or, you know, you have two
00:51:42
options A and B and one is more certain
00:51:44
but has lower high end. You know, that's
00:51:46
what I love about sports is that you can
00:51:49
learn a lot about business and business
00:51:51
decision making through sports type
00:51:53
models.
00:51:54
>> All right. So you're going to tell us
00:51:55
about tennis.
00:51:56
>> Uh so so tennis actually um this has
00:51:58
came up. These are this is a two two
00:52:00
students um did this um they uh they put
00:52:03
this together um and it's really it's um
00:52:06
and they they presented it and at
00:52:08
Carnegie Mellon um so let me just tell
00:52:10
you who they are. It's I win Amita and
00:52:13
Audrey um Rita we call her and uh so I
00:52:16
win Rita and Audrey um they uh they
00:52:18
actually won the student competition
00:52:20
award uh for best paper at the Carnegie
00:52:22
Mellon Sports Analytics Conference.
00:52:23
Great. And it was uh uh on tennis and
00:52:26
the feature of tennis was is um they
00:52:28
wanted to rank serves and the problem
00:52:30
the the problem with most ser service
00:52:32
ranks is that they don't actually
00:52:36
measure the service quality as distinct
00:52:39
from the player quality, right? Because
00:52:41
you just look how many service points
00:52:42
you're winning. Now you can look at
00:52:44
aces. Of course that's only one feature
00:52:46
of service. The other feature is getting
00:52:48
the the opponent off balance enough to
00:52:50
win the the the point fairly quickly.
00:52:52
but other aspects of your play
00:52:55
kind of get involved. And so it's
00:52:57
essentially a decomposition of the um
00:52:59
the ser it takes you if you're winning a
00:53:02
lot of service points, it breaks it down
00:53:03
and to how much of that's due to the
00:53:05
quality of your serve and how much is
00:53:06
due to the quality of your player. Um
00:53:08
and it gets a
00:53:09
>> person serving or the opponent
00:53:12
>> uh both of them and adjust for opponent
00:53:13
as well. That was a key key feature. You
00:53:15
got you got for opponent.
00:53:17
>> So does it have a ranking of player
00:53:20
>> rank serves and it and it and it's
00:53:22
players
00:53:23
>> uh current I think it has some of that
00:53:24
current players I I'll see in there I
00:53:26
can look through their players and see
00:53:28
who they rank as excellent but I'm sure
00:53:30
you would enjoy that right so that's
00:53:31
>> I would I would enjoy knowing but it's
00:53:34
it is good to know because it's one of
00:53:35
those things where like simple metrics I
00:53:39
have a prediction of who the best server
00:53:41
is but it's not going to be someone if
00:53:43
if there is a table is there a table
00:53:45
should I guess somebody
00:53:46
>> uh I don't actually have it so you can
00:53:48
you can guess it for next time
00:53:50
>> I'm I'm gonna guess it's a lesser-known
00:53:53
player who is 6'10,
00:53:56
>> right?
00:53:57
>> His name's Reilly Opelka.
00:54:00
>> I'm going to guess that he and you
00:54:02
remember from our childhood Roscoe
00:54:04
Tanner.
00:54:05
>> Well, it was Roscoe Tanner. Yeah.
00:54:06
>> No, no. I'm just saying Roscoe
00:54:08
>> 50 mph. Roscoe Tanner.
00:54:10
>> Roscoe Tanner could serve. Roscoe Tanner
00:54:12
could really serve. That was the only
00:54:14
great.
00:54:17
But either way, that's a fascinating
00:54:19
project. Um, so yeah, and that and those
00:54:22
had great success and and it's funny
00:54:23
because I mean it's actually really
00:54:25
really nice to see uh these students all
00:54:27
work with me or they were all Moneyball
00:54:29
Academy TAs in part of our lab this
00:54:31
summer. Um, and that's where they did
00:54:33
their work. Um, I can finish up with a
00:54:34
few others. We have we have two
00:54:37
actually, you know, pretty good ones in
00:54:38
football.
00:54:38
>> Two more. Two more would be great.
00:54:40
>> Yeah. Well, so one one of them in
00:54:42
football. We have we have a soccer, we
00:54:43
have football, we have a baseball one
00:54:45
that's just getting started that that uh
00:54:47
we just submitted to SABR
00:54:49
>> question in the baseball one
00:54:50
>> and the baseball question actually it um
00:54:53
so it has to do with my uh historical
00:54:56
admiration of the RBI and and um so the
00:55:00
RBI statistic, you know, everyone likes
00:55:01
to make fun of because it's contextual,
00:55:03
right? It's it's so dependent on on the
00:55:06
the settings the opportunities you get.
00:55:09
So that that that just essentially is
00:55:12
just asking you to adjust for context.
00:55:16
So the question we asked was what are
00:55:18
your RBI's above expected?
00:55:22
>> That's very reasonable.
00:55:23
>> I I'd love to know like Lou Gehrig's 180
00:55:26
or
00:55:28
>> Lou Gehrig, the thing, this is what
00:55:29
you're pointing out. The greatest season
00:55:32
I've ever thought about was when Lou
00:55:34
Gehrig hit in 1927, the year Ruth hit
00:55:37
60. So, we know he came up at least 60
00:55:39
times with nobody on base.
00:55:41
>> He had 175 RBIs. I think Ruth had 160
00:55:45
something. So, how many guys were
00:55:47
goddamn left on base by the time Gehrig
00:55:50
got up there? And he still hit 175, I
00:55:53
think, is the number.
00:55:54
>> So, to me, that that's an obvious, you
00:55:56
know, you always ask for, isn't that the
00:55:57
one of the most obvious unanswered
00:55:59
questions in baseball? If you report
00:56:01
RBI, you should also be reporting RBI's
00:56:03
above expected immediately. And then you
00:56:06
can also um and this came from our
00:56:08
conversations with some of the analysts
00:56:09
at the Phillies. They want to know what
00:56:11
your RBI's above expected is after
00:56:14
controlling for the quality of player
00:56:16
that you are. So we know that right so a
00:56:19
lot of that RBI's above expected could
00:56:21
be due to the fact that you are a great
00:56:22
player. So that's that's that. So they
00:56:25
want to decompose the RBI above expected
00:56:28
into two pieces. one which is what you'd
00:56:31
expect to get given the quality that you
00:56:33
are and then the additional part which
00:56:35
was essentially be the luck or the um
00:56:38
you can call it luck or skill to drive
00:56:40
in those extra RBI or pressure
00:56:43
performance
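Written out, the two-piece decomposition described here might look like the following. The symbols are notation introduced for this note, not anything taken from the project: $R$ is a player's actual RBI total, $c$ is the context (the opportunities he faced), and $q$ is his overall quality as a hitter.

```latex
\underbrace{R - \mathbb{E}[R \mid c]}_{\text{RBIs above expected}}
= \underbrace{\mathbb{E}[R \mid c, q] - \mathbb{E}[R \mid c]}_{\text{expected given player quality}}
+ \underbrace{R - \mathbb{E}[R \mid c, q]}_{\text{remainder: luck, skill, or pressure performance}}
```

The two terms on the right sum to the left-hand side by construction, so the only modeling choices are how to estimate the two conditional expectations.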
00:56:45
expected part I can imagine a bunch of I
00:56:47
can imagine a statistical model but I
00:56:49
can also imagine just binning players
00:56:51
into quality buckets or terciles, let's
00:56:54
say and then say of the players in this
00:56:57
tercile based on some metric or
00:56:59
something or some per, you know, skill
00:57:01
estimate. Let's look at how many RBIs
00:57:04
above expected they have for their
00:57:06
decile. That that would be one way to do
00:57:08
it. Would you favor that over some sort
00:57:11
of let's call it parametric type or some
00:57:13
sort of model?
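The bucket idea can be sketched roughly as follows: sort players by a skill estimate, split them into three near-equal groups, and compare each player's RBIs above expected with the average for his group. The function name, the player labels, the skill values, and the RBI figures are all invented for illustration; the skill field stands in for whatever metric is chosen.

```python
from statistics import mean


def tercile_adjusted(players):
    """players: list of (name, skill, rbi_above_expected) tuples.

    Returns {name: rbi_above_expected minus the mean for the player's skill tercile}.
    """
    ranked = sorted(players, key=lambda p: p[1])  # ascending skill
    n = len(ranked)
    adjusted = {}
    for k in range(3):
        # Integer cut points give three near-equal groups.
        group = ranked[k * n // 3:(k + 1) * n // 3]
        if not group:
            continue
        group_mean = mean(p[2] for p in group)
        for name, _skill, rbi_ae in group:
            adjusted[name] = rbi_ae - group_mean
    return adjusted


# Invented data: (name, skill estimate, RBIs above expected given context).
players = [
    ("A", 0.300, -4.0), ("B", 0.310, 2.0),
    ("C", 0.340, 1.0), ("D", 0.350, 5.0),
    ("E", 0.400, 10.0), ("F", 0.410, 20.0),
]
for name, value in tercile_adjusted(players).items():
    print(name, value)
```

Binning needs no functional form, but it treats every player inside a bucket as equally skilled; a parametric model trades that coarseness for an assumption about how skill maps to expected RBIs.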
00:57:14
>> Well, I I generally my my general
00:57:17
tendency is to use a parametric model
00:57:19
and then and essentially predict your
00:57:21
RBI's given your context and your say
00:57:24
wOBA. That's a simple way to do that. I
00:57:26
like your bucket. In fact, Eric, I'm I'm
00:57:28
gonna ask you. You want to be a you want
00:57:29
to be the faculty adviser to our to our
00:57:31
little team? They'd certainly love to
00:57:33
have you.
00:57:33
>> I'd love it. Let's do it.
00:57:35
>> So, we have we have we have two
00:57:36
undergraduates working on it right now.
00:57:38
Talia and uh Lev are are sophomore and
00:57:41
freshman. In fact, Lev um took
00:57:43
Moneyball Academy and he's now here as a
00:57:45
freshman and is already jumping in to do
00:57:47
research. Um I'll finish off with a
00:57:49
football um
00:57:51
in Harvard um and did a great job. They
00:57:54
did something that everyone is
00:57:55
interested in. You know, we we look at
00:57:57
win pass uh you know that block block,
00:57:59
you know, rush win rate, right? That's
00:58:01
a standard statistic that gets
00:58:02
calculated. But guess what it's not
00:58:04
doing? It's not adjusting for um
00:58:08
quality, the quality of your opponent.
00:58:12
And that is something they want to do
00:58:14
and they so this team has tried to
00:58:16
adjust for the quality of the opponent
00:58:18
and the context and the actual what's at
00:58:20
stake, right? So failing to block in a
00:58:22
situation that doesn't really matter is
00:58:24
much less important or damaging than
00:58:26
failing to block or succeeding to block
00:58:28
and that leads to a pressure or sack. So
00:58:31
essentially they want to do is a much
00:58:32
more sophisticated evaluation of both
00:58:35
the blockers and the and the and the um
00:58:37
the rushers and they come out with great
00:58:38
great results and it's much better than
00:58:40
just you know success rate block you
00:58:42
know block uh block pass success rush
00:58:46
win rates. Well, the reason I love all
00:58:48
these stories is first of all, um, it's
00:58:50
great that you're giving back to our
00:58:52
students and giving them an opportunity
00:58:53
to do research. That's obviously number
00:58:55
one. That's great. It's great. And a lot
00:58:56
of these people, as you said, may go on
00:58:58
for research jobs in industry or PhDs,
00:59:01
etc.
00:59:01
>> Well, they're applying for statistics,
00:59:03
PhD departments,
00:59:05
>> I mean, that's that's incredible. Um the
00:59:08
second thing is it shows people that um
00:59:11
you know you don't need a PhD to do
00:59:15
interesting research too. And so that
00:59:18
part is is interesting. That part is is
00:59:21
really interesting as well. Well, this
00:59:23
has been one hour here of Wharton
00:59:25
Moneyball on the Wharton podcast network
00:59:27
on behalf of myself, Eric Bradlo, my
00:59:28
colleague, co-author and friend Audi
00:59:30
Winer uh in absentia, Cade Massey and
00:59:33
Shane Jensen who'll be back next week.
00:59:35
Thanks to our associate producer and
00:59:37
sound engineer, Dion Simpkins. Thanks to
00:59:39
our producer and if you like our big
00:59:42
bosses, Marissa Rena and D Patel. Thank
00:59:45
you for joining us here on Wharton
00:59:46
Moneyball between now and next week.
00:59:48
Enjoy your sports. Enjoy your
00:59:49
statistics. We'll see you back here on
00:59:51
the Wharton Podcast Network.

Episode Highlights

  • Extraordinary NBA Season Start
    Eric discusses the remarkable 17-1 start of the Oklahoma City Thunder.
    “This is the most extraordinary start to a season I've ever seen.”
    @ 02m 20s
    December 01, 2025
  • OKC's Historic Potential
    The Thunder's current performance suggests they could break historical records this season.
    “They’re going to have to start talking about this OKC team as being one of the greater teams.”
    @ 12m 31s
    December 01, 2025
  • Thanksgiving Day Games
    Three interesting games are lined up for Thanksgiving: Lions vs. Packers, Chiefs vs. Cowboys, and Bengals vs. Ravens.
    “We actually have three sort of interesting games on Thanksgiving Day.”
    @ 20m 43s
    December 01, 2025
  • Hall of Fame Discussion
    A deep dive into the candidates for the baseball Hall of Fame, including Roger Clemens and Carlos Delgado.
    “There’s two separate committees that are going to be voting this year.”
    @ 21m 01s
    December 01, 2025
  • Peak Performance Metrics
    Discussion on the importance of peak performance in Hall of Fame considerations, particularly for pitchers.
    “The case for Kevin Brown and Dave rests on having extraordinarily good peak performance.”
    @ 30m 01s
    December 01, 2025
  • Understanding XG and XG+
    XG, or expected goals, accounts for shots that aren't goals, while XG+ includes potential shots not taken.
    “XG is only calculated on shots taken.”
    @ 39m 40s
    December 01, 2025
  • Rugby Analytics Breakthrough
    A team of undergraduates developed an expected points model for rugby, revealing teams often kick too many field goals.
    “Just as we’ve seen in almost every sport, they are kicking too many field goals.”
    @ 47m 19s
    December 01, 2025
  • Student Competition Winners
    Rita and Audrey won the best paper award at the Carnegie Mellon Sports Analytics Conference.
    “They actually won the student competition award for best paper.”
    @ 52m 18s
    December 01, 2025
  • RBI Above Expected
    A new statistical approach to evaluate RBIs in baseball, adjusting for context and player quality.
    “What are your RBIs above expected?”
    @ 55m 18s
    December 01, 2025
  • Sophisticated Football Evaluation
    A team developed a method to evaluate blockers and rushers by adjusting for opponent quality.
    “They want to do a much more sophisticated evaluation.”
    @ 58m 12s
    December 01, 2025

Episode Quotes

  • They’re on a 51 and 3 pace right now.
  • That’s a great question. It’s not the Bills.
  • That’s a long peak.
  • XG can accumulate, and an individual player can have substantial XG.
  • Players who should be shooting but don’t are false negatives.
  • It's great that you're giving back to our students.

Key Moments

  • Moneyball Intersection (00:04)
  • Extraordinary Start (02:20)
  • Oklahoma City Thunder (02:29)
  • Historic Potential (12:31)
  • Thanksgiving Matchups (20:43)
  • Hall of Fame Candidates (21:01)
  • Rugby Analytics (43:55)
  • Football Insights (58:12)
