
Baseball, Bias and Decision-Making

October 13, 2016 / 09:46

This episode features Wharton professor Etan Green discussing expert decision-making in baseball, focusing on umpire calls and the influence of the count (balls and strikes) on those decisions.

The conversation begins with an overview of the professor's research on how umpires make calls based on the location of pitches and the data collected from stereoscopic cameras in major league ballparks.
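The decision rule described in the episode (call a strike if the pitch crosses inside the box defined by the width of home plate and the batter's stance, otherwise call a ball) can be sketched as follows. This is a minimal illustration, not the researchers' code: the `Pitch` fields, the coordinate convention, and the example zone heights are assumptions, and the sketch ignores the radius of the ball.

```python
from dataclasses import dataclass

# Home plate is 17 inches wide; expressed here in feet.
PLATE_WIDTH_FT = 17 / 12


@dataclass
class Pitch:
    """Hypothetical record of where a pitch crosses the plate."""
    px: float           # horizontal offset from the center of the plate, in feet
    pz: float           # height above the ground, in feet
    zone_bottom: float  # bottom of this batter's zone, in feet (depends on stance)
    zone_top: float     # top of this batter's zone, in feet (depends on stance)


def rulebook_call(pitch: Pitch) -> str:
    """Return the call that depends only on pitch location."""
    in_width = abs(pitch.px) <= PLATE_WIDTH_FT / 2
    in_height = pitch.zone_bottom <= pitch.pz <= pitch.zone_top
    return "strike" if in_width and in_height else "ball"


# Example with made-up zone heights of 1.5 ft and 3.5 ft.
print(rulebook_call(Pitch(px=0.0, pz=2.5, zone_bottom=1.5, zone_top=3.5)))
print(rulebook_call(Pitch(px=1.0, pz=2.5, zone_bottom=1.5, zone_top=3.5)))
```

Comparing an umpire's actual call with the output of a rule like this, pitch by pitch, is the kind of benchmarking the camera data make possible.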

Key findings reveal that umpires systematically adjust their strike zone based on the count: the zone expands when the count favors the batter and shrinks when it favors the pitcher. This bias is explained as a trade-off that buys accuracy in decision-making.
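The Bayesian updating argument from the episode can be illustrated with a toy model. This is a sketch under strong assumptions, not the model from the research: it looks only at the top edge of the zone, treats the umpire's view of the pitch height as the true height plus normal noise, and collapses everything the count tells the umpire into a normal prior. All numbers (boundary, prior means, standard deviations) are made up for illustration.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_strike(observed: float, boundary: float,
                prior_mean: float, prior_sd: float, noise_sd: float) -> float:
    """Posterior probability that the true height is at or below the top boundary.

    Combines a normal prior (expectations built from the count) with a noisy
    normal observation (what the umpire saw) using precision weighting.
    """
    w_prior = 1.0 / prior_sd ** 2
    w_obs = 1.0 / noise_sd ** 2
    post_mean = (w_prior * prior_mean + w_obs * observed) / (w_prior + w_obs)
    post_sd = math.sqrt(1.0 / (w_prior + w_obs))
    return normal_cdf((boundary - post_mean) / post_sd)


TOP = 3.5  # assumed top of the strike zone, in feet

# The pitch appears to cross exactly at the top boundary in both cases.
# 3-0 count: the umpire expects a pitch aimed inside the zone (assumed prior mean).
p_30 = prob_strike(TOP, TOP, prior_mean=2.5, prior_sd=0.8, noise_sd=0.3)
# 0-2 count: the umpire expects the pitch to be outside the zone (assumed prior mean).
p_02 = prob_strike(TOP, TOP, prior_mean=4.0, prior_sd=0.8, noise_sd=0.3)

print(f"P(strike | 3-0 count) = {p_30:.2f}")
print(f"P(strike | 0-2 count) = {p_02:.2f}")
```

With these assumed inputs the same observed location yields a probability above one half in the 3-0 count and below one half in the 0-2 count, which is the pattern described in the episode: the prior makes borderline calls more accurate on average, at the cost of a strike zone that moves with the count.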

The discussion also touches on the implications of this research for business practices, particularly in hiring processes, where biases can affect decision-making.

Finally, the professor shares his future research interests, including predictions related to elections and how different models can influence decision-making.

TL;DR

A Wharton professor discusses how umpire calls in baseball are influenced by the count and what that implies for decision-making in business.

Episode

00:00:01
today we're speaking with Wharton
00:00:03
professor Etan Green about his research
00:00:05
on expert decision-making and
00:00:07
specifically umpire calls in baseball so
00:00:10
Etan happy to have you with us thanks for
00:00:12
joining us yeah my pleasure so can you
00:00:14
give us an overview of your research
00:00:15
yeah so generally what I'm interested in
00:00:17
is decision-making by experts expertise
00:00:20
particularly in realms for which there
00:00:22
are predictions available from machines
00:00:25
and machine based models algorithms or
00:00:27
in the case of umpires in baseball data
00:00:29
from stereoscopic cameras behind home
00:00:31
plate of every major league ballpark
00:00:33
that we use to benchmark the calls that
00:00:35
umpires make and so this is a great
00:00:37
setting for studying decision-making by
00:00:39
experts because we have experts who are
00:00:41
supposed to abide by a very specific
00:00:43
decision rule so the pitcher throws a
00:00:45
pitch if the pitch is in this imaginary
00:00:48
box the official strike zone defined by
00:00:50
the width of home plate on the floor and
00:00:52
the batter's stance then the umpire's
00:00:53
supposed to call a strike otherwise he's
00:00:55
supposed to call a ball and so what we
00:00:57
do is we use these data from the
00:00:59
stereoscopic cameras that take a
00:01:00
sequence of images of every pitch from
00:01:03
its release from the pitcher's hand until
00:01:04
it crosses the region above home plate
00:01:06
to basically observe to what extent the
00:01:09
umpire abides by this decision rule to
00:01:11
make his calls based solely on the
00:01:13
location of the pitch and so I think the
00:01:15
most interesting thing that comes out of
00:01:16
the data is basically this deviation
00:01:19
from that benchmark in a very systematic
00:01:21
way and so there's something in baseball
00:01:23
called the count the count keeps
00:01:25
track of the sequence of pitches between
00:01:26
a pitcher and a batter over the course
00:01:28
of an at-bat if the count reaches four
00:01:30
balls that's good for the batter he
00:01:31
walks if it reaches three strikes that's
00:01:34
good for the pitcher the batter strikes
00:01:35
out and so what you see is that instead
00:01:38
of the umpire just using the location of
00:01:40
the pitch to make his calls pitches at
00:01:42
the same location are sometimes called
00:01:45
balls or sometimes called strikes
00:01:46
depending on the count and in particular
00:01:48
the strike zone expands dramatically
00:01:50
when the count favors the batter and
00:01:52
so when the count favors the batter the
00:01:54
umpire responds by favoring the pitcher
00:01:55
and vice versa when the count favors the
00:01:57
pitcher the umpire responds by favoring
00:02:00
the batter and it's particularly extreme
00:02:02
so basically you can think about a pitch
00:02:04
that crosses say the top boundary of the
00:02:07
official strike zone so this pitch in
00:02:09
what I'll call the baseline count the count
00:02:11
at the beginning of the at-bat zero
00:02:14
strikes this
00:02:15
an umpire calls a strike fifty percent
00:02:17
of the time it calls a ball fifty
00:02:18
percent of the time you can think of as
00:02:20
being indifferent between a ball and a
00:02:22
strike but when the count say has three
00:02:24
balls and 0 strikes when it strongly
00:02:26
favors the batter well then this
00:02:28
pitch is almost always called a strike
00:02:29
and the reverse is true in the opposite
00:02:32
count zero balls and two strikes the
00:02:34
same pitch at the same location is
00:02:36
almost always called a ball so so why
00:02:39
does this happen so they're potentially
00:02:42
a number of stories that can explain
00:02:43
this result let me tell you about a
00:02:45
particularly interesting and
00:02:47
counterintuitive one and the argument
00:02:49
here is that what the pitcher is do or
00:02:51
what the umpire is doing is he's trading
00:02:54
off accuracy for bias or rather he's
00:02:57
trading off bias for accuracy he's being
00:03:00
purposefully biased consciously or
00:03:02
unconsciously so he's varying the strike
00:03:04
zone that he enforces with the count
00:03:06
he's not making his decisions based
00:03:08
solely on the location of the pitch but
00:03:10
the argument is this actually helps them
00:03:11
make more accurate calls and why is this
00:03:14
the case well imagine yourself as an
00:03:17
umpire you're squatted behind the
00:03:20
catcher you're looking out over his head
00:03:22
towards the pitcher the pitcher winds up
00:03:24
he throws a 90 plus mile an hour pitch
00:03:26
it's there in an instant it has some
00:03:28
lateral movement and vertical movement
00:03:29
you have to decide whether this pitch is
00:03:31
inside or outside some imaginary box
00:03:33
it's an incredibly difficult problem and
00:03:36
if you relied only on your observation
00:03:39
of the location of the pitch you'd
00:03:41
probably make mistakes on a regular
00:03:42
basis frequently when the pitch is
00:03:44
close it'd be hard to say whether it
00:03:45
was just inside the strike zone or just
00:03:47
outside the strike zone but fortunately
00:03:49
for you you have other information at
00:03:51
your disposal you have expectations that
00:03:54
you've built up over many years of being
00:03:56
a professional umpire expectations about
00:03:59
where the pitcher is going to throw in a
00:04:00
certain count and whether the batter is
00:04:02
going to swing and so for instance you
00:04:05
might reasonably expect that when the
00:04:07
count has three balls and 0 strikes that
00:04:09
is when it favors the batter that the
00:04:10
pitcher is going to try to throw a
00:04:11
strike and so if the pitch is close and
00:04:14
you're unsure whether it was just inside
00:04:15
the strike zone or just outside the
00:04:16
strike zone you may err on the side of
00:04:18
calling a strike the pitch that you
00:04:20
expect now think about what happens in
00:04:23
an 0-2 count so in this count you expect
00:04:26
that the batter is going to swing at
00:04:27
anything close
00:04:28
because if he doesn't he runs the chance
00:04:30
of striking out whereas he can prolong
00:04:32
the at-bat if he fouls the pitch off
00:04:33
for instance and so imagine you see a
00:04:36
pitch that appears close to you but the
00:04:38
batter chooses not to swing how can you
00:04:40
rationalize that decision well you can
00:04:42
rationalize it by saying that he
00:04:43
observed something that you didn't that
00:04:44
his vantage point was such that he
00:04:46
believed the pitch to be a ball and so
00:04:47
you might err on the side of calling a
00:04:49
ball and so this Bayesian updating this
00:04:53
basically rational way of processing
00:04:54
other information that you have creates
00:04:57
this trade-off between bias and accuracy
00:04:58
it helps the umpires become more
00:05:00
accurate at the cost of having them
00:05:01
systematically change the strike zone
00:05:03
that they enforce with this variable the
00:05:05
count that has nothing to do with their
00:05:07
directive so what would you say a
00:05:10
business practitioner should take away
00:05:12
from your research yeah so I think what
00:05:16
umpires are doing is they're
00:05:19
statistically discriminating so they
00:05:22
have a directive to make their calls
00:05:23
based solely on the location of the
00:05:25
pitch but that's very difficult to do
00:05:27
it's very hard to observe the exact
00:05:29
location every time and so what they do
00:05:31
instead is they say well this other
00:05:33
information this other information is
00:05:34
correlated with the location of the
00:05:36
pitch it could help me on average make
00:05:38
more accurate calls and so as I said
00:05:40
before they basically trade-off bias for
00:05:43
accuracy it's a statistical
00:05:45
discrimination at least opportunities
00:05:47
for statistical discrimination are just
00:05:49
they're everywhere and they're
00:05:51
everywhere in the workplace in
00:05:52
particular in you know the hiring
00:05:53
process so for instance you know when we
00:05:56
hire we have a benchmark that sounds
00:05:59
very similar to the umpires directive we
00:06:01
want to hire the best person the person who's
00:06:03
going to do the best at the job who's
00:06:05
going to be the best fit but it's hard
00:06:07
in the interview process looking at a CV
00:06:09
or even interviewing a person often to
00:06:11
decide who is the best or how good is
00:06:14
this person how good is this person
00:06:15
going to be in the job and so we may
00:06:17
rely on other factors factors that are
00:06:20
either implicitly or explicitly banned
00:06:22
that we shouldn't be using perhaps but
00:06:24
factors that we believe perhaps rightly
00:06:26
as in the case of the umpires or even
00:06:28
erroneously to give us information about
00:06:31
this person's fit and so if we're right
00:06:33
we're going to get a little more
00:06:34
accuracy but it's going to come at
00:06:36
the cost of bias it's going to come at
00:06:38
the cost of systematic being able to
00:06:41
systematically
00:06:42
predict who we hire based on factors
00:06:44
that have nothing to do at least
00:06:46
directly with the dimension that we're
00:06:48
trying to hire along so what are you
00:06:50
going to look at next going to stay stay
00:06:52
in baseball or look elsewhere for
00:06:54
research yeah so I mean it's a baseball
00:06:56
is an opportunity to use machine based
00:07:01
models these cameras to say how good of
00:07:04
a job umpires are doing but I think
00:07:07
there are a lot of interesting cases in
00:07:11
which the decisions that individuals
00:07:13
make that experts make can be informed
00:07:15
by algorithms by machine based
00:07:17
predictions and so one of the things
00:07:20
that I'm interested in particular now is
00:07:23
making predictions about the election so
00:07:25
it's particularly timely I think a lot
00:07:26
of us are interested in the probability
00:07:28
that Hillary Clinton is going to win or
00:07:29
the probability that Donald Trump will
00:07:30
be our next president and so one place
00:07:33
you may go to get information about this
00:07:35
you may go to FiveThirtyEight Nate Silver's website
00:07:37
and one thing that Nate Silver is doing
00:07:39
this election season that he hasn't done
00:07:41
in previous election seasons is he's
00:07:42
providing multiple models so in the past
00:07:46
he told you the probability that Hillary
00:07:47
Clinton would win is seventy-seven
00:07:48
percent now he's telling you if you
00:07:50
believe this model it's seventy two
00:07:52
percent if you believe this model it's
00:07:53
eighty-four percent and sometimes
00:07:55
there's really quite a deviation between
00:07:57
these two models well what are these two
00:08:00
models well basically they're making
00:08:03
different assumptions about the world
00:08:04
and your decision as to which model you
00:08:07
listen to is really a decision about
00:08:09
what what you believe the data
00:08:11
generating process to be what do you
00:08:13
believe the world to look like and in
00:08:14
particular there's one model that says
00:08:16
we should only listen to the polls we
00:08:18
should only listen to what people are
00:08:19
saying right now and there's another
00:08:20
model that says actually there are lots
00:08:23
of predictors economic indicators for
00:08:27
instance like that historically have
00:08:28
been very predictive of election
00:08:30
outcomes and so we should listen to
00:08:31
those as well and so your decision about
00:08:33
which model to listen to or how to
00:08:35
balance these two pieces of information
00:08:37
basically comes down to your belief
00:08:38
about whether this election season is
00:08:41
totally different from the past in which
00:08:43
case you should only listen to the polls
00:08:45
or if you believe that this is just
00:08:46
another draw from some stable
00:08:48
distribution that is similar to
00:08:51
everything else that's come before and
00:08:53
so generally what I'm interested in
00:08:56
is how can we frame questions what types
00:08:58
of information can we give people to make
00:09:01
them think that this moment the present
00:09:04
is just like the past and the past is a
00:09:07
good predictor of the present and what
00:09:09
types of information how can we frame
00:09:10
questions to get people to think
00:09:12
actually no the process is not
00:09:14
stationary at all this moment is unique
00:09:16
in time great that's fascinating thanks
00:09:20
very much for joining us Etan yeah my
00:09:21
pleasure thank you

Episode Highlights

  • The Impact of the Count
    How the count influences umpire calls: the strike zone expands when the count favors the batter.
    “The count dramatically expands the strike zone.”
    @ 01m 50s
    October 13, 2016
  • Umpires and Decision-Making
    Exploring how camera data are used to benchmark umpire calls in baseball, revealing biases in their decisions.
    “Umpires trade off bias for accuracy.”
    @ 02m 57s
    October 13, 2016
  • Predicting Elections with Data
    Using machine models to make predictions about election outcomes, emphasizing the importance of data interpretation.
    “This moment is unique in time.”
    @ 09m 16s
    October 13, 2016

Episode Quotes

  • The count dramatically expands the strike zone.
    Baseball, Bias and Decision-Making
  • Umpires trade off bias for accuracy.
    Baseball, Bias and Decision-Making
  • This moment is unique in time.
    Baseball, Bias and Decision-Making

Key Moments

  • Umpire Calls 00:07
  • Expert Decision-Making 00:17
  • Bias vs. Accuracy 02:57
  • Election Predictions 07:20
