Wharton Professors Eric Bradlow and Peter Fader on "The Data Dilemma"

March 19, 2009 / 04:58

This episode discusses data minimization in marketing, the importance of aggregate data, and the challenges companies face with large datasets.

The guest emphasizes the need for companies to focus on essential data rather than retaining everything. They mention a talk at MIT that inspired their approach to using minimal data for effective marketing strategies.

They argue that companies can achieve accurate models using simple histograms instead of detailed customer data. This method reduces software and data requirements, making it easier for companies to analyze customer behavior.

The conversation highlights the risks and costs associated with storing excessive data, including information security concerns and the diminishing value of outdated data.

Ultimately, the guest suggests that companies should balance the need for data retention with the practicalities of data management and analysis.

TL;DR

The episode covers data minimization in marketing and the benefits of using aggregate data over individual-level data.

Episode

00:00:01
this podcast is brought to you by
00:00:04
knowledge at Wharton for more
00:00:05
information please visit knowledge
00:00:08
wharton upenn edu
00:00:17
I think there are really two motivations
00:00:19
the first one was a lot of times we
00:00:23
spend a lot of time in the field of
00:00:24
marketing looking at individual level data
00:00:25
so building models that take into
00:00:27
account what you might do what I might
00:00:29
do etc
00:00:31
unfortunately there's a lot of times
00:00:32
that retailers manufacturers people
00:00:34
actually making the decisions don't have
00:00:36
that level of data so wouldn't it be
00:00:38
wonderful to be able to make statements
00:00:40
about who to target with a particular
00:00:41
promotion coupon price cut etc based on
00:00:45
aggregate information and so a more
00:00:47
general way of thinking about that is
00:00:49
what we would call data minimization
00:00:51
meaning even if you had all the
00:00:53
individual level data how much would you
00:00:55
really benefit by it and to be honest
00:00:57
the first time I heard about it was it
00:00:59
was a talk at MIT by another researcher
00:01:01
I'm rarely inspired by other people's
00:01:04
research but I had heard someone else
00:01:05
give a talk on this and while I don't
00:01:07
think their approach was right it
00:01:09
actually made me start to think about
00:01:11
how to actually do inference with
00:01:13
minimal data I'm a cheapskate at heart
00:01:15
so I like doing formal academic
00:01:18
research like developing elaborate
00:01:20
models but I like to make them very very
00:01:22
practical that means two things one it
00:01:24
means trying to use easy software trying
00:01:27
to use Microsoft Excel instead of fancy
00:01:30
software that you need to write computer
00:01:32
code for and second it means trying to
00:01:34
use as little data as possible so after
00:01:36
building some elaborate models using
00:01:38
very detailed extensive customer level
00:01:40
databases I wanted to answer both of
00:01:43
those questions can we reduce the
00:01:44
software requirements and I have a paper
00:01:46
that does that and it's kind of fun and
00:01:48
interesting but can we reduce the data
00:01:50
requirements and the answer is
00:01:51
shockingly yes instead of having very
00:01:54
detailed transactions about what each
00:01:56
customer's doing at each moment in time
00:01:58
if you boil it all down to just a simple
00:02:01
histogram how many people bought from us
00:02:03
once this year twice three times and you
00:02:05
show me a series of those histograms
00:02:07
over time very easy for companies to
00:02:08
collect and store no worries about
00:02:10
privacy you can build the same kinds of
00:02:13
models with the same degree of accuracy
00:02:14
as if you had the original raw data in
00:02:17
the first place
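
The aggregation described here, boiling raw transactions down to one purchase-count histogram per year, can be sketched in a few lines of Python. This is only an illustration of the data reduction step, not the speakers' model: the function name `purchase_histogram`, the `(customer_id, year)` record layout, and the sample data are all hypothetical. Customers with zero purchases do not appear unless the size of the customer base is known separately.

```python
from collections import Counter

def purchase_histogram(transactions, year):
    """Return {number_of_purchases: number_of_customers} for one year.

    `transactions` is an iterable of (customer_id, year) pairs, one pair
    per purchase. This record layout is an assumption for illustration.
    """
    # Step 1: count purchases per customer within the chosen year.
    per_customer = Counter(cid for cid, y in transactions if y == year)
    # Step 2: count how many customers share each purchase count.
    histogram = Counter(per_customer.values())
    return dict(sorted(histogram.items()))

# Hypothetical raw data: one (customer_id, year) pair per purchase.
transactions = [
    ("a", 2007), ("a", 2007),
    ("b", 2007),
    ("c", 2007), ("c", 2007), ("c", 2007),
    ("a", 2008),
    ("b", 2008), ("b", 2008),
]

# A series of histograms over time, which is all that needs to be stored.
histograms = {year: purchase_histogram(transactions, year) for year in (2007, 2008)}
print(histograms)
```

Once the histograms are stored, the customer identifiers in the raw records are no longer needed for this summary, which is the source of the storage and privacy advantages mentioned above.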
00:02:21
companies are afraid to get rid of the
00:02:23
data for the same reason that we're
00:02:24
unwilling to clean out our attics
00:02:26
they're thinking that there's going to
00:02:27
be some value that at some point this
00:02:29
piece of data knowing this person's
00:02:31
demographics or knowing that person's
00:02:33
purchase history from 2004 is going to
00:02:36
be of some value and throwing it away
00:02:38
means throwing away assets that's a real
00:02:41
problem for companies to do that turns
00:02:43
out that of course it's a great cost to
00:02:45
them to store the data great risk in
00:02:47
terms of information security there's
00:02:49
also very little value for certain kinds
00:02:51
of measures aren't valuable at all even
00:02:53
when they're fresh and other kinds of
00:02:55
measures really do get dated so it's
00:02:57
important for companies to figure out
00:02:58
what they need to keep what they can get
00:03:00
rid of and to focus on among the
00:03:02
measures that really are timely and
00:03:04
important what to do with them the data
00:03:06
by itself isn't that useful unless you
00:03:09
really know how to draw insights from it
00:03:11
and that sounds a bit like a trite
00:03:12
statement a lot of people saying that
00:03:14
data is that you can't turn it into
00:03:16
knowledge and so on and but it's really
00:03:19
true and you have to know what kinds of
00:03:21
models to build what kind of statistical
00:03:22
assumptions to make what kinds of
00:03:24
equations you need to build around it
00:03:26
and there's a very nice interplay
00:03:27
between having a certain amount of data
00:03:30
and drawing certain kinds of inferences
00:03:32
you can't really address one of those
00:03:34
issues without the other and I think
00:03:35
we've managed to strike a nice balance
00:03:37
for the two I'm not sure companies today
00:03:39
know how to predict using the data very
00:03:42
well so the problem is they're just
00:03:44
trying to keep everything in sight it
00:03:46
might turn out that they have the right
00:03:47
measures today might turn out that what
00:03:50
they want to do in the future changes
00:03:51
and so even if they had a perfectly
00:03:53
predictive model they're afraid that
00:03:55
general customer behavior will change in
00:03:57
the future therefore the minute they
00:03:59
throw something away it's kind of gone
00:04:00
forever so I think part of that
00:04:02
mentality is probably ok but I think
00:04:05
the thing that companies should always weigh
00:04:07
off against is what is the cost of
00:04:09
keeping all of this massive data and you
00:04:11
say it's not just the size of the server
00:04:12
that someone has to build it's not just
00:04:14
the big staff that someone has to have
00:04:16
to manage all of the data it's building
00:04:20
models on massively large data sets is
00:04:23
problematic it leads to companies having
00:04:25
to hire experts statisticians expert
00:04:27
programmers and it's not that I want to
00:04:29
put myself out of a job I think there is
00:04:31
a lot of value for expert statisticians
00:04:33
but now not only do you have to be a statistician
00:04:35
you have to be a computer scientist to
00:04:38
deal with these massive databases and so
00:04:40
they're essentially looking for people
00:04:42
that have all of these skills when most
00:04:44
of the data they collect really isn't as
00:04:46
useful as they think for more business
00:04:48
news and analysis from knowledge
00:04:50
at Wharton please visit knowledge wharton
00:04:53
upenn edu

Episode Highlights

  • Data Minimization in Marketing
    Exploring how companies can benefit from reducing data requirements while maintaining accuracy.
    “Shockingly yes, you can build models with less data.”
    @ 01m 51s
    March 19, 2009
  • The Cost of Keeping Data
    Discussing the risks and costs associated with maintaining large datasets.
    “What is the cost of keeping all of this massive data?”
    @ 04m 09s
    March 19, 2009

Episode Quotes

  • I'm a cheapskate at heart.
    Wharton Professors Eric Bradlow and Peter Fader on "The Data Dilemma"
  • It's important for companies to figure out what they need to keep.
    Wharton Professors Eric Bradlow and Peter Fader on "The Data Dilemma"

Key Moments

  • Cheapskate Approach @ 01m 13s
  • Data Management @ 02m 58s
  • Data Insights @ 03m 06s
