“It is a capital mistake to theorize before one has data.”
Sherlock Holmes
Data Management Platforms are extraordinarily powerful tools.
They allow you to rapidly collect, analyze, and make use of data from various sources. But before we can really start talking about DMPs, we need to talk about data.
Data has been a cornerstone of programmatic advertising since its very inception.
In fact, it’s really a prerequisite for programmatic. It would be almost impossible for advertisers to bid on a given impression if they didn’t have any data by which to judge its relative value.
One could say that without data, programmatic advertising simply could not exist.
The incredible importance of data
Over the last decade, data has only become more and more critical to all manner of business.
Companies are using data to understand the needs of their customers better. Marketers are working to understand their respective markets. And advertisers are using data to determine exactly who should receive which ad, when, and in what form.
One of the primary issues that companies grapple with nowadays is actually having too much data. They have so much data from so many sources that it has become difficult to store, manage, and make sense of it all.
This enormous surplus of data is the main reason that many speak of “Big Data.”
Since programmatic advertising itself is built on data, everyone involved in the industry must have at least a basic understanding of what data are and how they are categorized.
The goal of this article is to present, succinctly but precisely, data as they relate to the programmatic advertising industry, and, more specifically, how a particular tool, the Data Management Platform (DMP), has become practically indispensable.
First we’ll introduce (or review) some essential aspects regarding how data are classified and described.
The kinds of data
There are three primary data types. These are declared, observed, and inferred. While their names might give you a general idea of each, we’ll define them anyway and provide some quick examples.
Declared data are data that the customer himself declares, hence the name.
This data is generally very reliable and might include things like age, sex, geographic location, shipping address, etc. These are generally things that a customer might enter himself when he registers on a site or makes a purchase.
Declared data is often the most useful and valuable of all data because it removes much of the guesswork.
Observed data are data about the customer that have been directly, well, observed. If a customer frequently makes it to the last stage of buying a specific video game, we have observed that this is a product in which he is keenly interested.
Observed data is often merely tracking the way that the user navigates around the page, what he clicks on, where he stops and what he actually reads. This includes the things he looks at buying and the subjects on which he tends to spend a lot of time reading.
We can observe that a user is very interested in the news. We can observe that there are a few subjects in which he is interested. We can observe that he is interested in car accessories.
Importantly, we can also observe what kinds of media he engages with most willingly. This can be important in deciding what kind of ads would be most effective with this particular user.
Inferred data is data that can be inferred based on observed and declared information. For example, a particular user has declared himself a man around the age of 40. He frequently browses car accessories, buys women’s jewelry, and children’s toys.
Based on his age, sex, browsing, and purchase habits, we can determine a few things about him with relative certainty.
He is likely married, has kids, and is either a car owner or enthusiast.
While knowing what products he is looking at at a given time (observed data) might allow for retargeting, this inferred data (in combination with declared data) allows new products to be selected for him and suggested.
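To make the inference step concrete, here is a minimal sketch of how declared and observed attributes might be combined into inferred traits. All field names and rules here are hypothetical illustrations, not any particular DMP's logic:

```python
# Sketch: deriving inferred attributes from declared and observed data.
# All field names and inference rules are hypothetical illustrations.

def infer_attributes(profile: dict) -> dict:
    """Derive inferred traits from a user's declared and observed data."""
    inferred = {}
    declared = profile.get("declared", {})
    observed = profile.get("observed", {})

    categories = observed.get("browsing_categories", [])
    purchases = observed.get("purchase_categories", [])

    # A man around 40 buying women's jewelry is plausibly married.
    if declared.get("sex") == "male" and "womens_jewelry" in purchases:
        inferred["likely_married"] = True

    # Buying children's toys suggests kids in the household.
    if "childrens_toys" in purchases:
        inferred["likely_has_kids"] = True

    # Frequent browsing of car accessories suggests a car owner or enthusiast.
    if "car_accessories" in categories:
        inferred["likely_car_owner_or_enthusiast"] = True

    return inferred

user = {
    "declared": {"sex": "male", "age": 40},
    "observed": {
        "browsing_categories": ["car_accessories", "news"],
        "purchase_categories": ["womens_jewelry", "childrens_toys"],
    },
}
print(infer_attributes(user))
```

Real systems infer traits probabilistically across millions of profiles, but the principle is the same: declared and observed signals in, inferred attributes out.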
There’s no party like a data party
Now, those are the different kinds of data, but one can (and often must) sort the data by their provenance. That is to say, by who collected them.
There are three more categories for this kind of data: first-party, second-party, and third-party.
What do these mean?
First-party data are data that you gather yourself. That is to say, data that comes from a source that you control directly.
If the data are entered into a form on your site, then this is first-party data. If it comes from your cookie, it’s first-party data. If you gathered the data on your own in any way, then you’re dealing with certified first-party data.
Second-party data is data that is gathered by partners. It’s the partner’s own first-party data made available to you.
Any company with whom you have a direct link or deal to share or exchange data is going to be providing you with second-party data. But it’s only second-party data if it’s their first-party data, that is to say, that they collected it themselves.
An example would be an online computer hardware store buying data collected by, say, a popular online tech blog. The blog collected this data themselves, and they decided to make it available to select partners directly.
Third-party data is data that comes from a source that you’re not dealing with directly. Generally, one would purchase third-party data from a company that specializes in the collection, preparation, and sale of data.
The significant difference between second-party and third-party data is that the seller of third-party data did not collect them himself. Rather, he bought them from someone else and then resells them.
These are the three categories of data sorted by provenance.
Advertisers and marketers generally consider first-party data to be the best of the three. This is because it is data on your very own customers. Therefore, it is the most useful for achieving a better understanding of the people that are already engaging with you.
Further, it is also useful in determining the characteristics and patterns of people that might engage with your site. With these data, you can build accurate profiles of your average user.
“Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?”
T. S. Eliot
In this short passage from T.S. Eliot’s Choruses from “The Rock”, he bemoans what he sees as one of the great problems of modernity.
That is, the loss of wisdom in the quest to merely know and the loss of that knowing in the quest to simply record.
While T. S. Eliot might be yearning for simpler times, he does, perhaps inadvertently, point out a major problem of the information age: knowledge lost in a sea of information.
If we want to recover that knowledge and eventually become wise, then we need to extract it from the data. And one of the key roadblocks between many companies and that knowledge is data siloing.
Data siloing is a massive issue for many companies. It is often the primary obstacle preventing companies from really gaining insight from all the data they have collected.
Perhaps the worst thing about this problem is that it only gets worse the longer it goes unaddressed.
So what is data siloing?
Data siloing is when your data exist in separate “silos.”
That is to say, they are disconnected and difficult to use or access concurrently. Say you have a lot of first-party data from a couple of your online properties, but you store the data in different databases for each of your sites.
Not only are they different databases, but they use different Database Management Systems.
In this case you’re committing the mortal sin of data siloing.
Now imagine that each team only has access to the data pertaining to the property for which they are responsible.
You’re probably starting to get an idea of what data siloing is.
The data for each of these sites are “siloed” and only realistically usable by the team directly responsible for each property.
But say you want to learn what attributes your customers have in common across all your major properties.
To determine this, you’d need to combine, or at least synthesize to some extent, all of the data from all of your disparate datasets.
What’s this look like from an advertiser’s perspective?
From the perspective of an advertiser, siloing can quickly become a major problem.
Imagine, for example, that you’ve been collecting a large amount of declared first-party data (names, ages, email addresses, maybe interests, etc.), three of your partners are sharing their observed second-party data, and you’re buying third-party data from a supplier.
Now you have several datasets that you need to get to work together. You need to take the data out of their respective silos and mesh them all together to find trends that exist across the datasets. Trends that are likely only visible when the datasets are looked at together.
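In code terms, de-siloing often begins with something as mundane as joining records on a shared key. Here is a minimal sketch, assuming each silo is a list of records that can be joined on an email address (the silo contents and field names are hypothetical):

```python
# Sketch: unifying records from separate data "silos" on a shared key.
# The silo contents and field names are hypothetical examples.

def unify_silos(*silos: list) -> dict:
    """Merge records from multiple datasets into unified profiles, keyed by email."""
    unified = {}
    for silo in silos:
        for record in silo:
            key = record.get("email")
            if key is None:
                continue  # skip records we have no join key for
            # Merge this record's fields into the unified profile.
            unified.setdefault(key, {}).update(record)
    return unified

first_party = [{"email": "a@example.com", "age": 40, "name": "Alex"}]
second_party = [{"email": "a@example.com", "interests": ["cars", "news"]}]
third_party = [{"email": "a@example.com", "household_income": "medium"}]

profiles = unify_silos(first_party, second_party, third_party)
print(profiles["a@example.com"])
```

In practice, the hard part is that real silos rarely share a clean key: identity resolution, deduplication, and schema reconciliation are where a DMP earns its keep.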
The unification of data
With all of this diverse, dispersed data stored in different silos, it can be challenging to identify trends and profiles across datasets, or even to get a good overview.
In most cases, the more data, the better. So unifying and looking at combined datasets can be extremely useful and lead to important discoveries, but a lot of work has to occur before this can happen.
One of the significant advantages that a Data Management Platform can offer is the ability to synthesize all of your first, second, and third-party data.
It will do much of this work for you.
The necessity of the DMP for Programmatic Advertising
Data Management Platforms are incredibly complicated pieces of software that perform a myriad of tasks, all with the goal of spitting out useful data. Data that advertisers can use to increase their conversion rates and better attract and retain customers.
Data is needed at practically every stage of the programmatic buying process. When a bid request comes along, a DSP, for example, can’t return an informed bid without the necessary data.
Using a DMP
To get a better handle on their data, most advertisers take advantage of a Data Management Platform. But this name can be a bit deceiving. In reality, DMPs do much more than just manage data.
Most DMPs will handle your first-party data collection themselves so that the data is immediately available for use within the platform. They can generally also incorporate data from second-party sources with relative ease.
For third-party data, the DMP provides access to a marketplace of data that can be bought and synthesized with the first and second-party data that you already have.
“Information is not knowledge.”
Albert Einstein
Synthesis and Analysis
Having data isn’t in and of itself useful.
It simply puts you in a position to learn something useful. To do that you need to synthesize your various datasets and analyze them to convert your raw information into actionable knowledge.
This is one of the hallmark functions of a DMP: the ability not just to collect, but to synthesize all of your data into one usable mass.
Here, the collected data can be integrated and analyzed to produce insights that you can use to improve your advertising campaigns.
Applying what you learn
How are data used to improve your campaigns?
One common way of doing this is by integrating your DMP with your Demand-Side Platform. This way, the data that you have collected and analyzed with your DMP can be immediately put to use on your programmatic ad buying campaigns.
By integrating your DMP with your DSP, the latter can begin improving its targeting thanks to the data provided by the former.
What are some capabilities that DMPs can provide for advertisers?
DMPs can provide all kinds of insights regarding customer behavior, preferences, and demographics.
Regarding advertising, however, they allow for two particularly useful things: retargeting and the creation of lookalike audiences.
One of the things that DMPs can help with is retargeting. A good DMP with a large enough amount of data input can identify users that could be receptive to retargeting.
Retargeting is simply the targeting of users that have demonstrated that they are particularly interested in a service or product. Maybe they’re users that made it to the purchase page of a given product.
Or perhaps they’re users that have purchased a given service already.
In the first case, an advertiser might want to show them the viewed product or a similar, slightly cheaper one several times.
This is a very high-value user as he has already almost converted once; the ability to target him specifically could be very profitable.
Likewise, targeting a user that has already purchased a service could be logical if the service he bought was a company’s entry-level offering.
Retargeting him and serving ads extolling the features of your more premium services might encourage him to upgrade.
Especially if he is retargeted near the end of his annual contract.
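The selection logic described above can be sketched as a simple filter over user behavior. The event names, plan labels, and the 30-day renewal window below are hypothetical assumptions, just to show the shape of the idea:

```python
# Sketch: selecting retargeting candidates from behavioral data.
# Event names, plan labels, and thresholds are hypothetical.

def retargeting_candidates(users: list) -> list:
    """Pick users who nearly converted, or hold an expiring entry-level plan."""
    candidates = []
    for user in users:
        # Case 1: the user reached the purchase page but did not buy.
        reached_checkout = "checkout_page" in user.get("events", [])
        # Case 2: an entry-level subscriber nearing the end of his contract.
        entry_plan_expiring = (
            user.get("plan") == "entry_level"
            and user.get("days_to_renewal", 999) <= 30
        )
        if reached_checkout or entry_plan_expiring:
            candidates.append(user["id"])
    return candidates

users = [
    {"id": "u1", "events": ["product_page", "checkout_page"]},
    {"id": "u2", "events": ["home_page"]},
    {"id": "u3", "events": [], "plan": "entry_level", "days_to_renewal": 12},
]
print(retargeting_candidates(users))
```

Here u1 (almost converted) and u3 (up for renewal) would be flagged; u2, who only visited the home page, would not.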
Another key feature that DMPs provide is the ability to create lookalike audiences.
What is a lookalike audience?
A lookalike audience is an audience generated by the DMP that features certain characteristic similarities to another audience.
Here’s what that means in practice:
Say you have an audience of subscribers. Maybe you’re wondering what all of your customers have in common and if that could help you seek out new ones. This is when you’d use a lookalike audience.
You could take all of the data that you have collected on your current audience and generate a lookalike audience with your DMP.
This lookalike audience could be drawn from, say, the pool of people that had visited your site over the last month.
The lookalike audience would then be the individuals from your other datasets that most closely resemble your current customer audience.
This audience would then be uniquely qualified for targeted advertising.
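Conceptually, lookalike generation is a similarity search: characterize the seed audience, score each candidate against that characterization, and keep the closest matches. Here is a toy sketch using simple interest overlap; real DMPs use far richer features and models, and every name below is a made-up example:

```python
# Sketch: a toy lookalike audience built from interest overlap with a
# seed audience. Real DMPs use far more sophisticated similarity models.

def lookalike_audience(seed: list, candidates: list, size: int) -> list:
    """Return the `size` candidates whose interests best overlap the seed's."""
    # Collect the interests that characterize the seed audience.
    seed_interests = set()
    for user in seed:
        seed_interests.update(user.get("interests", []))

    def score(user: dict) -> int:
        # More shared interests with the seed audience = higher score.
        return len(seed_interests & set(user.get("interests", [])))

    ranked = sorted(candidates, key=score, reverse=True)
    return [user["id"] for user in ranked[:size]]

seed = [
    {"id": "s1", "interests": ["cars", "news"]},
    {"id": "s2", "interests": ["cars", "tech"]},
]
candidates = [
    {"id": "c1", "interests": ["cars", "news", "tech"]},
    {"id": "c2", "interests": ["cooking"]},
    {"id": "c3", "interests": ["news"]},
]
print(lookalike_audience(seed, candidates, size=2))
```

In this toy example, c1 shares three interests with the seed audience and c3 shares one, so those two would make up the lookalike audience of size two.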
Lastly, DMPs provide powerful analytical tools that can help you determine which strategies are and aren’t working.
That’s a wrap
By now, you have a pretty good idea of the different kinds of data and how data is collected and applied to programmatic advertising.
So hopefully next time you run across a report about building a lookalike audience using declared first-party data, you’ll know what’s going on!
With every passing year, the collection, analysis, and utilization of data play ever more important roles in online advertising.
As advanced tracking produces increasingly specific data points, the management and parsing of data are only going to become progressively more complicated.
This is why platforms like DMPs are likely to not only stick around but become significantly more feature-rich in the future.