Understanding What a Customer Data Platform Needs to Be
Modern-day marketers try to achieve holistic personalization through all conceivable channels in order to stand out among countless marketing messages hitting targeted individuals every day, if not every hour. If the message is not clearly about the target recipient, it will be quickly dismissed.
So, how can marketers achieve such an advanced level of personalization? First, we have to figure out who each target individual is, which requires data collection: what they clicked, rejected, browsed, purchased, returned, repeated, recommended, look like, complained about, etc. Pretty much every breath they take, every move they make (without being creepy). Let’s say that you achieved that level of data collection. Will it be enough?
Enter “Customer-360,” or “360-degree View of a Customer,” or “Customer-Centric Portrait,” or “Single View of a Customer.” You get the idea. Collected data must be consolidated around each individual to get a glimpse — never the whole picture — of who the targeted individual is.
You may say, “That’s cool, we just procured technology (or a vendor) that does all that.” Since virtually every CRM database or CDP (Customer Data Platform) company uses at least one of the terms I listed above, buyers of technology often buy into the marketing pitch.
Unfortunately, the 360-degree view of a customer is just a good start in this game and a prerequisite, not the end goal of any marketing effort. The goal of any data project should never be just putting all available data in one place. The resulting database must support a great many complex and laborious functions during the course of planning, analysis, modeling, targeting, messaging, campaigning, and attribution.
So, in the interest of marketers, allow me to share the essentials of what a CDP needs to be and do, and what the common elements of useful marketing databases are.
A CDP Must Cover Omnichannel Sources
By definition, a CDP must support all touchpoints in an omnichannel marketing environment. No modern consumer lingers around just in one channel. The holistic view cannot be achieved by just looking at their past transaction history, either (even though the past purchase behavior still remains the most powerful predictor of future behavior).
Nor do marketers have time to wait until someone buys something through a particular channel for them to take actions. All movements and indicators — as much as possible — through every conceivable channel should be included in a CDP.
Yes, some data evaporates faster than others — such as browsing history — but we are talking about a game of inches here. Besides, data atrophy can be delayed with proper use of modeling techniques.
Beware of vendors who want to stay in their comfort zone in terms of channels. No buyer is just an online or an offline person.
Data Must Be Connected on an Individual Level
Since buyers go through all kinds of online and offline channels during the course of their journey, collected data must be stitched together to reveal their true nature. Unfortunately, in this channel-centric world, characteristics of collected data are vastly different depending on sources.
Privacy concerns and regulations regarding Personally Identifiable Information (PII) greatly vary among channels. Even if PII is allowed to be collected, there may not be any common match key, such as address, email, phone number, cookie ID, device ID, etc.
There are third-party vendors who specialize in such data weaving work. But remember that no vendor is good with all types of data. You may have to procure different techniques depending on available channel data. I’ve seen cases where great technology companies that specialized in online data were clueless about “soft-match” techniques used by direct marketers for ages.
Remember, without an accurate and consistent individual ID system, one cannot even start building a true "Customer-360" view.
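To make the idea of stitching concrete, here is a minimal sketch of exact-key matching, assuming each record carries some subset of email, phone, and device ID. The field names, the sample records, and the `stitch_records` helper are all hypothetical. It links any two records that share a normalized key value, and it does not attempt the fuzzier "soft-match" techniques mentioned above.

```python
from collections import defaultdict

# Hypothetical match keys; real sources will expose different ones.
MATCH_KEYS = ("email", "phone", "device_id")

def stitch_records(records):
    """Group record indexes that share any match key (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smallest index as the root of each group.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (key, normalized value) -> first record index
    for idx, rec in enumerate(records):
        for key in MATCH_KEYS:
            value = rec.get(key)
            if not value:
                continue
            token = (key, value.strip().lower())
            if token in first_seen:
                union(idx, first_seen[token])
            else:
                first_seen[token] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())

# Made-up records from four channels.
records = [
    {"source": "web", "email": "Pat@Example.com", "device_id": "d-1"},
    {"source": "store", "phone": "555-0100", "email": "pat@example.com"},
    {"source": "app", "device_id": "d-1"},
    {"source": "call_center", "phone": "555-0199"},
]
individuals = stitch_records(records)
print(individuals)
```

The first three records collapse into one individual because they are chained through a shared email and a shared device ID, while the fourth stays on its own. In practice, the hard part is deciding which keys are trustworthy enough to link on, and that varies by channel.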
Data Must Be Clean and Reliable
You may think that I am stating the obvious, but you must assume that most data sources are dirty. There is no pristine dataset without a serious amount of data refinement work. And when I say dirty, I mean that databases are filled with inaccurate, inconsistent, uncategorized, and unstructured data. To be useful, data must be properly corrected, purged, standardized, and categorized.
Even simple time-stamps could be immensely inconsistent. What are date-time formats, and what time zones are they in? Dollars aren’t just dollars either. What are net price, tax, shipping, discount, coupon, and paid amounts? No, the breakdown doesn’t have to be as precise as for an accounting system, but how would you identify habitual discount seekers without dissecting the data up front?
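As a rough illustration of that up-front work, the sketch below normalizes timestamps from two sources into UTC and splits order amounts far enough to flag discount seekers. Everything specific here is an assumption for the example: the source names, their date formats and time zones, the field names (amounts in cents), and the 20% threshold.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-source conventions: (strptime format, source time zone).
# A fixed UTC-5 offset ignores daylight saving; it is only for illustration.
SOURCE_FORMATS = {
    "web": ("%Y-%m-%dT%H:%M:%S", timezone.utc),
    "store": ("%m/%d/%Y %H:%M", timezone(timedelta(hours=-5))),
}

def to_utc(raw, source):
    """Parse a source-specific timestamp string and convert it to UTC."""
    fmt, tz = SOURCE_FORMATS[source]
    return datetime.strptime(raw, fmt).replace(tzinfo=tz).astimezone(timezone.utc)

def discount_share(order):
    """Share of the pre-discount merchandise value that was discounted."""
    reduction = order["discount"] + order["coupon"]
    pre_discount = order["net_price"] + reduction
    return reduction / pre_discount if pre_discount else 0.0

def is_discount_seeker(orders, min_orders=3, threshold=0.20):
    """Flag a buyer whose average discount share meets the threshold."""
    if len(orders) < min_orders:
        return False
    shares = [discount_share(o) for o in orders]
    return sum(shares) / len(shares) >= threshold

# Made-up orders for one buyer, amounts in cents.
orders = [
    {"net_price": 8000, "discount": 2000, "coupon": 0},
    {"net_price": 9000, "discount": 0, "coupon": 1000},
    {"net_price": 5000, "discount": 5000, "coupon": 0},
]
store_time = to_utc("03/01/2023 09:30", "store")
seeker = is_discount_seeker(orders)
```

None of this is possible if the source only delivers a single "amount" column, which is why the breakdown has to be captured at load time.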
When it comes to free-form data, things get even more complicated. Let’s just say that most non-numeric data are not that useful without proper categorization, through strict rules along with text mining. And such work should all be done up front. If you don’t, you are simply deferring more tedious work to poor analysts, or worse, to the end-users.
Beware of vendors who think that loading the raw data onto some table is good enough. It never is, unless the goal is to hoard data.
Data Must Be Up-to-Date
“Real-time update” is one of the most abused terms in this business. And I don’t casually recommend it, unless decisions must be made in real time. Why? Because, generally speaking, more frequent updates mean higher maintenance cost.
Nevertheless, real-time update is a must, if we are getting into fully automated real-time personalization. It is entirely possible to rely on trigger data for reactive personalization outside the realm of the CDP environment, but such patchwork will lead to regrets most of the time. For one, how would you figure out what elements really worked?
Even if a database is not updated in real time, most source data must remain as fresh as they can be. For instance, it is generally not recommended to append third-party demographic data in real time (except for “hot-line” data, of course). But that doesn’t mean that you can just use old data indefinitely.
When it comes to behavioral data, time really is of the essence. Click data must be updated at least daily, if not in real time. Transaction data may be updated weekly, but don’t go over a month without updating the base, as even simple measurements like “Days since last purchase” can be way off. You all know the importance of the good old recency factor in any set of metrics.
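Here is a small worked example of how far off a recency measure can drift, using made-up dates and a hypothetical `days_since_last_purchase` helper. The only difference between the two results is whether the latest transaction has been loaded.

```python
from datetime import date

def days_since_last_purchase(purchase_dates, as_of):
    """Recency in days, measured from the most recent known purchase."""
    return (as_of - max(purchase_dates)).days

as_of = date(2023, 4, 5)

# What the database knows if it was last refreshed in early March.
stale_history = [date(2023, 1, 10), date(2023, 2, 20)]
# What actually happened, including a purchase made on April 2.
fresh_history = stale_history + [date(2023, 4, 2)]

stale_recency = days_since_last_purchase(stale_history, as_of)  # 44
fresh_recency = days_since_last_purchase(fresh_history, as_of)  # 3
```

A customer who looks like a lapsing buyer at 44 days is in fact a very recent one at 3 days, and the two would likely land in opposite campaign segments.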
Data Must Be Analytics-Ready
Just because the data in question are clean and error-free, that doesn’t mean that they are ready for advanced analytics. Data must be carefully summarized onto an individual level, in order to convert “event level information” into “descriptors of individuals.” Presence of summary variables is a good indicator of true Customer-360.
You may have all the click, view, and conversion data, but those are all descriptors of events, not people. For personalization, you need to know individual-level affinities (you may call them “personas”). For planning and messaging, you may need to group target individuals into segments or cohorts. All those analytics run much faster and more effectively with analytics-ready data.
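The conversion from events to descriptors can be sketched in a few lines. This is a simplified illustration, not a full summarization scheme: the event fields (`customer_id`, `type`, `category`), the sample events, and the chosen summary variables are all assumptions.

```python
from collections import Counter, defaultdict

def summarize_events(events):
    """Roll event-level rows up into one descriptor row per individual."""
    per_person = defaultdict(
        lambda: {"events": 0, "purchases": 0, "categories": Counter()}
    )
    for event in events:
        person = per_person[event["customer_id"]]
        person["events"] += 1
        if event["type"] == "purchase":
            person["purchases"] += 1
        person["categories"][event["category"]] += 1

    summary = {}
    for customer_id, person in per_person.items():
        top_category, top_count = person["categories"].most_common(1)[0]
        summary[customer_id] = {
            "event_count": person["events"],
            "purchase_count": person["purchases"],
            "top_category": top_category,
            "top_category_share": top_count / person["events"],
        }
    return summary

# Made-up event-level data.
events = [
    {"customer_id": "c1", "type": "click", "category": "outdoor"},
    {"customer_id": "c1", "type": "view", "category": "outdoor"},
    {"customer_id": "c1", "type": "purchase", "category": "outdoor"},
    {"customer_id": "c1", "type": "click", "category": "kitchen"},
    {"customer_id": "c2", "type": "view", "category": "kitchen"},
]
profiles = summarize_events(events)
```

Each row of `profiles` now describes a person rather than an event, which is the form that models and segmentation routines typically expect as input.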
If not, even simple modeling or clustering work may take a very long time, even with a decent data platform in place. It is routinely quoted that over 80% of analysts’ time goes into data preparation work — how about cutting that down to zero?
Most modern toolsets come with some analytics functions, such as KPI dashboards, basic queries, and even segmentation and modeling. However, for advanced level targeting and messaging, built-in tools may not be enough. You must ask how the system would support professional statisticians with data extraction, sampling, and scoring (on the backend). Don’t forget that most analytics work fails before or after the modeling steps. And when any meltdown happens, do not habitually blame the analysts, but dig deeper into the CDP ecosystem.
Also, remember that even automated modeling tools work much better with refined data on a proper level (i.e., individual level data for individual level modeling).
CDP Must Be Campaign-Ready
For campaign execution, selected data may have to leave the CDP environment. Sometimes data may end up in a totally different system. A CDP must never be the bottleneck in data extraction and exchange. But in many cases, it is.
Beware of technology providers that only allow built-in campaign toolsets for campaign execution. You never know what new channels or technologies will spring up in the future. While at it, check how many different data exchange protocols are supported. Data going out is as important as data coming in.
CDP Must Support Omnichannel Attribution
Speaking of data coming in and out, CDPs must be able to collect campaign result data seamlessly, from all employed channels. The very definition of “closed-loop” marketing is that we must continuously learn from past endeavors and improve effectiveness of targeting, messaging, and channel usage.
Omnichannel attribution is simply not possible without data coming from all marketing channels. And if you do not finish the backend analyses and attribution, how would you know what really worked?
The sad reality is that a great majority of marketers fly blind, even with a so-called CDP of their own. If I may be harsh here, you are not a database marketer if you are not measuring the results properly. A CDP must make complex backend reporting and attribution easier, not harder.
For a database system to be called a CDP, it must satisfy most — if not all — of these requirements. It may be daunting for some to read through this, but doing your homework in advance will make it easier for you in the long run.
And one last thing: Do not work with any technology providers that are stingy about custom modifications. Your business is unique, and you will have to tweak some features to satisfy your unique needs. I call that the “last-mile” service. Most data projects that are labeled as failures ended up there due to a lack of custom fitting.
Conversely, what we call “good” service providers are the ones who are really good at that last-mile service. Unless you are comfortable with a one-size-fits-all, pre-made — but cheaper — toolset, always insist on customizable solutions.
You didn’t think that this whole omnichannel marketing was that simple, did you?
Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at firstname.lastname@example.org.