Business Challenge
The client, an e-commerce startup in the Consumer-Packaged Goods (CPG) industry, sought to refine its marketing strategies with the goal of identifying and retaining its customers and increasing their Lifetime Value (LTV). The client's existing marketing tactics lacked the differentiation needed to target customers based on their purchase history and other relevant characteristics, such as demographic traits. To address this challenge, an RFM (Recency, Frequency, Monetary Value) customer segmentation analysis was identified as a first step in gaining insights into the customer base in order to drive targeted marketing efforts.
Impact and Outcome
The RFM analysis was conducted using a dataset comprising two years of historical purchase data. This analysis revealed three distinct customer segments and identified one group of prospective customers—those who had visited the e-commerce website, registered, but had not made any purchases:
- Loyalists - This segment, while representing only 13% of the customer base, accounted for a substantial 35% of the total revenue. Loyalists had made their most recent purchase within the last 90 days, exhibited the highest average revenue per customer, and had made more than five purchases since becoming customers.
- Emerging Loyalists - Comprising 38% of the customer base and contributing 36% of the revenue, Emerging Loyalists had made their latest purchase within the past year. They ranked second in terms of average revenue per customer and had made more than two purchases since becoming customers.
- One-Timers - This segment, making up 49% of the customer base, contributed 29% of the revenue. One-Timers had not made a purchase in over a year, displayed the lowest average revenue per customer, and had made only one purchase since becoming customers.
- Window-Shoppers - This category included prospective customers who provided an email address but had not made any purchases. They were included in the analysis to gain insights into the characteristics of individuals within the client's CRM who had not converted into customers.
The segmentation algorithm was subsequently productionized, allowing for weekly updates of customer segment assignments. These segments served as the foundation for a more targeted marketing strategy, enabling tailored product offerings and marketing messages for the most loyal customers and incentives to increase purchasing frequency within the lower-value segments. Further details of this high-level approach are provided below.
Key Metrics
82K
Customers' buying histories were analyzed and a customer segmentation algorithm was developed and productionized.
3
Three distinct RFM (Recency, Frequency, Monetary) customer segments were identified, along with nine more granular subsegments.
+40%
Return on Ad Spend (ROAS) increase stemming from insights and actions emerging from the customer segmentation analysis.
Approach
The organization's primary objective was to enhance every stage of the customer journey throughout the entire customer conversion lifecycle. During this phase, the emphasis was on customers positioned at the final stages of the "conversion funnel," typically referred to as the bottom of the funnel.
Conversion Funnel Optimization | Action
Through the RFM analysis, the objective was to gain insights into the worth of each customer segment based on recency, frequency, and monetary value. These insights, along with the segmentation algorithm created, were intended to serve as inputs for the feedback loop. This loop, in turn, would inform and guide strategies in the upper and mid-funnel stages of the marketing and customer engagement process.
RFM Algorithm
Initially, each customer was assigned to a year/month cohort based on their first purchase date, signifying when they became a customer. Once assigned to a cohort, the following high-level steps were executed within the context of their respective cohort to classify customers into RFM segments.
- Recency: To calculate recency, we identified the most recent purchase date for each customer and determined the number of days of inactivity. K-means clustering was then applied to categorize customers into distinct clusters based solely on their recency, defined by the number of inactive days. After considering the Elbow Method and business requirements, we opted for four clusters (i.e., k=4). K-means assigns cluster numbers, but these numbers lack inherent order, so the raw labels cannot be read as cluster 0 being the worst and cluster 3 the best. To address this, we introduced a new feature called "recency_clusters." In this reordered scheme, cluster 0 represents the group with the most inactive days, resulting in the lowest recency score, while cluster 3 represents the group with the fewest inactive days, resulting in the highest recency score. This reordering makes the clusters easier to interpret.
- Frequency: A similar approach was employed to establish frequency clusters. Initially, we calculated the total number of orders for each customer. Subsequently, we used K-means clustering to categorize customers based solely on their frequency of purchases, utilizing four clusters (i.e., k=4), akin to the recency calculation. Once again, we introduced a reordered feature, "frequency_clusters." This feature assigns customers with the highest frequency to cluster 3 and those with the lowest frequency to cluster 0.
- Monetary Value: The process for monetary values paralleled the approaches used for recency and frequency. We calculated the total revenue by customer and employed K-means clustering with k=4 to group customers. Here again, we introduced a more intuitive feature, "revenue_clusters," in which high-revenue customers were assigned to cluster 3, and low-revenue customers were assigned to cluster 0.
- Overall Score: The final step involves deriving an overall score. We begin by merging the three previously generated datasets—recency, frequency, and monetary value by customer—using customer IDs. This results in the highest-value customers being assigned to cluster 3 for each of the three RFM measures, while the lowest-value customers receive a 0 for each measure. We then calculate the "overall_score" feature by simply summing the values for the three measures. Consequently, the most valuable customers are assigned an overall score of 9, while the lowest-value customers receive an overall score of 0.
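The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the production implementation: `kmeans_1d` is a simple stand-in for whichever K-means library was actually used, the function names and the `(customer_id, order_date, revenue)` input layout are hypothetical, and the sketch scores the customers of a single cohort as of a chosen reference date.

```python
from collections import defaultdict
from datetime import date, timedelta


def kmeans_1d(values, k, iters=100):
    """Plain 1-D k-means (Lloyd's algorithm). Returns (labels, centroids)."""
    if k < 2 or not values:
        raise ValueError("need k >= 2 and at least one value")
    # Deterministic start: spread the initial centroids across the sorted values.
    ordered = sorted(values)
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assign every value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members (keep it if it has none).
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def ordered_scores(values, k=4, higher_is_better=True):
    """Cluster the values, then relabel clusters so 0 is worst and k-1 is best."""
    labels, centroids = kmeans_1d(values, k)
    ranking = sorted(range(k), key=lambda c: centroids[c], reverse=not higher_is_better)
    score_of = {cluster: score for score, cluster in enumerate(ranking)}
    return [score_of[lab] for lab in labels]


def rfm_scores(orders, as_of, k=4):
    """orders: (customer_id, order_date, revenue) tuples for one cohort."""
    last, count, revenue = {}, defaultdict(int), defaultdict(float)
    for cid, day, amount in orders:
        last[cid] = max(last.get(cid, day), day)
        count[cid] += 1
        revenue[cid] += amount
    ids = sorted(last)
    inactive_days = [(as_of - last[c]).days for c in ids]
    # More inactive days is worse, so recency is ranked in the opposite direction.
    r = ordered_scores(inactive_days, k, higher_is_better=False)
    f = ordered_scores([count[c] for c in ids], k)
    m = ordered_scores([revenue[c] for c in ids], k)
    return {
        c: {
            "recency_clusters": r[i],
            "frequency_clusters": f[i],
            "revenue_clusters": m[i],
            "overall_score": r[i] + f[i] + m[i],
        }
        for i, c in enumerate(ids)
    }


# Illustrative data only: four made-up customers with clearly separated behavior.
as_of = date(2022, 1, 1)


def sample_orders(cid, n, days_since_last, amount):
    # n orders spaced 7 days apart; the latest one is days_since_last before as_of.
    return [(cid, as_of - timedelta(days=days_since_last + 7 * i), amount) for i in range(n)]


orders = (
    sample_orders("A", 8, 10, 100.0)
    + sample_orders("B", 4, 100, 100.0)
    + sample_orders("C", 2, 250, 75.0)
    + sample_orders("D", 1, 500, 20.0)
)
scores = rfm_scores(orders, as_of)
```

Ranking clusters by their centroid is what turns arbitrary K-means labels into an ordered 0-3 score, which is why the three scores can be summed meaningfully into a 0-9 overall score.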
Customer Segments & Profiling
The resulting RFM segments (0 through 9) were bucketed into more intuitive, easier-to-communicate labels:
- 0 to 3: One-Timers
- 4 to 6: Emerging Loyalists
- 7+: Loyalists
The production algorithm appended both the RFM segments (i.e., 0-9 RFM scores) and corresponding labels to the customer-base on a recurring, weekly basis.
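The score-to-label mapping listed above is simple enough to express directly. The following is a small sketch; the function name `segment_label` is hypothetical and chosen only for illustration.

```python
def segment_label(overall_score):
    """Map a 0-9 overall RFM score to the segment labels used in this analysis."""
    if not 0 <= overall_score <= 9:
        raise ValueError("overall_score must be between 0 and 9")
    if overall_score <= 3:
        return "One-Timers"
    if overall_score <= 6:
        return "Emerging Loyalists"
    return "Loyalists"


labels = {score: segment_label(score) for score in range(10)}
```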
The diagram below provides an overview of key purchase and income characteristics of the corresponding segments, along with "Window-Shoppers," the prospective customers who registered on the site but have never made a purchase.
Cohorts Over Time
From the visualization below, we can see the immediate drop-off for the One-Timer segment after the initial purchase vs. the continued purchases made by each of the Emerging Loyalists and, even more so, Loyalists segments.
Segment Analysis & Insights
The following provides a sample of the additional analyses, findings, and actions from the segmentation work that drove business impact.
Customer Segment Migration
After segmenting our customer base, a crucial question we needed to address was how quickly customers from different segments moved between segments, either from lower to higher-value segments or vice versa. Gaining a deeper insight into this migration rate and conducting a comprehensive analysis to uncover the reasons behind these shifts would be instrumental in shaping our strategy and tactics. This, in turn, would allow us to drive customer purchases and encourage them to migrate towards higher value segments or keep high value segments engaged.
Methodology
Initially, we evaluated monthly cohorts by analyzing their activity during the first 4 months (e.g., the August 2020 cohort's segment was determined by their activity from 8/1/2020 to 11/1/2020). Subsequently, we re-evaluated all these segments by considering activity after that initial window through the end of 2021 to determine whether customers transitioned from a lower to a higher segment or vice versa. In essence, we assessed segment "migration."
The diagram below shows the percentage of customers migrating across segments.
Key
OT = One-Timer segment
EL = Emerging Loyalist segment
L = Loyalist Segment
(For example, OT to OT = customers that stayed in the One-Timer segment across the analysis period and OT to EL = customers who migrated from the One-Timer to the higher-value Emerging Loyalist segment etc.)
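As a rough illustration of how such migration percentages can be derived, the sketch below compares each customer's initial segment with their later segment. It assumes that percentages are expressed relative to the size of the starting segment; the function name `migration_rates` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def migration_rates(initial, later):
    """initial, later: dicts mapping customer_id to a segment code (OT, EL, L).

    Returns percentages keyed as "<start> to <end>", computed within each
    starting segment (an assumption about how the percentages are defined).
    """
    counts = defaultdict(Counter)
    for cid, start in initial.items():
        if cid in later:  # only customers observed in both periods
            counts[start][later[cid]] += 1
    rates = {}
    for start, ends in counts.items():
        total = sum(ends.values())
        for end, n in ends.items():
            rates[f"{start} to {end}"] = 100.0 * n / total
    return rates


# Made-up example: five One-Timers, two Emerging Loyalists, one Loyalist.
initial = {1: "OT", 2: "OT", 3: "OT", 4: "OT", 5: "OT", 6: "EL", 7: "EL", 8: "L"}
later = {1: "OT", 2: "OT", 3: "OT", 4: "OT", 5: "EL", 6: "L", 7: "EL", 8: "L"}
rates = migration_rates(initial, later)
```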
Customer Migration Summary
The diagram below provides a summarized view of the migration. Some of the key points:
- One-Timers are the most static of the three segments with ~80% staying within their segment across the analysis window
- Emerging Loyalists and Loyalists were more fluid in terms of migration up or down the value chain
Marketing Channel by Segment
Next, we examined the segments from the perspective of the acquisition channels they were associated with. We evaluated both paid and non-paid channels.
Below we can see KPIs such as Average Order Value (AOV), Lifetime Value (LTV), and units per order purchased by customers acquired through each of the channels.
- Direct Traffic: This channel primarily attracted repeat visitors or new visitors who were enticed by recurring TV infomercials. Notably, website traffic saw significant spikes during the periods when TV ads were broadcast. Visitors coming through direct traffic were already familiar with the brand.
- Organic Search: Organic search traffic consisted of customers who generally had less brand loyalty and found the website through non-branded search terms.
- Paid Search: The company's paid search strategy primarily focused on branded search terms, emphasizing their own branded products.
- Paid Social: Among the company's paid media channels, paid social proved to be the least effective, with the lowest Return on Ad Spend (ROAS) and revenue generated.
- Email: Before the segmentation initiative, email marketing was underutilized, characterized by sending generic, undifferentiated emails and offers to the customer base.
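The per-channel KPIs mentioned above can be computed from order-level data along the following lines. This is a sketch under assumptions: each order row is taken to carry the acquisition channel of its customer, LTV is approximated as historical revenue per customer (the exact LTV definition used in the analysis is not stated), and the function and field names are hypothetical.

```python
from collections import defaultdict


def channel_kpis(orders):
    """orders: (customer_id, acquisition_channel, revenue, units) tuples."""
    totals = defaultdict(
        lambda: {"revenue": 0.0, "orders": 0, "units": 0, "customers": set()}
    )
    for cid, channel, revenue, units in orders:
        t = totals[channel]
        t["revenue"] += revenue
        t["orders"] += 1
        t["units"] += units
        t["customers"].add(cid)
    return {
        channel: {
            "aov": t["revenue"] / t["orders"],           # average order value
            "ltv": t["revenue"] / len(t["customers"]),   # revenue per customer (proxy)
            "units_per_order": t["units"] / t["orders"],
        }
        for channel, t in totals.items()
    }


# Made-up orders for illustration only.
sample = [
    ("c1", "Email", 50.0, 2),
    ("c1", "Email", 70.0, 4),
    ("c2", "Email", 60.0, 3),
    ("c3", "Paid Social", 30.0, 1),
]
kpis = channel_kpis(sample)
```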
Segment Distribution by Channel
As expected, the paid social channel, which had the lowest Return on Ad Spend (ROAS), drew a higher proportion of One-Timer customers and a smaller proportion of Loyalists. Conversely, email exhibited the opposite trend. This finding underscores the initial opportunity to improve the marketing strategy by transitioning to a more targeted approach with email, ultimately striving for higher ROAS.
Price Sensitivity
The correlation matrix below illustrates the relationship between the number of orders and total discounts per day within each of the three segments. Notably, the One-Timers segment exhibits the strongest correlation between total orders and discounts, with a coefficient of 0.72, while the Loyalists segment demonstrates the weakest correlation among the three.
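For reference, a correlation of this kind can be computed per segment from two daily series, total orders and total discounts. The sketch below assumes the Pearson coefficient, which the analysis does not explicitly name; the function name and the sample figures are illustrative only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up daily series for one segment: (orders per day, total discounts per day).
daily_orders = [12, 15, 9, 20, 18]
daily_discounts = [30.0, 42.0, 20.0, 65.0, 50.0]
coefficient = pearson(daily_orders, daily_discounts)
```

Running the same calculation separately on each segment's daily series gives one coefficient per segment, which can then be compared as in the discussion above.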
Actions and Impact
The customer segmentation exercise served as the foundation and catalyst for the following marketing initiatives:
Test-and-Learn Approach: The segmentation exercise was the first step in establishing a more methodical "test-and-learn" approach to marketing optimization. Over time, various combinations of marketing messages, products, and creative elements were tested across each of the segments to determine experimentally what resonated most with each segment.
Targeted Marketing: One of the immediate next steps following the analysis was the shift of the marketing budget to provide greater focus on targeted email campaigns. A dedicated email marketing specialist and in-house designer were hired to custom target the various segments. Additionally, the paid search strategy was tailored to complement and work in lockstep with the revised customer-segment-focused approach.
Customer Acquisition: Third-party customer enrichment data was appended to RFM segments to better understand Loyalist and Emerging Loyalist customer profiles. This provided marketing with richer insights into the demographic and psychographic characteristics of the organization's most valuable customers.
Retention and Loyalty Programs: Product subscriptions, the most profitable product offering for the organization, were further tailored based on subsequent analysis of product affinity by segment, and subscriptions increased as a result.
Over the first three months post-implementation, there was a 40% increase in ROAS. Overall, the customer segmentation exercise, coupled with the insights and action items that resulted, served as a cornerstone for these strategic marketing initiatives, fostering a data-driven and customer-centric approach to marketing optimization.