RFM segmentation turns two years of CPG purchase data into a 40% lift in return on ad spend

Challenge
82,000 customers, all receiving the same message. No visibility into segment-level value, no mechanism to prioritize spend — and a paid social channel delivering the weakest return in the mix.
Solution
An RFM segmentation engine on Databricks classified customers weekly by recency, frequency, and spend — turning a static CRM into a living behavioral layer that directly informed channel and campaign strategy.
Impact
ROAS up 40% in three months. Budget reallocated from paid social to targeted email. Loyalist retention, subscription adoption, and test-and-learn experimentation all systematized for ongoing improvement.

Introduction

In e-commerce, acquiring a customer is only half the equation — the real margin lives in understanding which customers are worth keeping. Yet most growing CPG brands apply the same messaging to every buyer, from first-time purchasers to their most loyal advocates. Research from HubSpot and Omnisend shows that targeted and personalized emails account for 58% of total e-commerce revenue — and that email consistently delivers 4–9× the return of paid advertising when it's segment-driven rather than broadcast.

For this CPG e-commerce startup, the challenge wasn't a lack of data — it was a lack of signal. With 82,000 customers and two years of purchase history, the ingredients for a smarter marketing program existed. What was missing was a way to turn raw transaction records into actionable customer intelligence: a segmentation engine that could tell the team which customers to invest in, how much, and through which channel.

Key Challenges

The client's marketing program treated all 82,000 customers identically — regardless of purchase frequency, recency, or spend. Without segment-level visibility, budget flowed equally to one-time buyers and high-LTV loyalists alike. Paid social, the team's primary acquisition channel, was generating the lowest return on ad spend in their mix. The opportunity cost of undifferentiated marketing was real but invisible.

One-Size-Fits-All Marketing

All 82,000 customers received identical messaging regardless of purchase history or lifetime value — wasting spend on low-intent audiences and under-investing in high-value ones.

Weak Paid Social Returns

Paid social received a disproportionate share of the budget despite generating the lowest ROAS in the channel mix — with no segment-level data to justify the allocation.

No Customer Value Visibility

The CRM held transaction records but no behavioral layer. The team had no way to distinguish high-LTV loyalists from one-time buyers or window-shoppers who had never converted.

Static, Unactionable CRM

Customer data was a snapshot, not a system. Without continuous segment updates, any segmentation work would decay the moment it was completed.

Solution Components

We designed and deployed an RFM segmentation engine on Databricks that classified 82,000 customers across recency, frequency, and monetary value using K-means clustering. The model was productionized to refresh segment assignments weekly, turning a static CRM into a living behavioral layer. Insights directly informed channel allocation, email targeting, subscription strategy, and an ongoing test-and-learn framework.

RFM Segmentation Engine

K-means clustering scored each customer from 0 to 3 on recency, frequency, and monetary value. The three scores sum to an overall score (0–9), which maps to three actionable customer segments and nine granular sub-segments.
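The mapping from overall score to segment can be sketched as a simple threshold lookup. The case study does not publish the cut points, the segment names other than Loyalists, or the rules behind the nine sub-segments, so `CUTS`, `"Low-value"`, and `"Mid-value"` below are hypothetical placeholders and sub-segments are left out.

```python
from bisect import bisect_right

# Hypothetical cut points: scores 0-2, 3-6, and 7-9 form the three segments.
CUTS = (3, 7)
# Only "Loyalist" appears in the case study; the other two labels are placeholders.
LABELS = ("Low-value", "Mid-value", "Loyalist")


def segment_for(overall_score):
    """Map an overall RFM score (0-9) to one of three segment labels."""
    if not 0 <= overall_score <= 9:
        raise ValueError("overall_score must be between 0 and 9")
    # bisect_right counts how many cut points are <= the score.
    return LABELS[bisect_right(CUTS, overall_score)]


segments = [segment_for(s) for s in range(10)]
```

Keeping the cut points in one tuple makes it easy to revisit them as the score distribution shifts between weekly refreshes.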

Weekly Automated Refresh

The segmentation model deployed on Databricks refreshes customer segment assignments weekly — keeping the CRM continuously current without manual intervention.

Segment-Driven Campaigns

Insights operationalized into targeted email programs, subscription offers by product affinity, and a test-and-learn framework that systematized experimentation across all segments.

Impact

Within three months of implementation, return on ad spend increased by 40%. Budget shifted from low-performing paid social toward high-ROI targeted email, driving measurable gains in Loyalist retention and subscription adoption. The weekly-updated segmentation engine became the operational backbone of a more data-driven marketing program — one built to improve continuously as the customer base grows.
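To show why reallocating budget moves the blended figure, here is a small worked calculation. The spend amounts and per-channel ROAS values are hypothetical inputs rather than client data, and the sketch assumes each channel's ROAS stays constant as its budget changes, which real channels rarely do.

```python
def blended_roas(channels):
    """channels: {name: (spend, roas)}.

    Blended ROAS = total attributed revenue / total ad spend.
    """
    total_spend = sum(spend for spend, _ in channels.values())
    total_revenue = sum(spend * roas for spend, roas in channels.values())
    return total_revenue / total_spend


# Hypothetical budgets: the same total spend, shifted from paid social to email.
before = {"paid_social": (7000, 2.0), "email": (3000, 5.0)}
after = {"paid_social": (5000, 2.0), "email": (5000, 5.0)}

# Relative change in blended ROAS caused by the reallocation alone.
lift = blended_roas(after) / blended_roas(before) - 1
print(f"Blended ROAS: {blended_roas(before):.2f} -> {blended_roas(after):.2f} ({lift:.1%} lift)")
```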

82K
Customer buying histories analyzed and segmented
+40%
ROAS increase within three months of implementation
35%
Revenue from Loyalists — just 13% of the customer base
9
Granular sub-segments enabling precision campaign targeting

Our Process

STEP 01

Data Extraction & Cohort Setup

Two years of purchase history organized into customer cohorts by first purchase date, creating the longitudinal baseline for RFM analysis.
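A minimal sketch of this step, assuming transactions arrive as `(customer_id, order_date, amount)` rows and cohorts are keyed by the month of first purchase. The row layout, the `build_rfm` helper, and the monthly cohort grain are illustrative assumptions, not the client's actual schema.

```python
from collections import defaultdict
from datetime import date


def build_rfm(transactions, as_of):
    """Aggregate (customer_id, order_date, amount) rows into per-customer values.

    Returns {customer_id: {"cohort", "recency_days", "frequency", "monetary"}}.
    """
    by_customer = defaultdict(list)
    for customer_id, order_date, amount in transactions:
        by_customer[customer_id].append((order_date, amount))

    rfm = {}
    for customer_id, orders in by_customer.items():
        dates = [d for d, _ in orders]
        first, last = min(dates), max(dates)
        rfm[customer_id] = {
            "cohort": first.strftime("%Y-%m"),       # month of first purchase
            "recency_days": (as_of - last).days,     # days since most recent order
            "frequency": len(orders),                # number of orders
            "monetary": round(sum(a for _, a in orders), 2),  # total spend
        }
    return rfm


# Hypothetical sample rows for illustration only.
transactions = [
    ("c1", date(2023, 1, 15), 40.0),
    ("c1", date(2024, 11, 2), 55.5),
    ("c2", date(2023, 6, 3), 20.0),
]
rfm = build_rfm(transactions, as_of=date(2024, 12, 1))
```

Fixing `as_of` per run keeps recency comparable across customers and makes each refresh reproducible.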

STEP 02

RFM Clustering

K-means clustering (k=4) applied independently to recency, frequency, and monetary value, giving each customer a score of 0–3 per dimension. Scores merged by customer ID and summed to produce an overall_score (0–9).
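The scoring logic can be sketched without dependencies using a one-dimensional version of Lloyd's algorithm. This is an illustration, not the production model: the deterministic initialization, the `kmeans_1d` and `overall_scores` helpers, and the choice to invert recency so that recent buyers score higher are all assumptions. A production pipeline would more likely rely on a library implementation of K-means.

```python
def kmeans_1d(values, k=4, max_iter=100):
    """Lloyd's algorithm on scalars.

    Returns one cluster rank (0..k-1) per value, where rank 0 is the
    cluster with the smallest centroid.
    """
    pts = sorted(values)
    n = len(pts)
    # Deterministic start: one centroid from the middle of each of k equal slices.
    centroids = [pts[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign each value to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    # Re-label clusters by ascending centroid so labels are ordinal scores.
    order = sorted(range(k), key=lambda j: centroids[j])
    rank = {cluster: r for r, cluster in enumerate(order)}
    return [rank[lab] for lab in labels]


def overall_scores(rfm, k=4):
    """rfm: {customer_id: {"recency_days", "frequency", "monetary"}} -> {customer_id: 0..9}."""
    ids = list(rfm)
    r = kmeans_1d([rfm[c]["recency_days"] for c in ids], k)
    f = kmeans_1d([rfm[c]["frequency"] for c in ids], k)
    m = kmeans_1d([rfm[c]["monetary"] for c in ids], k)
    # Assumption: fewer days since the last order is better, so recency is inverted.
    return {c: (k - 1 - r[i]) + f[i] + m[i] for i, c in enumerate(ids)}


# Hypothetical customers with four clearly separated behavior levels.
sample = {
    "a": {"recency_days": 1, "frequency": 20, "monetary": 900},
    "b": {"recency_days": 2, "frequency": 19, "monetary": 950},
    "c": {"recency_days": 30, "frequency": 10, "monetary": 400},
    "d": {"recency_days": 31, "frequency": 11, "monetary": 420},
    "e": {"recency_days": 90, "frequency": 5, "monetary": 150},
    "f": {"recency_days": 91, "frequency": 4, "monetary": 160},
    "g": {"recency_days": 300, "frequency": 1, "monetary": 20},
    "h": {"recency_days": 301, "frequency": 1, "monetary": 25},
}
scores = overall_scores(sample)
```

Ranking clusters by centroid is what turns arbitrary cluster labels into ordinal scores that can be summed.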

STEP 03

Segment Profiling & Insights

Segments analyzed across cohort migration rates, acquisition channel ROAS, and price sensitivity, surfacing the behavioral patterns driving each group.
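One basic profiling view compares each segment's share of customers with its share of revenue, which is how a concentration figure such as a small group driving an outsized portion of sales is derived. The sketch below assumes one `(segment, revenue)` pair per customer. The `segment_profile` helper and the sample values are hypothetical.

```python
from collections import defaultdict


def segment_profile(customers):
    """customers: iterable of (segment, revenue) pairs, one per customer.

    Returns {segment: {"customer_share", "revenue_share"}} as fractions of the total.
    """
    counts = defaultdict(int)
    revenue = defaultdict(float)
    for segment, amount in customers:
        counts[segment] += 1
        revenue[segment] += amount

    total_customers = sum(counts.values())
    total_revenue = sum(revenue.values())
    return {
        segment: {
            "customer_share": counts[segment] / total_customers,
            # Guard against a zero-revenue input set.
            "revenue_share": revenue[segment] / total_revenue if total_revenue else 0.0,
        }
        for segment in counts
    }


# Hypothetical sample: four customers across three segments.
profile = segment_profile([
    ("Loyalist", 300.0),
    ("Mid-value", 100.0),
    ("Mid-value", 100.0),
    ("Low-value", 100.0),
])
```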

STEP 04

Productionization on Databricks

Segmentation model deployed with a weekly refresh cadence, automatically appending RFM scores and segment labels to all 82,000 customers in the CRM.
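The refresh logic itself is independent of the scheduler. A minimal sketch, assuming the CRM can be represented as a mapping of customer ID to record: `refresh_due`, `apply_refresh`, and the `segment` and `rfm_refreshed_on` field names are hypothetical. On Databricks, the cadence would normally come from a scheduled job rather than from a check like `refresh_due`.

```python
from datetime import date, timedelta


def refresh_due(last_run, today, cadence_days=7):
    """True when no refresh has run yet or the cadence window has elapsed."""
    return last_run is None or (today - last_run) >= timedelta(days=cadence_days)


def apply_refresh(crm, scores, segment_of, run_date):
    """Return a new CRM mapping with RFM fields set for every scored customer.

    Records without a score are copied unchanged, and the input is not mutated.
    """
    refreshed = {}
    for customer_id, record in crm.items():
        updated = dict(record)
        if customer_id in scores:
            updated["overall_score"] = scores[customer_id]
            updated["segment"] = segment_of(scores[customer_id])
            updated["rfm_refreshed_on"] = run_date.isoformat()
        refreshed[customer_id] = updated
    return refreshed


# Hypothetical CRM records and scores for illustration.
crm = {"c1": {"email": "a@example.com"}, "c2": {"email": "b@example.com"}}
new_scores = {"c1": 8}
today = date(2024, 12, 8)

if refresh_due(last_run=date(2024, 12, 1), today=today):
    crm_after = apply_refresh(
        crm,
        new_scores,
        segment_of=lambda s: "Loyalist" if s >= 7 else "Other",  # placeholder rule
        run_date=today,
    )
```

Overwriting the same fields on every run keeps the refresh idempotent, so rerunning a failed week does not duplicate data.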

Tech Stack

Databricks
Tableau
Python
