Marketing Multi-Channel Attribution model based on Sales Funnel with R
This is the last post in the series of articles about using Multi-Channel Attribution in marketing. In the previous two articles (part 1 and part 2), we reviewed a simple and powerful approach based on Markov chains that allows you to effectively attribute marketing channels.
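In this article, we will review another fascinating approach that marries heuristic and probabilistic methods. Again, the core idea is straightforward and effective.
The approach is based on the Sales Funnel, a model of the customer journey towards a purchase that classically consists of four steps: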
- Awareness – the customer becomes aware of the existence of a product or service (“I didn’t know there was an app for that”),
- Interest – actively expressing an interest in a product group (“I like how your app does X”),
- Desire – aspiring to a particular brand or product (“Think I might buy a yearly membership”),
- Action – taking the next step towards purchasing the chosen product (“Where do I enter payment details?”).
For an e-commerce site, we can come up with one or more conditions (events/actions) that serve as evidence of passing each step of a Sales Funnel.
For some extra information about the Sales Funnel, you can take a look at my (rather ugly) approach to Sales Funnel visualization with R.
Companies naturally lose some share of visitors at each subsequent step of a Sales Funnel as it gets narrower. That’s why it looks like a string of bottlenecks. We can calculate the probability of transition from one step to the next based on the recorded history of transitions. On the other hand, customer journeys are sequences of sessions (visits), and these sessions are attributed to different marketing channels.
Therefore, we can link marketing channels with the probability of a customer passing through each step of a Sales Funnel. And here is the core idea of the concept: the probability of moving through each “bottleneck” determines the value of the marketing channel that leads a customer through it. The higher the probability of passing a “neck”, the lower the value of the channel that provided the transition. And vice versa, the lower the probability, the higher the value of the marketing channel in question.
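To make the inverse relation concrete, here is a minimal sketch with made-up transition probabilities. The numbers and the names `step_probs`, `step_value` and `step_weight` are assumptions for illustration only; they are not results from this article.

```r
# Hypothetical transition probabilities between consecutive funnel steps
# (illustrative numbers only)
step_probs <- c(awareness = 0.20, interest = 0.25, desire = 0.90, action = 0.98)

# The harder a "neck" is to pass, the more valuable the channel that led
# the customer through it: value = 1 - transition probability
step_value <- 1 - step_probs

# Normalize so that the values of all steps sum to 1 (one conversion in total)
step_weight <- step_value / sum(step_value)

round(step_weight, 3)
```

With these numbers, the channel that brings a customer through the narrow "awareness" neck earns far more credit than the one that accompanies the nearly certain final step.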
Let’s study the concept with the following example. First off, we’ll define the Sales Funnel and the set of conditions that register a customer as passing through each step of the Funnel.
- 0 step (necessary condition) – customer visits a site for the first time
- 1st step (awareness) – visits two pages of the site
- 2nd step (interest) – reviews a product page
- 3rd step (desire) – adds a product to the shopping cart
- 4th step (action) – completes purchase
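The order of the steps matters for everything that follows, so it can help to write it down explicitly. The sketch below is one possible way to do that, assuming the event names used in this article's simulated data; `funnel_steps` and the sample `events` vector are hypothetical helpers.

```r
# Ordered funnel steps (event names as used in this article's simulated data)
funnel_steps <- c('site_visit', 'two_pages_visit', 'product_page_visit',
                  'add_to_cart', 'purchase')

# Hypothetical events recorded in arbitrary order
events <- c('purchase', 'site_visit', 'add_to_cart')

# An ordered factor keeps the steps in funnel order when sorting or grouping
events_f <- factor(events, levels = funnel_steps, ordered = TRUE)

sort(events_f)

# Step numbers, counting from 0 as in the list above
step_no <- as.integer(events_f) - 1L
step_no
```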
Second, we need to extract the data that includes sessions where corresponding events occurred. We’ll simulate this data with the following code:
    library(tidyverse)
    library(purrrlyr)
    library(reshape2)

    ##### simulating the "real" data #####
    set.seed(454)
    # one row per session: customer, timestamp, channel and funnel event flags
    df_raw <- data.frame(customer_id = paste0('id', sample(c(1:5000), replace = TRUE)),
                         date = as.POSIXct(rbeta(10000, 0.7, 10) * 10000000, origin = '2017-01-01', tz = "UTC"),
                         channel = paste0('channel_', sample(c(0:7), 10000, replace = TRUE,
                                                             prob = c(0.2, 0.12, 0.03, 0.07, 0.15, 0.25, 0.1, 0.08))),
                         site_visit = 1) %>%
      # each step can only happen if the previous one happened in the same session
      mutate(two_pages_visit = sample(c(0,1), 10000, replace = TRUE, prob = c(0.8, 0.2)),
             product_page_visit = ifelse(two_pages_visit == 1,
                                         sample(c(0, 1), length(two_pages_visit[which(two_pages_visit == 1)]),
                                                replace = TRUE, prob = c(0.75, 0.25)),
                                         0),
             add_to_cart = ifelse(product_page_visit == 1,
                                  sample(c(0, 1), length(product_page_visit[which(product_page_visit == 1)]),
                                         replace = TRUE, prob = c(0.1, 0.9)),
                                  0),
             purchase = ifelse(add_to_cart == 1,
                               sample(c(0, 1), length(add_to_cart[which(add_to_cart == 1)]),
                                      replace = TRUE, prob = c(0.02, 0.98)),
                               0)) %>%
      dmap_at(c('customer_id', 'channel'), as.character) %>%
      arrange(date) %>%
      mutate(session_id = row_number()) %>%
      arrange(customer_id, session_id)

    # long format: one row per session and event that actually occurred
    df_raw <- melt(df_raw, id.vars = c('customer_id', 'date', 'channel', 'session_id'),
                   value.name = 'trigger', variable.name = 'event') %>%
      filter(trigger == 1) %>%
      select(-trigger) %>%
      arrange(customer_id, date)
And the data sample looks like:
Next up, the data needs to be preprocessed. For example, it would be useful to replace NA/direct channel with the previous one or separate first-time purchasers from current customers, or even create different Sales Funnels based on new and current customers, segments, locations and so on. I will omit this step but you can find some ideas on preprocessing in my previous blogpost.
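As an illustration of the first idea, replacing a direct visit with the previous known channel could look roughly like the sketch below. It runs on a tiny made-up table; the `sessions` data, the `'direct'` label and the `channel_filled` column are assumptions, not part of the simulated data set.

```r
library(dplyr)
library(tidyr)

# Hypothetical sessions; 'direct' marks visits without an identifiable source
sessions <- data.frame(
  customer_id = c('id1', 'id1', 'id1', 'id2', 'id2'),
  session_id  = 1:5,
  channel     = c('channel_1', 'direct', 'channel_3', 'direct', 'channel_2'),
  stringsAsFactors = FALSE
)

sessions_clean <- sessions %>%
  arrange(customer_id, session_id) %>%
  group_by(customer_id) %>%
  # treat 'direct' as missing, then carry the last known channel forward
  mutate(channel_filled = na_if(channel, 'direct')) %>%
  fill(channel_filled, .direction = 'down') %>%
  # a journey that starts with 'direct' has nothing to inherit, so keep it
  mutate(channel_filled = coalesce(channel_filled, channel)) %>%
  ungroup()

sessions_clean
```

Grouping by customer before filling keeps one customer's channel from leaking into the next customer's first session.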
The important thing about this approach is that we only have to attribute the initial marketing channel, one that led the customer through their first step. For instance, a customer initially reviews a product page (step 2, interest) and is brought by channel_1. That means any future product page visits from other channels won’t be attributed until the customer makes a purchase and starts a new Sales Funnel journey.
Therefore, we will filter records for each customer and save the first unique event of each step of the Sales Funnel using the following code:
    ### removing not first events ###
    df_customers <- df_raw %>%
      group_by(customer_id, event) %>%
      filter(date == min(date)) %>%
      ungroup()
Note that this way we assume that all customers are first-time buyers, so every subsequent purchase event is removed by the code above.
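If repeat purchases matter, one possible workaround is to start a new funnel journey after each purchase and keep the first event per journey instead of per customer. The sketch below shows the idea on a made-up event log; `journey_id` is a hypothetical column, and the sketch assumes a purchase is the last event of its journey.

```r
library(dplyr)

# Hypothetical event log for one customer who buys twice
events <- data.frame(
  customer_id = 'id1',
  date  = as.POSIXct('2017-01-01', tz = 'UTC') + 3600 * (1:6),
  event = c('site_visit', 'product_page_visit', 'purchase',
            'site_visit', 'product_page_visit', 'purchase'),
  stringsAsFactors = FALSE
)

first_events <- events %>%
  arrange(customer_id, date) %>%
  group_by(customer_id) %>%
  # a new journey starts right after each purchase
  mutate(journey_id = cumsum(lag(event == 'purchase', default = FALSE)) + 1) %>%
  # keep the first occurrence of each event within each journey
  group_by(customer_id, journey_id, event) %>%
  filter(date == min(date)) %>%
  ungroup()

first_events
```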
Now, we can use the obtained data frame to compute the Sales Funnel’s transition probabilities, the importance of the Sales Funnel steps, and their weighted importance. According to the method, the higher the probability, the lower the value of the channel. Therefore, we will calculate the importance of each step as 1 minus the transition probability. After that, we need to weight (normalize) the importances so that they sum to 1. We will do these calculations with the following code:
    ### Sales Funnel probabilities ###
    sf_probs <- df_customers %>%
      group_by(event) %>%
      summarise(customers_on_step = n()) %>%
      ungroup() %>%
      mutate(# share of all visitors who reached the step
             sf_probs = round(customers_on_step / customers_on_step[event == 'site_visit'], 3),
             # transition probability from the previous step (1 for the first step)
             sf_probs_step = round(customers_on_step / lag(customers_on_step), 3),
             sf_probs_step = ifelse(is.na(sf_probs_step) == TRUE, 1, sf_probs_step),
             # importance of a step and its normalized weight
             sf_importance = 1 - sf_probs_step,
             sf_importance_weighted = sf_importance / sum(sf_importance)
      )
A hint: it can be a good idea to compute Sales Funnel probabilities looking at a limited prior period, for example, 1-3 months. The reason is that customers’ flow or “necks” capacities could vary due to changes on a company’s site or due to changes in marketing campaigns and so on. Therefore, you can analyze the dynamics of the Sales Funnel’s transition probabilities in order to find the appropriate time period.
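One way to look at these dynamics is to compute the step probabilities for separate time windows and compare them. The helper `step_probs_for_window()` and the toy `first_events` log below are hypothetical. The sketch also assigns each event to a window by its own date, which is a simplification: a customer's steps may fall into different windows.

```r
library(dplyr)

funnel_steps <- c('site_visit', 'two_pages_visit', 'product_page_visit',
                  'add_to_cart', 'purchase')

# Hypothetical helper: step-to-step probabilities for events in [from, to)
step_probs_for_window <- function(df, from, to) {
  df %>%
    filter(date >= from, date < to) %>%
    mutate(event = factor(event, levels = funnel_steps)) %>%
    # .drop = FALSE keeps steps that nobody reached in the window
    count(event, name = 'customers_on_step', .drop = FALSE) %>%
    mutate(step_prob = customers_on_step / lag(customers_on_step))
}

# Hypothetical first-event log (one row per customer and step reached)
first_events <- data.frame(
  date  = as.POSIXct(c(rep('2017-01-10', 7), rep('2017-03-10', 7)), tz = 'UTC'),
  event = c(rep('site_visit', 4), rep('two_pages_visit', 2), 'product_page_visit',
            rep('site_visit', 4), 'two_pages_visit', 'product_page_visit', 'add_to_cart'),
  stringsAsFactors = FALSE
)

jan <- step_probs_for_window(first_events,
                             as.POSIXct('2017-01-01', tz = 'UTC'),
                             as.POSIXct('2017-02-01', tz = 'UTC'))
mar <- step_probs_for_window(first_events,
                             as.POSIXct('2017-03-01', tz = 'UTC'),
                             as.POSIXct('2017-04-01', tz = 'UTC'))
```

Comparing `jan` and `mar` shows whether a "neck" has become wider or narrower, which is the kind of shift that would argue for a shorter look-back period.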
I can’t publish a blogpost without visualization. This time I suggest another approach for the Sales Funnel visualization that represents all customer journeys through the Sales Funnel with the following code:
    ### Sales Funnel visualization ###
    df_customers_plot <- df_customers %>%
      group_by(event) %>%
      arrange(channel) %>%
      mutate(pl = row_number()) %>%
      ungroup() %>%
      # center each step horizontally relative to the widest one (site_visit)
      mutate(pl_new = case_when(
        event == 'two_pages_visit' ~ round((max(pl[event == 'site_visit']) - max(pl[event == 'two_pages_visit'])) / 2),
        event == 'product_page_visit' ~ round((max(pl[event == 'site_visit']) - max(pl[event == 'product_page_visit'])) / 2),
        event == 'add_to_cart' ~ round((max(pl[event == 'site_visit']) - max(pl[event == 'add_to_cart'])) / 2),
        event == 'purchase' ~ round((max(pl[event == 'site_visit']) - max(pl[event == 'purchase'])) / 2),
        TRUE ~ 0
      ),
      pl = pl + pl_new)

    df_customers_plot$event <- factor(df_customers_plot$event, levels = c('purchase',
                                                                          'add_to_cart',
                                                                          'product_page_visit',
                                                                          'two_pages_visit',
                                                                          'site_visit'
    ))

    # color palette
    cols <- c('#4e79a7', '#f28e2b', '#e15759', '#76b7b2', '#59a14f',
              '#edc948', '#b07aa1', '#ff9da7', '#9c755f', '#bab0ac')

    # note: in ggplot2 >= 3.4.0, line width is set with `linewidth` rather than `size`
    ggplot(df_customers_plot, aes(x = event, y = pl)) +
      theme_minimal() +
      scale_colour_manual(values = cols) +
      coord_flip() +
      geom_line(aes(group = customer_id, color = as.factor(channel)), size = 0.05) +
      geom_text(data = sf_probs, aes(x = event, y = 1, label = paste0(sf_probs*100, '%')), size = 4, fontface = 'bold') +
      guides(color = guide_legend(override.aes = list(size = 2))) +
      theme(legend.position = 'bottom',
            legend.direction = "horizontal",
            panel.grid.major.x = element_blank(),
            panel.grid.minor = element_blank(),
            plot.title = element_text(size = 20, face = "bold", vjust = 2, color = 'black', lineheight = 0.8),
            axis.title.y = element_text(size = 16, face = "bold"),
            axis.title.x = element_blank(),
            axis.text.x = element_blank(),
            axis.text.y = element_text(size = 8, angle = 90, hjust = 0.5, vjust = 0.5, face = "plain")) +
      ggtitle("Sales Funnel visualization - all customers journeys")
OK, it seems we now have everything needed for the final calculations. In the following code, we will remove all users who didn’t make a purchase. Then, we’ll link the weighted importances of the Sales Funnel steps with sessions by event and, finally, summarize them by channel.
    ### computing attribution ###
    df_attrib <- df_customers %>%
      # removing customers without purchase
      group_by(customer_id) %>%
      filter(any(as.character(event) == 'purchase')) %>%
      ungroup() %>%
      # joining step's importances
      left_join(., sf_probs %>% select(event, sf_importance_weighted), by = 'event') %>%
      group_by(channel) %>%
      summarise(tot_attribution = sum(sf_importance_weighted)) %>%
      ungroup()
As a result, we’ve obtained the number of conversions distributed across marketing channels:
In the same way you can distribute the revenue by channels.
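A minimal sketch of the revenue version is shown below, on made-up data. The step weights, the `revenue` table and the `tot_revenue` column are assumptions for illustration. Each customer's order value is split across channels in proportion to the weights of the steps those channels provided.

```r
library(dplyr)

# Hypothetical weighted importances of the funnel steps (they sum to 1)
step_weights <- data.frame(
  event = c('two_pages_visit', 'product_page_visit', 'add_to_cart', 'purchase'),
  sf_importance_weighted = c(0.5, 0.4, 0.08, 0.02),
  stringsAsFactors = FALSE
)

# Hypothetical first events of two purchasers and the revenue of their orders
first_events <- data.frame(
  customer_id = rep(c('id1', 'id2'), each = 4),
  event   = rep(step_weights$event, times = 2),
  channel = c('channel_1', 'channel_1', 'channel_2', 'channel_2',
              'channel_3', 'channel_2', 'channel_2', 'channel_2'),
  stringsAsFactors = FALSE
)
revenue <- data.frame(customer_id = c('id1', 'id2'), revenue = c(100, 50),
                      stringsAsFactors = FALSE)

df_revenue <- first_events %>%
  left_join(step_weights, by = 'event') %>%
  left_join(revenue, by = 'customer_id') %>%
  group_by(channel) %>%
  # each step receives its weight's share of the customer's order value
  summarise(tot_revenue = sum(sf_importance_weighted * revenue)) %>%
  ungroup()

df_revenue
```

Because the weights sum to 1 for every purchaser, the attributed revenue adds up to the total revenue, which makes a convenient sanity check.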
To wrap up, I want to share the OWOX company blog, where you can read more about the approach: Funnel Based Attribution Model.
In addition, OWOX provides an automated system for Marketing Multi-Channel Attribution based on BigQuery. Therefore, if you are not familiar with R or don’t have a suitable data warehouse, I recommend trying their service.