Translate Widget

In the competitive business landscape, every organization strives to cultivate a loyal customer base. This loyalty, often a result of trust and satisfaction with a company’s products and services, is a key driver of repeat business. Consequently, companies are increasingly focusing on identifying and enhancing the critical factors influencing customer loyalty and encouraging repeat investments.

This post will guide you on how to use key driver analysis to identify the essential factors that impact customer satisfaction and, consequently, customer loyalty.

Problem statement

A financial services organization intends to improve customer loyalty¹, recognizing its importance in driving repeat investments. However, the organization currently lacks an understanding of the key drivers that significantly influence customer loyalty within the company.

Analysis Question

What are the key factors influencing customer loyalty within the organization?

Analysis objectives

To understand what factors are important in influencing customer loyalty.
To understand which aspects the organization should invest in to improve customer loyalty.

Methodology

The financial services company carried out a survey. Customers who had interacted with a salesperson were asked to evaluate specific elements of their competence. They were also asked to express their likelihood of reinvesting in the company and to specify the potential amount of their reinvestment. The final question concerned the salesperson’s gender.

Data collection method

The survey included seven questions.

Four were specifically related to key salesperson competency elements:

The salesperson understanding your needs.
The salesperson seems confident.
The salesperson has a recommendation.
The salesperson is knowledgeable.

The customers were asked to grade the salesperson on a scale of 1 to 5, where 1 was ‘very dissatisfied’ and 5 was ‘very satisfied’.

Two question related to loyalty and asked customers to put themselves on a scale of 1 to 5 where:

5 Likelihood of reinvesting (Custloyalty):

Definitely going to go elsewhere with my investments.
Thinking about going elsewhere with my money.
Not sure about where I will place my future investments.
Thinking about investing here again.
Definitely will invest here again.

6 How much they may or may not reinvest (InvestMore):

Definitely will not invest more than now here.
Unlikely to invest more than now here.
Planning to invest 0–50 per cent more than now in this organization.
Planning to invest 50–100 per cent more in this organization.
Going to double my investment or more in this organization.

The final question was about the gender of the sales person.

7 Sex of sales person: 1 = female; 2 = male.

Data preparation

Load packages

Code

library(tidyverse)
library(ggtext)
source("helper_functions.R")
source("theme_msd.R")
library(janitor)

The data set includes responses from 2,507 customers and 8 variables.

Rows: 2,507
Columns: 8
$ Sat1             <dbl> 1, 2, 3, 3, NA, 3, 4, 1, 4, 3, 4, 3, 4, 5, 3, 3, 4, 4…
$ Sat2             <dbl> 1, 3, 5, 4, NA, 3, 4, 2, 5, 4, 4, 4, 4, 5, 4, 4, 4, 4…
$ Sat3             <dbl> 1, 3, 4, 4, NA, 3, 4, 1, NA, 3, 4, 3, 3, 4, 5, 3, 4, …
$ Sat4             <dbl> 3, 3, 5, 3, NA, 3, 5, 1, 5, 3, 4, 3, 4, 5, 4, 4, 4, 4…
$ CustSatMean      <dbl> 1.50, 2.75, 4.25, 3.50, NA, 3.00, 4.25, 1.25, 4.67, 3…
$ Custloyalty      <dbl> 4, 2, 2, 3, 5, 3, 4, 1, 3, 3, 4, 4, 4, 4, 1, 4, 4, 4,…
$ INvestMore       <dbl> 3, 1, 3, 3, 3, 3, 2, 1, 3, NA, 2, 3, 3, 3, 3, 2, NA, …
$ SexOfSalesperson <dbl> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…

All variables in the data set are numeric.
For each variable, there are missing values (NAs).

Code

skimr::skim(cssurvey_raw)

Data summary
Name	cssurvey_raw
Number of rows	2507
Number of columns	8
_______________________
Column type frequency:
numeric	8
________________________
Group variables	None

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
Sat1	128	0.95	3.47	1.04	1	3	4.00	4	5	▁▃▇▇▃
Sat2	124	0.95	3.79	0.95	1	3	4.00	4	5	▁▁▅▇▅
Sat3	142	0.94	3.43	1.10	1	3	4.00	4	5	▂▃▇▇▅
Sat4	450	0.82	3.59	0.98	1	3	4.00	4	5	▁▂▆▇▃
CustSatMean	66	0.97	3.56	0.91	1	3	3.67	4	5	▁▂▆▇▆
Custloyalty	338	0.87	3.80	1.10	1	3	4.00	5	5	▁▂▃▇▇
INvestMore	476	0.81	2.93	0.82	1	3	3.00	3	5	▁▁▇▂▁
SexOfSalesperson	121	0.95	1.79	0.41	1	2	2.00	2	2	▂▁▁▁▇

Understanding data structure

Data structure:
- The dataset consists of 2.507 rows, as per the description provided by the data source.
- It contains 8 variables. Seven of them represent the questions from the survey, and an additional variable calculates the mean value of the customer satisfaction survey responses.

Understanding variables

Data Dictionary
Problems and Solutions

Variable	Detail	Unit	Range
Sat1	The salesperson understanding your needs	numeric	1-5
Sat2	The salesperson seems confident	numeric	1-5
Sat3	The salesperson has a recommendation	numeric	1-5
Sat4	The salesperson is knowledgeable	numeric	1-5
CustSatMean	Value derived from mathematical operation (mean) of customer responses	numeric	1-5
Custloyalty	Likelihood of reinvesting	numeric	1-5
INvestMore	How much they may or may not reinvest	numeric	1-5
SexOfSalesperson	Gender	numeric	1-2

Variable	Problem	Solution
Sat1, Sat2, Sat3, Sat4	variables’names are not meaningful	rename variables
CustSatMean, Custloyalty, INvestMore	inconsistent entries	rename variables
All except CustSatMean	variable type is not numeric; they are ordinal and binary	change variable type
All except CustSatMean	variable labels	assign labels

Cleaning Data

Clean name and rename variables

Problem
Solution

CustSatMean, Custloyalty, INvestMore have inconsistent entries (e.g. IN..More).

Clean names and rename these variables

Code

# use janitor::clean_name()

(cssurvey_clean <- cssurvey_raw|>
   rename(invest_more = INvestMore,
          cust_loyalty = Custloyalty)|>
    clean_names())

# A tibble: 2,507 × 8
    sat1  sat2  sat3  sat4 cust_sat_mean cust_loyalty invest_more
   <dbl> <dbl> <dbl> <dbl>         <dbl>        <dbl>       <dbl>
 1     1     1     1     3          1.5             4           3
 2     2     3     3     3          2.75            2           1
 3     3     5     4     5          4.25            2           3
 4     3     4     4     3          3.5             3           3
 5    NA    NA    NA    NA         NA               5           3
 6     3     3     3     3          3               3           3
 7     4     4     4     5          4.25            4           2
 8     1     2     1     1          1.25            1           1
 9     4     5    NA     5          4.67            3           3
10     3     4     3     3          3.25            3          NA
# ℹ 2,497 more rows
# ℹ 1 more variable: sex_of_salesperson <dbl>

Assign labels and rename variables

The variable names ‘Sat1’, ‘Sat2’, ‘Sat3’, and ‘Sat4’ lack clarity. Although they are numeric in the data set, the data source classifies them as ordinal variables.
The variable ‘sex_of_salesperson’ could be more appropriately named as ‘gender’.

Create new variables based on the existing variables, rename variables, assign labels, and change variable type in factor.

Code

# Create new variables, assign labels, change variable type in factor

(cssurvey_clean <- cssurvey_clean |>
  mutate(undst_cust_need = factor(sat1,
                         levels = c(1:5),
                         labels = c("Very Dissatisfied", "Dissatisfied", "Neutral", "Satisfied", "Very Satisfied")),
         sales_confidence = factor(sat2,
                         levels = c(1:5),
                         labels = c("Very Dissatisfied", "Dissatisfied", "Neutral", "Satisfied", "Very Satisfied")),
         give_recomm = factor(sat3,
                         levels = c(1:5),
                         labels = c("Very Dissatisfied", "Dissatisfied", "Neutral", "Satisfied", "Very Satisfied")),
         product_know = factor(sat4,
                         levels = c(1:5),
                         labels = c("Very Dissatisfied", "Dissatisfied", "Neutral", "Satisfied", "Very Satisfied")),
         gender = factor(sex_of_salesperson,
                           levels = c(1:2),
                           labels = c("female", "male"))))

# A tibble: 2,507 × 13
    sat1  sat2  sat3  sat4 cust_sat_mean cust_loyalty invest_more
   <dbl> <dbl> <dbl> <dbl>         <dbl>        <dbl>       <dbl>
 1     1     1     1     3          1.5             4           3
 2     2     3     3     3          2.75            2           1
 3     3     5     4     5          4.25            2           3
 4     3     4     4     3          3.5             3           3
 5    NA    NA    NA    NA         NA               5           3
 6     3     3     3     3          3               3           3
 7     4     4     4     5          4.25            4           2
 8     1     2     1     1          1.25            1           1
 9     4     5    NA     5          4.67            3           3
10     3     4     3     3          3.25            3          NA
# ℹ 2,497 more rows
# ℹ 6 more variables: sex_of_salesperson <dbl>, undst_cust_need <fct>,
#   sales_confidence <fct>, give_recomm <fct>, product_know <fct>, gender <fct>

Code

cssurvey_clean|>
  skimr::skim()

Data summary
Name	cssurvey_clean
Number of rows	2507
Number of columns	13
_______________________
Column type frequency:
factor	5
numeric	8
________________________
Group variables	None

Variable type: factor

skim_variable	n_missing	complete_rate	ordered	n_unique	top_counts
undst_cust_need	128	0.95	FALSE	5	Sat: 828, Neu: 747, Ver: 393, Dis: 315
sales_confidence	124	0.95	FALSE	5	Sat: 1014, Neu: 620, Ver: 562, Dis: 125
give_recomm	142	0.94	FALSE	5	Sat: 780, Neu: 717, Ver: 417, Dis: 313
product_know	450	0.82	FALSE	5	Sat: 803, Neu: 648, Ver: 360, Dis: 183
gender	121	0.95	FALSE	2	mal: 1884, fem: 502

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
sat1	128	0.95	3.47	1.04	1	3	4.00	4	5	▁▃▇▇▃
sat2	124	0.95	3.79	0.95	1	3	4.00	4	5	▁▁▅▇▅
sat3	142	0.94	3.43	1.10	1	3	4.00	4	5	▂▃▇▇▅
sat4	450	0.82	3.59	0.98	1	3	4.00	4	5	▁▂▆▇▃
cust_sat_mean	66	0.97	3.56	0.91	1	3	3.67	4	5	▁▂▆▇▆
cust_loyalty	338	0.87	3.80	1.10	1	3	4.00	5	5	▁▂▃▇▇
invest_more	476	0.81	2.93	0.82	1	3	3.00	3	5	▁▁▇▂▁
sex_of_salesperson	121	0.95	1.79	0.41	1	2	2.00	2	2	▂▁▁▁▇

Data Analysis

In order to achieve the objectives of the analysis, I will conduct a key driver analysis³. One of the goals of this analysis is to understand the factors influencing customer loyalty. I will start by reviewing the survey response percentages in the customer data.

Percentage of Responses in the Customer Survey Data

Surveyed customers expressed satisfaction and high satisfaction with the ‘The salesperson seems confident’. However, they were less satisfied with ‘The salesperson understanding your needs’ and ‘The salesperson has a recommendation’.

Code

# Get the column names of the initial dataset
initial_order <- colnames(cssurvey_clean)[9:12]


# Calculate percentages for each category for each survey question

kda <- cssurvey_clean |>
  drop_na()|>
  pivot_longer(cols = 9:12, names_to = "question", values_to = "value" )|>
  mutate(question= factor(question,
                          levels= initial_order))|>
  count(question, value) |>
  group_by(question)|>
  mutate(prtc = round(n/sum(n) * 100),
         prtc = ifelse(row_number() == n(), 100 - sum(prtc[-n()]), prtc)) |>
  pivot_wider(id_cols = question, names_from = value, values_from = prtc)

kda$question <- c("The salesperson \nunderstanding your needs",
                 "The salesperson \nseems confident",
                 "The salesperson \nhas a recommendation",
                 "The salesperson \nis knowledgeable")

kda|>
  kableExtra::kable(col.names = c("Question", "Very Dissatisfied", "Dissatisfied", "Neutral", "Satisfied", "Very Satisfied"),
                    align = "lccccc",
                    caption = "Customer Satisfaction with Salesperson Competencies (%)")

Customer Satisfaction with Salesperson Competencies (%)
Question	Very Dissatisfied	Dissatisfied	Neutral	Satisfied	Very Satisfied
The salesperson understanding your needs	4	14	33	35	14
The salesperson seems confident	2	6	27	43	22
The salesperson has a recommendation	5	14	31	33	17
The salesperson is knowledgeable	3	9	33	39	16

Code

theme_set(theme_msd() + theme(
  axis.title.y = element_blank(),
  axis.ticks.y = element_blank(),
  axis.text.y = element_text(color = GRAY3, size = 10),
  panel.border = element_blank(),
  axis.line = element_line(),
  axis.title.x = element_text(hjust = 0.03, color = GRAY6, size = 10),
  axis.text.x = element_text(size = 8),
  plot.subtitle = element_markdown(hjust = 0.65),
  axis.line.y = element_blank()
))


df <- kda |>
  rename(Item = "question") |>
  pivot_longer(cols = -Item, names_to = "Answer", values_to = "Value") |>
  mutate(
    Answer = factor(Answer, levels = c("Very Dissatisfied", "Dissatisfied", "Neutral", "Satisfied", "Very Satisfied")),
    Value = as.numeric(Value) /100
  )

df$Answer <- fct_rev(df$Answer)



color_scale <- c(
  "Very Dissatisfied" = GRAY2,
  "Dissatisfied" = GRAY2,
  "Neutral" = GRAY9,
  "Satisfied" = GRAY5,
  "Very Satisfied" = GRAY5)


formatted_subtitle <- paste0(
  "<span style='color:", color_scale[1], "'>**", names(color_scale)[1], "**</span>",
  " | ",
  "<span style='color:", color_scale[2], "'>**", names(color_scale)[2], "**</span>",
  " | ",
  "<span style='color:", color_scale[3], "'>**", names(color_scale)[3], "**</span>",
  " | ",
  "<span style='color:", color_scale[4], "'>**", names(color_scale)[4], "**</span>",
  " | ",
  "<span style='color:", color_scale[5], "'>**", names(color_scale)[5], "**</span>"
)

pt <- df |>
  ggplot(aes(y = Item, x = Value, fill = Answer)) +
  geom_bar(stat = "Identity", width = 0.65, color = "white") +
  scale_fill_manual(values = color_scale, guide = "none") +
  labs(
    title = "Customer Satisfaction with Salesperson Competencies \n",
    subtitle = formatted_subtitle,
    x = "\nPercent of total"
  ) +
  scale_x_continuous(position = "top", 
                     breaks = seq(0,1,by = 0.2),
                     labels = scales::percent_format(accuracy = 1)) +
  lemon::coord_capped_cart(top = "both")

pt |>
  save_and_show_plot(width = 6, height = 4, "FIG01.png")

Correlation

Another approach to analyzing the customer survey data is through correlation. Correlation is a statistical measure of the relationship between two variables, how they are interconnected.

Among the four questions on the survey, at this stage, I aim to identify which ones have the strongest correlation with the question related to customer loyalty: the likelihood of reinvestment.

Code

# Correlation for several pairs of variables
     
cor_cust_loyalty <- cssurvey_clean |>
  select(sat1, sat2, sat3, sat4, cust_loyalty)|>
  cor(method = "spearman",
      use = "pairwise.complete.obs")|>
  round(digits = 2)

survey_quest = c("The salesperson understanding your needs",
                 "The salesperson seems confident",
                 "The salesperson has a recommendation",
                 "The salesperson is knowledgeable")

                        
cor_to_cust_loyalty = c(0.49, 0.37, 0.49, 0.40)
                            
(cor_matrix <- tibble(survey_quest, cor_to_cust_loyalty)|>
    arrange(desc(cor_to_cust_loyalty))|>
  kableExtra::kable(col.names = c("Survey Question", "Correlation to \nCustomer Loyalty"),
                    align = "lr",
                    caption = "Correlation of survey questions to customer loyalty"))

Correlation of survey questions to customer loyalty
Survey Question	Correlation to Customer Loyalty
The salesperson understanding your needs	0.49
The salesperson has a recommendation	0.49
The salesperson is knowledgeable	0.40
The salesperson seems confident	0.37

‘The salesperson understanding your needs’ , ‘The salesperson has a recommendation’ , and ‘The salesperson is knowledgeable’ are the most highly correlated to the question related to loyalty (0.49, respectively 0.40). The company might see these areas to focus in its efforts to improve customer loyalty.

Consolidating the Percentage and Correlation

In this step, I will integrate the customer satisfaction questions and their correlation to customer loyalty into a unified view for enhanced insights. To calculate the customer’s importance of loyalty, I calculated the percentage of dissatisfied and very dissatisfied customers in a single metric, percent dissatisfaction. The ‘Customer Loyalty Importance’ is a calculated metric that combines percent dissatisfaction and the correlation to customer loyalty in one measure (see the table below). It allows us to rank survey questions by importance.

Code

# we need to consolidate data from "kda" data set and "cor_matrix" data set

# define cor_matrix as data frame

cor_matrix <- as.data.frame(cor_matrix)

cor_matrix <- tibble(survey_quest, cor_to_cust_loyalty)


kda_cor <- bind_cols(kda, cor_matrix)
  
            
kda_cor <- kda_cor |>
  mutate(percent_dissatisfied = Dissatisfied + `Very Dissatisfied`,
         percent_satisfied = Satisfied + `Very Satisfied`,
         percent_neutral = Neutral) |>
  group_by(percent_dissatisfied, cor_to_cust_loyalty) |>
  mutate(cust_loyalty_importance = percent_dissatisfied*cor_to_cust_loyalty)|>
  ungroup()
 

kda_consolidate <- kda_cor |>
  select(question, percent_dissatisfied, cor_to_cust_loyalty, cust_loyalty_importance)|>
  arrange(desc(cust_loyalty_importance))|>
  kableExtra::kable(col.names = c("Survey Question", "Percent Dissatisfaction", "Correlation to \nCustomer Loyalty", "Customer Loyalty \nImportance"),
                    align = "lrrr",
                    caption = "Customer Satisfaction items ranked by importance")
  
kda_consolidate

Customer Satisfaction items ranked by importance
Survey Question	Percent Dissatisfaction	Correlation to Customer Loyalty	Customer Loyalty Importance
The salesperson has a recommendation	19	0.49	9.31
The salesperson understanding your needs	18	0.49	8.82
The salesperson is knowledgeable	12	0.40	4.80
The salesperson seems confident	8	0.37	2.96

What is the data from the table telling us?

Code

theme_set(theme_msd() + theme(
  axis.title.y = element_blank(),
  axis.ticks.y = element_blank(),
  axis.text.y = element_text(color = GRAY3, size = 10),
  panel.border = element_blank(),
  axis.line = element_line(),
  axis.title.x = element_text(hjust = 0.03, color = GRAY6, size = 9),
  axis.text.x = element_text(color = GRAY3, size = 10),
  plot.subtitle = element_markdown(hjust = 0),
  axis.line.y = element_blank()
))


kda_consolidate_plot <- kda_cor |>
  mutate(question = fct_reorder(question, cust_loyalty_importance))|>
  ggplot(aes(x = cust_loyalty_importance, y = question))+
  geom_bar(stat = "identity", width = 0.6, color = GRAY2)+
  geom_text(aes(label = cust_loyalty_importance), hjust = -0.15, size = 3.5)+
  labs(
    title = "The sales person having a recommendation and understanding customers \nneeds are the issues that require some improvments \n",
    subtitle = "",
    x = "Customer Loyalty Importance"
  )
  
  
kda_consolidate_plot |>
  save_and_show_plot(width = 6.5, height = 4, "FIG02.png")

Key Driver Analysis Results

The findings of the key driver analysis are presented below in a key driver quadrant. It allows us to see the two intersecting information points: the correlation of the four survey questions to customer loyalty, our chosen indicator to understand (axis x), and the satisfaction of customers’ responses (axis y).

Key drivers for customer loyalty

Improve Weakness: The correlation to customer loyalty is 0.49, and customer satisfaction is low (dissatisfied and very dissatisfied). In the lower right quadrant of the key driver analysis chart, customer satisfaction with the salesperson competency “has a recommendation” is 19%, and with the salesperson competency “understanding your needs” is 18%. On the upper side of this quadrant, it is essential to note the significant percentage of neutral responses for these two questions.
Leverage Great: The correlation to customer loyalty is 0.49, and customer satisfaction is high (satisfied and very satisfied). In the upper right quadrant, customer satisfaction with salesperson competency “has a recommendation” is 50%, and with salesperson competency “understanding your needs” is 49%.
Disregard: The remaining survey questions have a mix of positive and negative customer responses. However, they are not as important for customer loyalty as the other two questions.

Key driver analysis explores a wide range of measures to uncover what is most important in influencing the chosen objective. It helps you understand where to focus your energy and resources to obtain the results you are looking for.

Code

library(glue)
library(grid)

# Data ----
# |- KDA Quadrant ----

# First, I organize columns and rows in the data frame to make easier 
# to work with. I made a column "sum_satisfaction" with 3 categories where
# I summed up the percentage of categories satisfied and very satisfied, dissatisfied and very dissatisfied and neutral to keep the plot simple to visualize

kda_quadrant <- kda_cor |>
    pivot_longer(cols = 2:6, names_to = "satisfaction", values_to = "score")

kda_quadrant <- kda_quadrant|>
  pivot_longer(cols = c(4:6), names_to = "sum_satisfaction", values_to = "percent_score")

# Set the theme for the plot

theme_set(theme_msd()+ theme(
  axis.text.y = element_text(color = GRAY6, size = 9),
  axis.title.x = element_text(color = GRAY6, size = 10, margin = margin(6, 0, 15, 0, "pt")),
  axis.title.y = element_text(color = GRAY6, size = 10),
  axis.text.x = element_text(color = GRAY6, size = 9, margin = margin(0, 6, 0, 15, "pt"))
))

# Set seed for reproducible plot

set.seed(123)


# Plot ----
# |- Palette ----
survey_quest_palette <- list("The salesperson has a recommendation" = "#1A242F",
                             "The salesperson understanding your needs" = "#94989D",
                             "The salesperson seems confident" = "#1A242F",
                             "The salesperson is knowledgeable" = "#1A242F",
                             "dark_text" = "#1A242F",
                             "light_text" = "#94989D")

# |- Shape ----

survey_quest_shape <- list("The salesperson has a recommendation" = 22,
                             "The salesperson understanding your needs" = 21,
                             "The salesperson seems confident" = 0,
                             "The salesperson is knowledgeable" = 1)

# Create text grobs for the labels with a specific font size and color

label1 <- textGrob("Very Dissatisfied", rot = 90, gp = gpar(fontsize = 9, col = GRAY6) )
label2 <- textGrob("Neutral", rot = 90, gp = gpar(fontsize = 9, col = GRAY6))
label3 <- textGrob("Very Satisfied", rot = 90, gp = gpar(fontsize = 9, col = GRAY6))
label4 <- textGrob("Low", gp = gpar(fontsize = 9, col = GRAY6))
label5 <- textGrob("High", gp = gpar(fontsize = 9, col = GRAY6))



# | Quadrant Plot ----
# | - Quadrant Base Panel ----

quadrant_base <- kda_quadrant|>
  ggplot(aes(x = cor_to_cust_loyalty,
             y = percent_score))+
  scale_x_continuous(expand = c(0,0),limits = c(0.3,0.6))+
  scale_y_continuous(expand = c(0,0),
                     limits = c(0,80),
                     breaks = c(0, 30, 50, 80), 
                     labels = c("0" = "0", "30" = "30%", "50" = "50%", "80" = "80%"))+

 # Add the labels to the plot using annotation_custom()
   
 
  annotation_custom(grob = label1, xmin = 0.292, xmax = 0.292, ymin = 9, ymax = 9) +
  annotation_custom(grob = label2, xmin = 0.292, xmax = 0.292, ymin = 36, ymax = 36) +
  annotation_custom(grob = label3, xmin = 0.292, xmax = 0.292, ymin = 65, ymax = 65) +
  annotation_custom(grob = label4, ymin = -2.5, ymax = -2.5, xmin = 0.35, xmax = 0.35)+
  annotation_custom(grob = label5, ymin = -2.5, ymax = -2.5, xmin = 0.55, xmax = 0.55)+ 
  coord_cartesian(clip = "off")+
  theme(axis.line = element_blank())
 
   


# | - Quadrant Chart ----

# Set axis labels and title

labelled_quadrant <-  quadrant_base +
    labs(x = "Correlation to customer loyalty",
         y = "Customer satisfaction survey",
        title = "Customer Loyalty Key Driver Report\n")+
   theme(axis.title.x = element_text(hjust = 0.5,
                                     vjust = 4,
                                     face = "bold"),
         axis.title.y = element_text(hjust = 0.5,
                                     vjust = 0,
                                     face = "bold"),
        plot.title = element_text(color = GRAY2, size= 12),
        plot.title.position = "plot"
         )+
 

  
# add four rectangle type of annotations to fill the four quadrant areas
# create a border and split lines for the quadrant chart
  
  annotate("rect", xmin = 0.45, xmax = 0.60, ymin = 40, ymax = 80, fill= "#F8F9F9")+
  annotate("rect", xmin = 0.3, xmax = 0.60, ymin = 0, ymax = 80 , fill= "#F8F9F9")+
  annotate("rect", xmin = 0.45, xmax = 0.60, ymin = 40, ymax = 80, fill= "white")+
  annotate("rect", xmin = 0.3, xmax = 0.60, ymin = 0, ymax = 80, fill= "white")+
  theme(panel.border = element_rect(colour = "lightgrey", fill = NA, linewidth = 3))+
  geom_hline(yintercept= 40, color = "lightgrey", linewidth =1.5)+
  geom_vline(xintercept= 0.45, color = "lightgrey", linewidth =1.5) +
          
# add label to quadrant area

  geom_label(aes(x = 0.375,
                 y = 77,
                 label = "DISREGARD"),
             label.padding = unit(2, "mm"),
             fill = "lightgrey",
             color = "white")+
  geom_label(aes(x = 0.525,
                 y = 77,
                 label = "LEVERAGE GREAT"),
             label.padding = unit(2, "mm"),
             fill = "lightgrey",
             color = "white")+
  geom_label(aes(x = 0.525,
                 y = 3,
                 label = "IMPROVE WEAKNESS"),
             label.padding = unit(2, "mm"),
             fill = "lightgrey",
             color = "white")+
  
# draw the questions survey to the chart with the position corresponding to their “customer satisfaction percent” value and “correlation to customer” value.
  
   geom_jitter(aes(x = cor_to_cust_loyalty,
             y = percent_score,
             color = survey_quest,
             shape = survey_quest,
             fill = survey_quest),
         size = 5,
         alpha = 0.8)+
  theme( legend.position = "top",
         legend.box = "vertical",
         legend.margin = margin(unit(c(0,-0.45,0,0), "cm")),
         legend.title = element_blank(),
         legend.text = element_text(color = GRAY6, size = 9)) +
  guides(shape = guide_legend(nrow = 2, byrow = TRUE),
         fill = guide_legend(nrow = 2, byrow = TRUE),
         color = guide_legend(nrow = 2, byrow = TRUE))+
  scale_colour_manual(values = survey_quest_palette) +
  scale_shape_manual(values = survey_quest_shape) +
  scale_fill_manual(values = survey_quest_palette)
   


labelled_quadrant

Recommendation

To improve customer loyalty, the company could consider:

Improving salespeople’s ability to understand customer needs. This may include active listening, asking relevant questions, and maintaining accurate records.
Providing training for sales staff on making informed recommendations. This would likely result in a positive return on investment.
Identifying employees skilled at understanding and advising customers from raw data and sharing these skills within the team could also be beneficial.
Learning more about clients who declare themselves neutral and how to convert them into satisfied customers.

Summary

In this post, I shared with you my key driver analysis. Here are the steps I followed:

First, I decided what I wanted to understand: What are the key factors influencing customer loyalty within the organization?
From the data available, I selected the measure to understand: customer loyalty (likelihood of reinvesting).
Understand the data.
Clean the data.
Analyze survey data, report the percentage of responses, and run a correlation to establish the strength of the relationship between potential factors and customer loyalty.
Use the correlation coefficient and the percentage of satisfied and dissatisfied responses to analyze the data.
Consider the strength of each factor’s relationship with customer loyalty and current company performance on those factors to focus on actions that will have the most impact.

Footnotes

The case study and the data description are sourced from the book Predictive HR Analytics : Mastering the HR Metric by Dr. Martin R. Edwards and Kirsten Edwards↩︎
Data available here ↩︎
I learned about this analysis in the book People Analytics For Dummies by Mike West↩︎

Problem statement

Analysis Question

Analysis objectives

Methodology

Data collection method

Data preparation

Load packages

Data Set2

Understanding data structure

Understanding variables

Cleaning Data

Clean name and rename variables

Assign labels and rename variables

Data Analysis

Percentage of Responses in the Customer Survey Data

Correlation

Consolidating the Percentage and Correlation

Key Driver Analysis Results

Recommendation

Summary

Footnotes

Data Set²