Marketing Segmentation + Machine Learning
Objective
Use machine learning techniques to segment customers and predict their behavior: hierarchical clustering and K-means clustering to discover segments, and Naive Bayes and Random Forest classification to predict segment membership and who is likely to subscribe to the cable service.
Dataset Summary
​
The data are simulated customer records for a cable company: 300 customers who fall into 4 segments ("Suburb Mix", "Urban hip", "Travelers", "Moving-Up"). The segments are defined by the following features:
- Age
- Gender
- Income
- Kids Count
- Home Owner or Renter
- Subscribed to the cable service or not
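
A minimal sketch of how such a dataset could be simulated in base R. The segment sizes and every distribution parameter below are illustrative assumptions, not the values used in this project, and the names seg_df and simulate_segment are hypothetical.

```r
# Illustrative simulation of a 300-customer data set with 4 segments.
# All sizes and parameters are assumptions made for this sketch.
set.seed(42)

seg_names <- c("Suburb Mix", "Urban hip", "Travelers", "Moving-Up")
seg_sizes <- c(100, 50, 80, 70)            # assumed sizes, sum to 300

# Assumed per-segment parameters, in the same order as seg_names
age_mean    <- c(40, 24, 58, 36)
age_sd      <- c(5, 2, 8, 4)
income_mean <- c(55000, 21000, 64000, 52000)
income_sd   <- c(12000, 5000, 21000, 10000)
kids_lambda <- c(2, 1, 0, 2)               # Poisson mean for kids count
p_male      <- c(0.5, 0.7, 0.7, 0.3)
p_own_home  <- c(0.5, 0.2, 0.7, 0.3)
p_subscribe <- c(0.1, 0.2, 0.05, 0.2)

# Draw all features for segment i and return them as a data frame
simulate_segment <- function(i) {
  n <- seg_sizes[i]
  data.frame(
    age       = rnorm(n, mean = age_mean[i], sd = age_sd[i]),
    gender    = factor(ifelse(rbinom(n, 1, p_male[i]) == 1, "Male", "Female"),
                       levels = c("Female", "Male")),
    income    = rnorm(n, mean = income_mean[i], sd = income_sd[i]),
    kids      = rpois(n, lambda = kids_lambda[i]),
    ownHome   = factor(ifelse(rbinom(n, 1, p_own_home[i]) == 1, "ownYes", "ownNo"),
                       levels = c("ownNo", "ownYes")),
    subscribe = factor(ifelse(rbinom(n, 1, p_subscribe[i]) == 1, "subYes", "subNo"),
                       levels = c("subNo", "subYes")),
    Segment   = factor(seg_names[i], levels = seg_names)
  )
}

# Stack the four segments into one data frame
seg_df <- do.call(rbind, lapply(seq_along(seg_names), simulate_segment))
str(seg_df)
```

Keeping the parameters in parallel vectors makes it easy to change one segment's profile without touching the simulation function.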
​
Content
​
- Descriptive Data Analysis
  - Use commands such as summary(), aggregate(), and xtabs() to summarize and reshape the dataset into a form that is easy to describe.
- Discrete Data Visualization
  - Histograms and bar charts
- Continuous Data Visualization
  - Bar charts, boxplots, and box-and-whisker plots
- Statistical Tests
  - Chi-square test
  - T-test: Testing Group Means
  - ANOVA: Testing Multiple Group Means
  - Bayesian Statistics
- Predict Membership and Segment Using:
  - Hierarchical Clustering
  - K-Means Clustering
  - Naive Bayes Classification
  - Random Forest Classification
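
The descriptive and statistical-test steps listed above can be sketched with base R alone. The data frame toy below is a randomly generated stand-in (an assumption for illustration), so its results say nothing about the real segments; only the function calls are the point.

```r
# Stand-in data: random values, used only to demonstrate the calls
set.seed(1)
n <- 300
toy <- data.frame(
  Segment   = factor(sample(c("Suburb Mix", "Urban hip", "Travelers", "Moving-Up"),
                            n, replace = TRUE)),
  income    = rnorm(n, mean = 50000, sd = 15000),
  ownHome   = factor(sample(c("ownNo", "ownYes"), n, replace = TRUE)),
  subscribe = factor(sample(c("subNo", "subYes"), n, replace = TRUE,
                            prob = c(0.85, 0.15)))
)

# Descriptive analysis
summary(toy)                                              # per-column overview
seg_income <- aggregate(income ~ Segment, data = toy, FUN = mean)  # mean income by segment
sub_by_own <- xtabs(~ subscribe + ownHome, data = toy)    # 2 x 2 contingency table

# Chi-square test: is subscription independent of home ownership?
chi_res <- chisq.test(sub_by_own)

# T-test: does mean income differ between owners and renters?
t_res <- t.test(income ~ ownHome, data = toy)

# ANOVA: does mean income differ across the four segments?
anova_res <- anova(aov(income ~ Segment, data = toy))

print(seg_income)
print(chi_res)
print(t_res)
print(anova_res)
```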
​
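
A rough sketch of the segmentation and prediction steps, assuming a toy data set with two numeric features drawn around four made-up segment centers. Clustering uses hclust() and kmeans() from base R; the Naive Bayes step uses naiveBayes() from e1071 and is skipped if that package is not installed. A Random Forest step is not shown because it would need a package that is not in this project's library list.

```r
# Toy data: four groups around assumed centers (income in $1000s)
set.seed(7)
seg_names <- c("Suburb Mix", "Urban hip", "Travelers", "Moving-Up")
centers   <- data.frame(age = c(40, 24, 58, 36), income = c(55, 21, 64, 52))

toy <- do.call(rbind, lapply(1:4, function(i) {
  data.frame(
    age     = rnorm(75, mean = centers$age[i], sd = 3),
    income  = rnorm(75, mean = centers$income[i], sd = 5),
    Segment = factor(seg_names[i], levels = seg_names)
  )
}))

# Standardize so that age and income contribute on the same scale
x <- scale(toy[, c("age", "income")])

# Hierarchical clustering: build the tree, then cut it into 4 groups
hc        <- hclust(dist(x), method = "complete")
hc_groups <- cutree(hc, k = 4)

# K-means clustering with several random starts
km <- kmeans(x, centers = 4, nstart = 25)

# Compare discovered clusters with the known (simulated) segments
print(table(toy$Segment, hc_groups))
print(table(toy$Segment, km$cluster))

# Naive Bayes classification: supervised prediction of segment membership
if (requireNamespace("e1071", quietly = TRUE)) {
  train_idx <- sample(nrow(toy), size = round(0.7 * nrow(toy)))
  nb        <- e1071::naiveBayes(Segment ~ age + income, data = toy[train_idx, ])
  nb_pred   <- predict(nb, toy[-train_idx, ])
  # Share of held-out customers assigned to their true segment
  print(mean(nb_pred == toy$Segment[-train_idx]))
}
```

Clustering never sees the Segment column, so its cluster numbers are arbitrary labels; the cross-tabulations show how well they line up with the true segments. The classifier, by contrast, is trained on labeled rows and evaluated on held-out ones.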
Tools
​
R + RStudio
​
Library
library(cluster)
library(ggplot2)
library(factoextra)
library(mclust)
library(scatterplot3d)
library(MASS)
library(poLCA)
library(gplots)
library(e1071)
library(RColorBrewer)
library(psych)
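
Before loading the libraries above, it can help to check which ones are installed. This is an optional sketch, not part of the original analysis; the variable names are hypothetical, and the install line is left commented out so nothing is installed without intent.

```r
# Packages listed in this project
pkgs <- c("cluster", "ggplot2", "factoextra", "mclust", "scatterplot3d",
          "MASS", "poLCA", "gplots", "e1071", "RColorBrewer", "psych")

# TRUE for each package that can be loaded on this machine
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)   # uncomment to install from CRAN
} else {
  message("All listed packages are available.")
}
```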
​