Analyzing Credit Card Purchase Patterns Using Clustering (2024)

Using the various clustering models to assess patterns in Credit Card purchases and then make recommendations for the client

Introduction to the Problem

Before diving into the clustering methods, let us take a short look at the problem we are trying to solve. A credit card company has, over time, collected data about its customers, such as their balances, purchases, cash advances, and credit lines. The team was tasked with drawing meaningful insights from this data and then devising strategies the company can use to target customers and increase credit card sales and, in turn, revenue.

After generating a brief description of the dataset, we see that the data looks like this:

[Figure: brief description of the dataset]
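
A description like this can be produced with pandas. The sketch below is only an illustration: the column names (BALANCE, PURCHASES, CASH_ADVANCE, CREDIT_LIMIT) and the randomly generated values are assumptions standing in for the real dataset, which is not reproduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real dataset: column names and values are assumed.
rng = np.random.default_rng(0)
customers = pd.DataFrame({
    "BALANCE": rng.gamma(shape=2.0, scale=800.0, size=500),
    "PURCHASES": rng.gamma(shape=1.5, scale=600.0, size=500),
    "CASH_ADVANCE": rng.gamma(shape=1.0, scale=900.0, size=500),
    "CREDIT_LIMIT": rng.choice([1000, 3000, 6000, 12000], size=500),
})

# count, mean, std, min, quartiles and max for every numeric column
summary = customers.describe()
print(summary.round(1))

# Missing values per column; worth checking before any distance-based clustering
print(customers.isna().sum())
```

With the real data, the DataFrame would be loaded from the company's file instead of being generated.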

We start by analyzing various clustering methods and then provide our recommendations to the client. Let’s first take a brief look at clustering and K-Means.

Introduction to Clustering

Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups. The notion of similarity between data points therefore plays an important role in clustering; for numeric data it is usually measured with a distance such as the Euclidean distance. Clustering can be flat or hierarchical, and both kinds are implemented in Python in scikit-learn’s cluster package (sklearn.cluster).
Flat clustering divides the data points into a set of clusters without relating the clusters to each other. The goal is to create clusters whose members are similar to one another and dissimilar from the members of other clusters. In hierarchical clustering, a hierarchy of nested clusters is created, and the number of clusters does not need to be specified beforehand; it can instead be read from the dendrogram plot.
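
To make the idea of similarity concrete, here is a minimal sketch that compares three hypothetical customers by Euclidean distance. The feature values and the helper name euclidean are made up for illustration; they do not come from the dataset.

```python
import numpy as np

# Hypothetical customers described by (balance, purchases); values are made up.
a = np.array([1000.0, 200.0])
b = np.array([1100.0, 250.0])
c = np.array([5000.0, 3000.0])

def euclidean(u, v):
    """Straight-line distance between two feature vectors."""
    return float(np.sqrt(np.sum((u - v) ** 2)))

d_ab = euclidean(a, b)
d_ac = euclidean(a, c)
print(f"a-b: {d_ab:.1f}, a-c: {d_ac:.1f}")
```

Customer a is much closer to b than to c, so a distance-based method would tend to place a and b in the same cluster.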

Let’s implement the K-Means clustering method first, and then assess what can be the recommendations.

K-Means Clustering

K-Means clustering is a type of flat clustering where we initialize the model with a set number of clusters, k. The model chooses k initial centroids, assigns each data point to its nearest centroid, and then repositions each centroid at the mean of the points assigned to it. These two steps repeat until convergence, that is, until the cluster assignments are stable.
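
The assign-and-reposition loop can be sketched in a few lines of NumPy. This is a bare-bones illustration on synthetic data, not the scikit-learn implementation used in the project; the function name kmeans and its parameters are chosen here for the example.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute centroids."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Squared distance from every point to every centroid, shape (n, k)
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if the cluster is empty)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving, so the algorithm has converged
        centroids = new_centroids
    return labels, centroids

# Synthetic two-feature data with three well-separated groups (assumed, for illustration)
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc, 0.5, size=(50, 2)) for loc in ([0, 0], [5, 5], [0, 5])])
labels, centroids = kmeans(X, k=3)
print(centroids.round(2))
```

Because the starting centroids are random, a different seed can lead to a different final partition.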

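Let’s now make an Elbow plot to assess what the optimal number of clusters should be. The Elbow plot, sometimes also called a Scree plot, shows the within-cluster sum of squares (WCSS) for each candidate number of clusters; the point where the curve bends and further clusters bring little improvement suggests a suitable k. The Elbow plot: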

[Figure: Elbow plot]

So, looking at the plot above, we assume that the best number of clusters for our analysis would be 4. Now, using 4 clusters, and applying K-Means clustering, our result turns out to be:

[Figure: K-Means clusters with k = 4, random initialization]
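
The numbers behind an Elbow plot, and the final fit with 4 clusters, might be computed roughly as follows. The data here is synthetic (generated with make_blobs) and the range of k from 1 to 10 is an assumption; only the choice of 4 clusters and the random initialization come from the text.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the customer features (assumed, not the real data)
X, _ = make_blobs(n_samples=400, centers=4, cluster_std=0.8, random_state=0)

# Within-cluster sum of squares (scikit-learn calls it inertia_) for k = 1..10
k_values = list(range(1, 11))
wcss = []
for k in k_values:
    model = KMeans(n_clusters=k, init="random", n_init=10, random_state=0).fit(X)
    wcss.append(model.inertia_)

for k, value in zip(k_values, wcss):
    print(k, round(value, 1))

# Fit the final model with the k read off the elbow (4 in the article)
final_model = KMeans(n_clusters=4, init="random", n_init=10, random_state=0)
labels = final_model.fit_predict(X)
```

Plotting wcss against k_values with any plotting library gives the Elbow plot itself.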

For the above K-Means clustering, we used the random initialization method. The problem with random initialization is that the initial centroids are chosen at random, so the final clusters and the within-cluster sum of squares can change from run to run, and an unlucky start can converge to a poor solution. This is called the Random Initialization Trap and should be avoided. To avoid it, use the k-means++ initialization method, which still picks the initial centroids randomly but favors points that are far from the centroids already chosen, and hence usually gives better and more consistent results.
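
A rough way to see the difference between the two initialization methods is to compare the final WCSS across several seeds. The sketch below uses synthetic data; the seeds and the n_init values are assumptions chosen for the illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data standing in for the customer features (assumed)
X, _ = make_blobs(n_samples=400, centers=4, cluster_std=0.8, random_state=0)

# Random initialization, one run per seed: the final WCSS can differ between runs
random_wcss = [
    KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X).inertia_
    for seed in range(5)
]

# k-means++ spreads the initial centroids out; it is still random, so the seed is
# fixed for reproducibility and the best of 10 starts is kept
kpp = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=0).fit(X)
kpp_again = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=0).fit(X)

print("random init WCSS per seed:", [round(v, 1) for v in random_wcss])
print("k-means++ WCSS:", round(kpp.inertia_, 1))
```

Fixing random_state is what makes a run repeatable; k-means++ on its own only makes a poor starting configuration less likely.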

A visualization of the clustering that uses the k-means++ init method is given below:

[Figure: K-Means clusters with k-means++ initialization]

We can see that the K-Means++ initialization method does work better than the random init method.

Hierarchical Clustering

Moving on to other types of clustering methods, we look at hierarchical clustering. This method does not need the number of clusters to be specified beforehand; instead, the number is chosen with the help of a dendrogram, a tree-shaped plot that shows the order in which clusters are merged and the distance at which each merge happens. Agglomerative hierarchical clustering begins with each data point in its own cluster and keeps combining the two closest clusters until a single cluster is reached. To decide where to stop before that single cluster, a dendrogram criterion is generally used: find the longest vertical line that is not crossed by any horizontal merge line, and draw a horizontal cut through it. The number of vertical lines this cut crosses is the number of clusters in the final model.
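
The cut criterion can also be applied numerically. The following sketch uses SciPy's linkage and fcluster on synthetic data and picks the number of clusters from the largest jump between consecutive merge heights; the use of SciPy, the Ward linkage, and the data itself are assumptions made for this illustration rather than the project's exact code.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Three synthetic, well-separated groups of 50 points each (assumed, for illustration)
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers])

# Each row of Z records one merge; column 2 holds the height at which it happened
Z = linkage(X, method="ward")
heights = Z[:, 2]

# The largest jump between consecutive merge heights corresponds to the longest
# vertical stretch of the dendrogram that no merge interrupts
gaps = np.diff(heights)
i = int(np.argmax(gaps))
n_clusters = len(X) - (i + 1)  # clusters remaining after merges 0..i

# Cut the tree so that it yields that many flat clusters
labels = fcluster(Z, t=n_clusters, criterion="maxclust")
print("clusters suggested by the largest gap:", n_clusters)
```

To draw the tree itself, scipy.cluster.hierarchy.dendrogram(Z) can be used together with matplotlib.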

A point to note here is that K-Means searches for the k centroids that minimize the within-cluster sum of squared distances, while agglomerative clustering with Ward linkage greedily merges, at each step, the pair of clusters whose union least increases the within-cluster variance. HC can therefore give sharper-looking clusters, though they may be suboptimal because earlier merges are never revisited.

Hierarchical clustering can be implemented in Python using AgglomerativeClustering() from the sklearn.cluster package. The dendrogram for our analysis looks like:

[Figure: dendrogram]

From the dendrogram above, we can choose the number of clusters to be 3. After fitting the model with 3 clusters, the clusters look like:

[Figure: hierarchical clustering result with 3 clusters]
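
A fit of this kind might look like the sketch below. The data is synthetic and the Ward linkage is an assumption (it is scikit-learn's default); only the choice of 3 clusters comes from the text.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Synthetic stand-in for the customer features (assumed)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)

# Ward linkage merges the pair of clusters that least increases within-cluster variance
hc = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = hc.fit_predict(X)

# Size of each cluster
for cluster_id in sorted(set(labels.tolist())):
    print(cluster_id, int((labels == cluster_id).sum()))
```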

Density-Based Clustering

Density-based clustering methods group points according to how densely they are packed. DBSCAN, short for Density-Based Spatial Clustering of Applications with Noise, treats dense regions as clusters and labels points in sparse regions as noise, rather than partitioning all points around centroids as K-Means does or merging them by linkage distance as HC does. Upon fitting DBSCAN to the credit card dataset and then visualizing the clusters, we get:

[Figure: DBSCAN clusters]

Looking at the DBSCAN result above, the clusters have a more irregular, non-linear shape. Density-based methods of this type are therefore a good choice when the clusters are not compact, roughly spherical groups that centroid-based methods such as K-Means can separate.
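
The behavior on non-spherical clusters is easiest to see on a toy dataset. The sketch below runs DBSCAN on two interleaved half-moons generated with make_moons; the dataset and the eps and min_samples values are assumptions chosen for this example, not the settings used on the credit card data.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaved half-moons: a shape that centroid-based methods split poorly
X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)

# eps is the neighborhood radius, min_samples the number of points needed within
# that radius for a point to count as part of a dense region
db = DBSCAN(eps=0.2, min_samples=5)
labels = db.fit_predict(X)

# DBSCAN marks noise points with the label -1
n_clusters = len(set(labels.tolist()) - {-1})
n_noise = int((labels == -1).sum())
print("clusters:", n_clusters, "noise points:", n_noise)
```

On real data, eps and min_samples usually need tuning, and features on very different scales are typically standardized first.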

Marketing Insights from Data

The following can be some inputs for marketing strategies:

High Balance, High Purchase: These customers made expensive purchases, but they also had higher balances to support them. They also made large payments and can be a target for market research.

High Balance, Low Purchase (higher purchase values): These customers had higher balances but made lower purchases; they have medium or high credit limits and took out large cash advances.

Medium Balance, Medium Purchase: These customers had neither low nor high balances, and their purchases were likewise of medium size.

Frugal Customers (low balance, low purchase): These customers made the smallest purchases, and since their credit limit was also low, they likely did not purchase frequently. They may therefore be at risk of churning, and marketing strategies can be devised to reduce this churn.

A marketing strategy tailored to these four groups can therefore be highly effective in solving the problem.
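
Segment descriptions like these are typically read off a per-cluster profile of the features. The sketch below shows one way to build such a profile with pandas; the column names, the generated values, and the scaling step are all assumptions for illustration and will not reproduce the four segments described here.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer data: column names and values are assumed
rng = np.random.default_rng(1)
customers = pd.DataFrame({
    "BALANCE": rng.gamma(2.0, 800.0, size=400),
    "PURCHASES": rng.gamma(1.5, 600.0, size=400),
    "CREDIT_LIMIT": rng.gamma(3.0, 1500.0, size=400),
})

# Standardizing first is an assumption made here so that no single feature
# dominates the distance; the article does not say whether this was done
scaled = StandardScaler().fit_transform(customers)
customers["cluster"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scaled)

# Average feature values and size of each cluster
profile = customers.groupby("cluster").mean().round(0)
profile["n_customers"] = customers["cluster"].value_counts().sort_index()
print(profile)
```

Comparing the rows of the profile (for example, average balance against average purchases) is what turns cluster labels into named segments.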

Pros and Cons of Clustering Methods

K-Means: This method is simple to understand and easily adaptable, and it works well on small or large datasets, but we need to choose the number of clusters beforehand.

HC Agglomerative Clustering: The optimal number of clusters can be read from the dendrogram, which also provides a practical visualization, but the method is not appropriate for large datasets.

Please find the full implementation of this project on my GitHub here.