KPMG Australia: Bike Store Customer Prioritization

This project was completed as part of the KPMG Australia job simulation on The Forage platform. A fictional bike and cycling accessories store sought assistance in optimizing their marketing strategy by identifying which new customers should be prioritized for outreach. To achieve this, I worked with datasets containing demographic and transaction history for existing customers, as well as demographic data for new customers.

The steps I followed were:

  1. Analyzing the datasets to identify patterns and trends.
  2. Calculating RFM (Recency, Frequency, Monetary) scores as indicators of customer profitability for existing customers based on their transaction history.
  3. Building a predictive model to estimate RFM scores from customers' demographic data.
  4. Applying the model to predict RFM scores for new customers.
  5. Prioritizing new customers based on their predicted RFM scores.
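
Step 2 can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the column names (`customer_id`, `transaction_date`, `amount`), the quartile-based 1-4 scoring, and the choice to sum the three component scores are all assumptions made for the example.

```python
import pandas as pd


def rfm_scores(transactions: pd.DataFrame, snapshot_date: pd.Timestamp, bins: int = 4) -> pd.DataFrame:
    """Compute per-customer R, F, M values and a combined score.

    Assumed (hypothetical) columns: customer_id, transaction_date, amount.
    """
    grouped = transactions.groupby("customer_id").agg(
        # Days since the customer's most recent purchase
        recency=("transaction_date", lambda d: (snapshot_date - d.max()).days),
        # Number of purchases
        frequency=("transaction_date", "count"),
        # Total amount spent
        monetary=("amount", "sum"),
    )

    # Rank first so that ties never produce duplicate quantile edges,
    # then cut the ranks into equal-sized bins coded 0..bins-1.
    def quantile_code(column: pd.Series) -> pd.Series:
        return pd.qcut(column.rank(method="first"), bins, labels=False)

    # Lower recency is better, so its scale is reversed.
    grouped["r_score"] = bins - quantile_code(grouped["recency"])
    grouped["f_score"] = quantile_code(grouped["frequency"]) + 1
    grouped["m_score"] = quantile_code(grouped["monetary"]) + 1
    grouped["rfm_score"] = grouped[["r_score", "f_score", "m_score"]].sum(axis=1)
    return grouped


# Tiny made-up example: customer A buys most often, most recently, and spends the most.
example = pd.DataFrame(
    {
        "customer_id": ["A"] * 4 + ["B"] * 3 + ["C"] * 2 + ["D"],
        "transaction_date": pd.to_datetime(
            ["2024-01-05", "2024-01-12", "2024-01-20", "2024-01-30",
             "2024-01-02", "2024-01-10", "2024-01-20",
             "2024-01-03", "2024-01-10",
             "2024-01-01"]
        ),
        "amount": [100.0] * 4 + [50.0] * 3 + [30.0] * 2 + [10.0],
    }
)
scores = rfm_scores(example, snapshot_date=pd.Timestamp("2024-01-31"))
```

Summing the three component scores is only one common convention; weighted sums or concatenated score strings (for example "444") are also used, and the write-up does not say which was applied here.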

I trained five models on the data (Random Forest, Decision Tree, SVM, Ridge Regression, and XGBoost), using cross-validation and hyperparameter tuning. XGBoost performed best and was used to predict the RFM scores of new customers. The predicted scores separated clearly into a higher and a lower group, and new customers in the higher-scoring group were recommended as priorities for marketing efforts.
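
A cross-validated comparison of candidate models might look like the sketch below. It is illustrative only: it assumes the task is framed as regression, uses synthetic data in place of the customer features, scores with RMSE (the write-up does not name a metric), and omits XGBoost and Optuna tuning to stay dependency-light. An `xgboost.XGBRegressor` could be added to the `candidates` dictionary in the same way.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the demographic features (X) and RFM scores (y).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Candidate models with mostly default hyperparameters (assumed values).
candidates = {
    "ridge": Ridge(alpha=1.0),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "svm": SVR(),
}

# The same shuffled 5-fold split is reused for every model so results are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_rmse = {}
for name, model in candidates.items():
    # scikit-learn returns negated RMSE so that "higher is better"; flip the sign back.
    fold_scores = cross_val_score(
        model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
    )
    cv_rmse[name] = -fold_scores.mean()

# The model with the lowest mean cross-validated RMSE is selected.
best_name = min(cv_rmse, key=cv_rmse.get)
```

Which model wins on this synthetic data says nothing about the real dataset; the point is only that each candidate is evaluated on the same folds before one is chosen.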

The tools used for this project included Python, Scikit-learn, Pandas, Matplotlib, Seaborn, XGBoost, Plotly, and Optuna.

Visualizations from the analysis are provided below. The code for this project is available here.

Numerical features of existing customer data plotted in relation to one another
Categorical features of existing customer data plotted in relation to RFM score
Distribution of predicted RFM scores for new customers

Cover photo by Kindel Media