Customer Churn Prediction

A model that flags which telecom customers are about to leave, so the business can reach them before they go.

Python · Pandas · Scikit-learn · Streamlit · 2025

A telecom company loses customers every month. People cancel because they are unhappy or bored, or because they found a cheaper plan, and winning a new customer costs far more than keeping one. So the question I wanted to answer was simple: from a customer's account and usage, can we tell who is about to leave? If we can flag them early, the business can step in with an offer or a fix while there is still time.

Source: IBM Telco Customer Churn (public dataset)
Size: 7,032 customers
Churn rate: ~26.5% had churned

After converting Total Charges to numbers and dropping the 11 blank rows, 7,032 customers remained. Each has 20 fields covering demographics, services, contract, and billing, which one-hot encoding expanded into 40 numeric columns.
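A minimal sketch of the cleaning and encoding described above, run on a tiny made-up table rather than the real dataset. The column names, the sample values, and the helper `clean_and_encode` are assumptions for illustration; the project's actual code may differ (for example in how the target or the dummy columns are handled).

```python
import pandas as pd

# Tiny made-up sample standing in for the real dataset (values are illustrative).
raw = pd.DataFrame({
    "tenure": [1, 34, 0, 45],
    "Contract": ["Month-to-month", "One year", "Two year", "One year"],
    "MonthlyCharges": [29.85, 56.95, 52.55, 42.30],
    "TotalCharges": ["29.85", "1889.5", " ", "1840.75"],  # " " marks a blank entry
    "Churn": ["No", "No", "No", "Yes"],
})


def clean_and_encode(df):
    """Hypothetical helper: clean the charges column, then one-hot encode."""
    out = df.copy()
    # Text -> numbers; entries that cannot be parsed (the blanks) become NaN.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    # Drop the rows whose total charge was blank, then any exact duplicates.
    out = out.dropna(subset=["TotalCharges"]).drop_duplicates()
    # Target as 0/1: 1 means the customer churned.
    y = (out["Churn"] == "Yes").astype(int)
    # One-hot encode the text columns; numeric columns pass through unchanged.
    X = pd.get_dummies(out.drop(columns="Churn"), dtype=int)
    return X, y


X, y = clean_and_encode(raw)
print(X.columns.tolist())
```

On the real table the same two calls, `pd.to_numeric(..., errors="coerce")` followed by `pd.get_dummies`, would be what turns the text fields into the numeric columns the models need.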

  1. Clean the data

    Converted Total Charges from text to numbers, dropped the 11 blank rows, and checked for duplicates (none).

  2. Encode

    One-hot encoded the categorical fields like contract, internet service, and payment method into numeric columns.

  3. Compare models

    Trained Logistic Regression, a Decision Tree, and a Random Forest, then compared how well each caught churners.

  4. Tune for the right metric

    Accuracy hides churn when most customers stay, so I rebalanced the classes and ran GridSearchCV with recall as the scoring metric, which rewards catching leavers rather than just being right on average.

  5. Deploy

    Wrapped the chosen model in a Streamlit app so anyone can enter a customer's details and get a prediction.
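Steps 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the data is synthetic (`make_classification` with about 26.5% positives), `class_weight="balanced"` is one possible way to rebalance the classes, and the parameter grid and the choice of a decision tree for tuning are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded table: roughly 26.5% positives ("churned").
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.735, 0.265], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumption: rebalance via class weights (resampling would be another option).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "decision_tree": DecisionTreeClassifier(class_weight="balanced", random_state=0),
    "random_forest": RandomForestClassifier(
        n_estimators=50, class_weight="balanced", random_state=0
    ),
}

# Step 3: compare the models on recall, the share of true churners caught.
recalls = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    recalls[name] = recall_score(y_test, model.predict(X_test))

# Step 4: grid search scored on recall instead of the default accuracy.
# The grid below is hypothetical and kept small so the sketch runs quickly.
search = GridSearchCV(
    DecisionTreeClassifier(class_weight="balanced", random_state=0),
    param_grid={"max_depth": [2, 4, 6], "min_samples_leaf": [5, 20]},
    scoring="recall",
    cv=5,
)
search.fit(X_train, y_train)
tuned_recall = recall_score(y_test, search.predict(X_test))

print(recalls)
print(search.best_params_, tuned_recall)
```

Setting `scoring="recall"` is the piece that matters here: cross-validation then picks the hyperparameters that catch the most churners rather than the ones that maximize overall accuracy.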

Recall: 83% of churners caught on the held-out test set
ROC-AUC: 0.83 across the strongest models

Accuracy is a trap here: with a churn rate of about 26.5%, roughly 73.5% of customers stay, so a model that blindly predicts "stay" scores about 73.5% accuracy and catches no one. I optimized for recall instead, and tuned the search on it, because missing a customer who leaves costs far more than a false alarm, which costs only a retention email. The model now flags roughly 83% of the customers who actually churn.
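The arithmetic behind that trap can be checked in a few lines. The sketch below uses only the figures quoted in this write-up (7,032 customers, a churn rate of about 26.5%); the churner count is derived from that rounded rate, so it is an approximation rather than the exact number in the dataset.

```python
n_customers = 7032
churn_rate = 0.265  # approximate rate quoted above

# Approximate class counts implied by the rounded churn rate.
n_churn = round(n_customers * churn_rate)
n_stay = n_customers - n_churn

# A model that always predicts "stay":
true_positives = 0          # it never flags a churner
false_negatives = n_churn   # every churner is missed
true_negatives = n_stay     # every stayer is "correct"

accuracy = (true_positives + true_negatives) / n_customers
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {accuracy:.3f}, recall = {recall:.3f}")
```

Accuracy looks respectable only because stayers are the majority, while recall exposes that the always-stay model finds none of the customers the business cares about.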

What drives churn
#1 Month-to-month contract
#2 Fiber-optic internet
#3 Tenure
#4 Monthly charges
#5 No tech support
#6 Electronic-check payment

Feature importance from the deployed model.

One factor towers over the rest: a month-to-month contract accounts for roughly two-thirds of the model's total feature importance. After that come fiber-optic internet, short tenure, and high monthly charges. My exploratory analysis pointed the same way. The highest-risk customers sit in a clear danger zone of high bills and low tenure, while almost no one churns once they pass about two years. The pattern is consistent: the less commitment and the higher the bill, the more likely a customer walks.
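A ranking like this can be read off a fitted tree-based model. The sketch below is an assumption about the mechanics, not the project's code: it fits a small decision tree on made-up data and uses scikit-learn's impurity-based `feature_importances_`, which sum to 1 across features. The column names and the rule that generates the toy label are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 500

# Made-up features loosely named after the encoded columns.
X = pd.DataFrame({
    "Contract_Month-to-month": rng.integers(0, 2, n),
    "tenure": rng.integers(0, 73, n),
    "MonthlyCharges": rng.uniform(20, 120, n),
})

# Hypothetical rule for the toy label: contract type weighs most,
# then short tenure, then a high bill, plus a little noise.
risk = (
    0.6 * X["Contract_Month-to-month"]
    + 0.3 * (X["tenure"] < 12)
    + 0.2 * (X["MonthlyCharges"] > 80)
)
y = (risk + rng.normal(0, 0.1, n) > 0.7).astype(int)

model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# Impurity-based importances: each feature's share of the tree's total
# impurity reduction, sorted from most to least influential.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

print(importances)
```

Because the shares sum to 1, a statement such as "one feature accounts for two-thirds" corresponds to an importance of about 0.67 for that column.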

  • Move month-to-month customers onto one- or two-year contracts. Contract type is by far the strongest signal of who stays.
  • Watch the danger zone: newer customers on high monthly charges, especially on fiber-optic plans, and reach out early.
  • Bundle tech support and online security for at-risk accounts, since the customers without them churn more.

Try the model yourself

Enter a customer's details in the live app and watch the model predict stay or leave.

Open the live predictor