Customer Churn Prediction
A model that flags which telecom customers are about to leave, so the business can reach them before they go.
A telecom company loses customers every month. People cancel because they are unhappy or bored, or because they found a cheaper plan, and winning a new customer costs far more than keeping one. So the question I wanted to answer was simple: from a customer's account and usage, can we tell who is about to leave? If we can flag them early, the business can step in with an offer or a fix while there is still time.
- Source: IBM Telco Customer Churn (public dataset)
- Size: 7,032 customers
- Churn rate: ~26.5% had churned
After converting Total Charges to numbers and dropping the 11 blank rows, 7,032 customers remained. Each has 20 fields covering demographics, services, contract, and billing, which one-hot encoding expanded into 40 numeric columns.
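A minimal sketch of what the cleaning and encoding could look like in pandas, run on a four-row toy table rather than the real file. The column names (`TotalCharges`, `Contract`, `InternetService`, and so on) and the sample values are assumptions for illustration, not taken from the project code.

```python
import pandas as pd

# Toy rows standing in for the real dataset; names and values are assumed.
raw = pd.DataFrame({
    "tenure": [1, 34, 0, 45],
    "Contract": ["Month-to-month", "One year", "Two year", "One year"],
    "InternetService": ["Fiber optic", "DSL", "No", "DSL"],
    "MonthlyCharges": [70.70, 56.95, 20.25, 42.30],
    "TotalCharges": ["70.7", "1889.5", " ", "1840.75"],  # one blank entry
    "Churn": ["Yes", "No", "No", "No"],
})

# Text -> numbers: anything unparseable (the blank) becomes NaN, then is dropped.
clean = raw.copy()
clean["TotalCharges"] = pd.to_numeric(clean["TotalCharges"], errors="coerce")
clean = clean.dropna(subset=["TotalCharges"]).reset_index(drop=True)

# Count fully duplicated rows.
n_duplicates = int(clean.duplicated().sum())

# Target as 0/1, categorical fields expanded into one column per category.
y = (clean["Churn"] == "Yes").astype(int)
X = pd.get_dummies(
    clean.drop(columns="Churn"),
    columns=["Contract", "InternetService"],
    dtype=int,
)

print(X.shape, "duplicates:", n_duplicates)
print(list(X.columns))
```

On the toy table, the blank row is dropped and the two categorical fields expand into one indicator column per category that is still present. The same mechanism is what turns a handful of categorical fields into many numeric columns on the full data.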
- 01
Clean the data
Converted Total Charges from text to numbers, dropped the 11 blank rows, and checked for duplicates (none).
- 02
Encode
One-hot encoded the categorical fields like contract, internet service, and payment method into numeric columns.
- 03
Compare models
Trained Logistic Regression, a Decision Tree, and a Random Forest, then compared how well each caught churners.
- 04
Tune for the right metric
Accuracy hides churn when most customers stay, so I rebalanced the classes and ran GridSearchCV scored on recall, the share of actual churners the model catches, rather than on overall accuracy.
- 05
Deploy
Wrapped the chosen model in a Streamlit app so anyone can enter a customer's details and get a prediction.
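The compare-and-tune steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's code: the data is synthetic (`make_classification` with about 26.5% positives), `class_weight="balanced"` is only one possible way to rebalance the classes, the parameter grid is hypothetical, and the decision tree is tuned here only to keep the example fast. The write-up does not say which model was chosen.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded customer table (class 1 = churned).
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=6,
    weights=[0.735, 0.265], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Compare the three model families on recall for the churn class.
# class_weight="balanced" is an assumed rebalancing method.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "decision_tree": DecisionTreeClassifier(
        max_depth=5, class_weight="balanced", random_state=0),
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0),
}
recalls = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    recalls[name] = recall_score(y_test, model.predict(X_test))
    print(f"{name}: recall = {recalls[name]:.2f}")

# Grid search that picks hyperparameters by cross-validated recall.
search = GridSearchCV(
    DecisionTreeClassifier(class_weight="balanced", random_state=0),
    param_grid={"max_depth": [3, 5, 8], "min_samples_leaf": [1, 20, 50]},  # hypothetical grid
    scoring="recall",
    cv=5,
)
search.fit(X_train, y_train)
tuned_recall = recall_score(y_test, search.predict(X_test))
print("best params:", search.best_params_)
print(f"held-out recall after tuning: {tuned_recall:.2f}")
```

Setting `scoring="recall"` is what makes the search prefer settings that catch more churners. With the default scoring for classifiers, it would select on accuracy instead.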
Accuracy is a trap here: about 74% of customers stay, so a model that blindly predicts "stay" scores 74% accuracy and catches no one. I optimized for recall instead and tuned the search on it, because missing a customer who leaves costs far more than a false alarm, which costs only a retention email. The model now flags roughly 83% of the customers who actually churn.
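The arithmetic behind that trap can be checked in a few lines of plain Python. The sketch uses only the figures quoted above (7,032 customers, about 26.5% churn, roughly 83% recall). The churner count is derived from the rounded rate, so treat it as approximate.

```python
# Figures quoted in the write-up; the counts derived from them are approximate.
n_customers = 7032
churn_rate = 0.265      # "~26.5% had churned"
model_recall = 0.83     # "roughly 83%" of churners flagged

n_churn = round(n_customers * churn_rate)   # approximate number of churners
n_stay = n_customers - n_churn

def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    """Share of actual churners that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Baseline that predicts "stay" for everyone: every stayer is right,
# every churner is missed.
base_accuracy = accuracy(tp=0, tn=n_stay, fp=0, fn=n_churn)
base_recall = recall(tp=0, fn=n_churn)

# Churners flagged at the reported recall.
caught = round(model_recall * n_churn)

print(f"always-stay accuracy: {base_accuracy:.1%}")
print(f"always-stay recall:   {base_recall:.0%}")
print(f"flagged at 83% recall: about {caught} of {n_churn} churners")
```

The baseline looks respectable on accuracy while its recall is zero, which is why the search is scored on recall.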
Feature importance from the deployed model. One factor dominates: being on a month-to-month contract accounts for about two-thirds of the total importance.
One factor towers over the rest: a month-to-month contract accounts for roughly two-thirds of the model's total feature importance. After that come fiber-optic internet, short tenure, and high monthly charges. My exploratory analysis pointed the same way. The highest-risk customers sit in a clear danger zone of high bills and low tenure, while almost no one churns once they pass about two years. The pattern is consistent: the less commitment and the higher the bill, the more likely a customer walks.
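For readers unfamiliar with how such a ranking is read off a model, here is a small sketch. It assumes a tree-based scikit-learn model, which exposes `feature_importances_`. The data is invented, with made-up churn probabilities in which contract type matters most, and the feature names are hypothetical. The resulting numbers say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000

# Invented customers with three hypothetical features.
X = pd.DataFrame({
    "month_to_month": rng.integers(0, 2, n),     # 1 = month-to-month contract
    "tenure_months": rng.integers(0, 73, n),
    "monthly_charges": rng.uniform(20, 120, n),
})

# Made-up churn probabilities: contract type has the largest effect.
p_churn = (
    0.05
    + 0.50 * X["month_to_month"]
    + 0.10 * (X["tenure_months"] < 12)
    + 0.10 * (X["monthly_charges"] > 80)
)
y = (rng.random(n) < p_churn).astype(int)

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Importances are normalized to sum to 1, so each value reads as a share.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances.round(3))
```

Because the values sum to 1, a statement like "about two-thirds" corresponds to one feature holding an importance near 0.67.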
- Move month-to-month customers onto one- or two-year contracts. Contract length is by far the strongest signal of who stays.
- Watch the danger zone: newer customers on high monthly charges, especially on fiber-optic plans, and reach out early.
- Bundle tech support and online security for at-risk accounts, since the customers without them churn more.
Try the model yourself
Enter a customer's details in the live app and watch the model predict stay or leave.
Open the live predictor