This project analyzes customer churn at a telecommunications company using machine learning and data visualization. It is based on the IBM Telco Customer Churn Dataset from Kaggle and covers customer demographics, service usage patterns, and predictive modeling to identify the key factors behind churn.
Figure 1: Churn distribution by status, age distribution, and gender breakdown. Shows overall churn rate (~26.5%), age spread differences, and that churn is balanced across genders.
- PRO_1.py: Initial exploratory data analysis and basic statistics
- PRO_2.py: Advanced analysis with visualizations and machine learning model
- analysis_summary.py: Executive summary with key findings and recommendations
- plots/: Directory containing generated visualizations
- telco.csv: Original dataset
- Overall churn rate: 26.54%
- Fiber Optic service has highest churn rate (40.72%)
- Top churn reasons are competitor-related (better devices, better offers)
- Model achieves 92% accuracy in predicting churn
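The churn rates above are group-level proportions. As a minimal sketch of how such figures could be computed with pandas, the example below uses a tiny made-up sample rather than telco.csv. The column names `Internet Type` and `Churn Label` and the helper `churn_rate_by` are assumptions for illustration and may not match the names used in the project scripts.

```python
import pandas as pd

# Tiny made-up sample. In the project, the data would be loaded from telco.csv.
# The column names "Internet Type" and "Churn Label" are assumptions.
df = pd.DataFrame({
    "Internet Type": ["Fiber Optic", "Fiber Optic", "DSL", "DSL", "Cable", "Fiber Optic"],
    "Churn Label": ["Yes", "No", "No", "No", "Yes", "Yes"],
})


def churn_rate_by(frame, column, label="Churn Label"):
    """Return the churn rate in percent for each value of `column`, highest first."""
    churned = frame[label].eq("Yes")  # boolean Series: True where the customer churned
    rates = churned.groupby(frame[column]).mean() * 100
    return rates.sort_values(ascending=False)


# Overall churn rate is the share of rows labeled "Yes".
overall = df["Churn Label"].eq("Yes").mean() * 100
by_internet = churn_rate_by(df, "Internet Type")

print(f"Overall churn rate: {overall:.2f}%")
print(by_internet.round(2))
```

The same helper can be pointed at any categorical column (contract type, payment method, and so on) to compare segments.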
The project generates six key visualizations:
- Customer Demographics Analysis
- Customer Value Analysis
- Service Usage Patterns
- Top Churn Reasons
- Correlation Analysis
- Feature Importance in Churn Prediction
Random Forest Classifier Results:
- Overall Accuracy: 92%
- Precision: 93% (Non-churn), 90% (Churn)
- Recall: 96% (Non-churn), 81% (Churn)
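Per-class precision and recall like those listed above can be obtained from scikit-learn's `classification_report`. The sketch below trains a `RandomForestClassifier` on synthetic data from `make_classification`, not on the telco dataset, so its numbers will differ from the results reported here. The hyperparameters, split ratio, and class weights are assumptions, not the settings used in PRO_2.py.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded telco features (assumption):
# an imbalanced binary problem with roughly 27% positive ("Churn") samples.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.73, 0.27],
    random_state=42,
)

# Stratify so the train and test sets keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hyperparameters here are illustrative, not the project's settings.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# output_dict=True returns the report as nested dicts instead of a string.
report = classification_report(
    y_test,
    model.predict(X_test),
    target_names=["Non-churn", "Churn"],
    output_dict=True,
)

for name in ("Non-churn", "Churn"):
    print(
        f"{name}: precision={report[name]['precision']:.2f}, "
        f"recall={report[name]['recall']:.2f}"
    )
print(f"Accuracy: {report['accuracy']:.2f}")
```

On imbalanced churn data, recall for the churn class is usually the number to watch, since overall accuracy can stay high even when many churners are missed.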
- Python 3.x
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
- Clone the repository
- Ensure all required packages are installed
- Run the scripts in order:
    python3 PRO_1.py
    python3 PRO_2.py
    python3 analysis_summary.py
Based on analysis and industry benchmarks, here are strategies to reduce churn:
- Fiber Optic Quality: Fiber customers churn at ~0.84% vs. ~2% for DSL/cable (Leichtman Research Group, 2023).
- Device Offerings: 43% of customers say device upgrade options drive loyalty (Deloitte, 2022).
- Competitive Pricing: Verizon lowered churn after reducing its 40% premium to ~15%.
- Staff Training: 56% of telecom churn is due to poor service (Accenture, 2021).
- Satisfaction Surveys: Quarterly surveys cut churn by 12–15%.
- Proactive Support: Predictive outreach reduces churn by ~30% (McKinsey, 2022).
- High-Risk Customers: Predictive models improve retention efficiency by 25–30% (BCG, 2022).
- Counter-Offers: Personalized offers reduce churn by 15–20% (Forrester, 2021).
- Early Tenure: Month-to-month churn = 42.7% vs. 2.8–11.3% for long-term (Medium, 2023).
For a deeper dive, see recommendations
License
MIT License
