|
1 | | -## Telco Customer Churn Analysis |
| 1 | +# Privacy-First Media Mix Modeling Toolkit |
2 | 2 |
|
3 | | -### 📊 Project Overview |
| 3 | +## Overview |
4 | 4 |
|
5 | | -This project analyzes customer churn in a telecommunications company using machine learning and data visualization techniques. The analysis is based on the IBM Telco Customer Churn Dataset from Kaggle. The analysis includes customer demographics, service usage patterns, and predictive modeling to identify key factors contributing to customer churn. |
| 5 | +This repository provides a toolkit for media mix modeling that respects user privacy. Marketing teams and analysts can estimate the incremental impact of different marketing channels (e.g., TV, search, social, email) on key outcomes such as conversions or revenue without relying on user‑level tracking. Instead, the toolkit uses aggregated and anonymized data to build robust models. |
6 | 6 |
|
7 | | - |
| 7 | +## Features |
8 | 8 |
|
| 9 | +- **Aggregated Data Pipelines**: Ingest channel‑level spend, impressions, and conversions aggregated over time, ensuring no personal data is collected. |
| 10 | +- **Modeling Frameworks**: Includes baseline linear models and advanced Bayesian hierarchical models to estimate channel contribution while accounting for saturation and ad‑stock effects. |
| 11 | +- **Privacy Preservation**: Demonstrates how to apply techniques such as differential privacy, which adds calibrated noise to the aggregated inputs to limit what can be inferred about any individual consumer. |
| 12 | +- **Visualization Tools**: Generate charts that show marginal return curves, channel saturation, and expected lift versus spend, helping stakeholders understand media efficiency. |
| 13 | +- **Extensible Design**: Modular codebase so analysts can plug in their own data sources, priors, and model structures. |
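The saturation and ad-stock effects named in the feature list can be illustrated with a minimal sketch. It assumes a geometric carry-over for ad-stock and a Hill curve for saturation; these are common choices in media mix modeling, not necessarily the ones this toolkit implements. The function names, the `decay` and `half_sat` parameters, and the sample spend values are all hypothetical.

```python
import numpy as np


def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of each period's accumulated effect into the next period."""
    spend = np.asarray(spend, dtype=float)
    out = np.zeros_like(spend)
    carry = 0.0
    for t, x in enumerate(spend):
        carry = x + decay * carry
        out[t] = carry
    return out


def hill_saturation(x, half_sat, shape=1.0):
    """Diminishing returns: 0 at x = 0, 0.5 at x = half_sat, approaching 1 for large x."""
    x = np.asarray(x, dtype=float)
    return x**shape / (x**shape + half_sat**shape)


# Hypothetical weekly spend for a single channel (aggregated, not user-level).
weekly_spend = [100.0, 0.0, 0.0, 50.0]
adstocked = geometric_adstock(weekly_spend, decay=0.5)
response = hill_saturation(adstocked, half_sat=100.0)
```

In a regression-style model, a transformed series like `response` would typically stand in for raw spend as the channel's regressor, so that the fitted coefficient reflects both carry-over and diminishing returns.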
9 | 14 |
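The differential privacy feature can be sketched with the Laplace mechanism applied to aggregated conversion counts. This is an illustration under stated assumptions rather than the toolkit's actual implementation: the function name `add_laplace_noise` is hypothetical, and the default `sensitivity=1.0` assumes each individual contributes at most one conversion to each reported count.

```python
import numpy as np


def add_laplace_noise(counts, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to aggregated counts.

    Assumption: `sensitivity` bounds how much one individual can change any
    single count. A smaller `epsilon` means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    scale = sensitivity / epsilon
    return counts + rng.laplace(loc=0.0, scale=scale, size=counts.shape)


# Hypothetical daily conversion totals for one channel.
daily_conversions = [120, 95, 143, 110]
noisy_conversions = add_laplace_noise(
    daily_conversions, epsilon=1.0, rng=np.random.default_rng(0)
)
```

The noisy values are generally non-integer and can fall below zero, so downstream code should not assume they are valid counts. Releasing several noisy series about the same individuals also consumes privacy budget cumulatively, which this sketch does not track.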
|
| 15 | +## Getting Started |
10 | 16 |
|
| 17 | +1. Clone the repository and install dependencies listed in `requirements.txt`. |
| 18 | +2. Place your aggregated channel data in the `data/` directory following the provided schema. |
| 19 | +3. Run the example script `PRO_1.py` to explore a simple media mix model (MMM) on synthetic data. |
| 20 | +4. Use `analysis_summary.py` to produce a summary report of channel efficiencies. |
| 21 | +5. Check `recommendations.md` for guidance on interpreting model outputs and making investment decisions. |
11 | 22 |
|
| 23 | +## Business Impact |
12 | 24 |
|
13 | | -### Figure 1: Churn distribution by status, age distribution, and gender breakdown. Shows overall churn rate (~26.5%), age spread differences, and that churn is balanced across genders. |
14 | | - |
15 | | - |
16 | | -## Project Structure |
17 | | -- `PRO_1.py`: Initial exploratory data analysis and basic statistics |
18 | | -- `PRO_2.py`: Advanced analysis with visualizations and machine learning model |
19 | | -- `analysis_summary.py`: Executive summary with key findings and recommendations |
20 | | -- `plots/`: Directory containing generated visualizations |
21 | | -- `telco.csv`: Original dataset |
22 | | - |
23 | | -## Key Findings |
24 | | -- Overall churn rate: 26.54% |
25 | | -- Fiber Optic service has highest churn rate (40.72%) |
26 | | -- Top churn reasons are competitor-related (better devices, better offers) |
27 | | -- Model achieves 92% accuracy in predicting churn |
28 | | - |
29 | | -## Visualizations |
30 | | -The project generates six key visualizations: |
31 | | -- Customer Demographics Analysis |
32 | | -- Customer Value Analysis |
33 | | -- Service Usage Patterns |
34 | | -- Top Churn Reasons |
35 | | -- Correlation Analysis |
36 | | -- Feature Importance in Churn Prediction |
37 | | - |
38 | | - |
39 | | - |
40 | | - |
41 | | -## Model Performance |
42 | | -Random Forest Classifier Results: |
43 | | -- Overall Accuracy: 92% |
44 | | -- Precision: 93% (Non-churn), 90% (Churn) |
45 | | -- Recall: 96% (Non-churn), 81% (Churn) |
46 | | - |
47 | | - |
48 | | -## Requirements |
49 | | -- Python 3.x |
50 | | -- pandas |
51 | | -- numpy |
52 | | -- matplotlib |
53 | | -- seaborn |
54 | | -- scikit-learn |
55 | | - |
56 | | -## Usage |
57 | | -1. Clone the repository |
58 | | -2. Ensure all required packages are installed |
59 | | -3. Run the scripts in order: |
60 | | - ```bash |
61 | | - python3 PRO_1.py |
62 | | - python3 PRO_2.py |
63 | | - python3 analysis_summary.py |
64 | | - ``` |
65 | | - |
66 | | -## 📊 Recommendations |
67 | | - |
68 | | -Based on analysis and industry benchmarks, here are strategies to reduce churn: |
69 | | - |
70 | | -### Service Improvement: |
71 | | - |
72 | | -1. Fiber Optic Quality: Fiber customers churn at ~0.84% vs. ~2% for DSL/cable (Leichtman Research Group, 2023). |
73 | | -2. Device Offerings: 43% of customers say device upgrade options drive loyalty (Deloitte, 2022). |
74 | | -3. Competitive Pricing: Verizon lowered churn after reducing its 40% premium to ~15%. |
75 | | - |
76 | | -### Customer Support: |
77 | | - |
78 | | -1. Staff Training: 56% of telecom churn is due to poor service (Accenture, 2021). |
79 | | -2. Satisfaction Surveys: Quarterly surveys cut churn by 12–15%. |
80 | | -3. Proactive Support: Predictive outreach reduces churn by ~30% (McKinsey, 2022). |
81 | | - |
82 | | -### Retention Strategy: |
83 | | - |
84 | | -1. High-Risk Customers: Predictive models improve retention efficiency by 25–30% (BCG, 2022). |
85 | | -2. Counter-Offers: Personalized offers reduce churn by 15–20% (Forrester, 2021). |
86 | | -3. Early Tenure: Month-to-month churn = 42.7% vs. 2.8–11.3% for long-term (Medium, 2023). |
87 | | - |
88 | | - |
89 | | -------------------------------------------- |
90 | | -**For a deeper dive, see** [recommendations](https://github.com/Dennis-J-Carroll/telco-churn-analysis/blob/main/recommendations.md) |
91 | | - |
92 | | -License |
93 | | - |
94 | | -MIT License |
| 25 | +By modeling marketing spend at an aggregated level and applying privacy‑preserving techniques, this toolkit lets companies optimize their media budgets without compromising consumer trust. Teams can identify the most efficient channels and reallocate budgets to maximize ROI while complying with privacy regulations. |