Maya-Yagan/ML-suicide-rate-forecasting

Suicide Rate Prediction

A comparative machine-learning study to forecast national suicide rates using nine regression algorithms, Genetic Algorithm–based feature selection, and Principal Component Analysis on both public and custom-curated datasets.


Introduction

Suicide-rate forecasting can inform timely public health interventions. This project evaluates nine regression models: KNN, Random Forest, Decision Tree, MLP, Linear Regression, Ridge Regression, and three SVR variants (linear, polynomial, and RBF kernels). Each model is evaluated under three preprocessing regimes:

  • Baseline (no feature manipulation)
  • Genetic Algorithm–based feature selection
  • Principal Component Analysis (PCA)
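As a rough illustration of the setup described above, the nine regressors and two of the three regimes could be assembled in scikit-learn as follows. This is a minimal sketch, not the project's actual code: the helper names `build_models` and `with_regime`, the default hyperparameters, and the `StandardScaler` step are all assumptions made for the example. The Genetic Algorithm regime is left out because its configuration is not given here.

```python
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def build_models(random_state=0):
    """Return the nine regressors named above (hyperparameters are assumptions)."""
    return {
        "KNN": KNeighborsRegressor(),
        "Random Forest": RandomForestRegressor(random_state=random_state),
        "Decision Tree": DecisionTreeRegressor(random_state=random_state),
        "MLP": MLPRegressor(max_iter=500, random_state=random_state),
        "Linear Regression": LinearRegression(),
        "Ridge Regression": Ridge(),
        "SVR (linear)": SVR(kernel="linear"),
        "SVR (polynomial)": SVR(kernel="poly"),
        "SVR (RBF)": SVR(kernel="rbf"),
    }


def with_regime(model, regime):
    """Wrap a regressor in a pipeline for the given preprocessing regime.

    Scaling is an assumption of this sketch; it is not stated in the README.
    """
    if regime == "baseline":
        return make_pipeline(StandardScaler(), model)
    if regime == "pca":
        # A float n_components keeps enough components to explain 95% of the variance.
        return make_pipeline(
            StandardScaler(), PCA(n_components=0.95, svd_solver="full"), model
        )
    raise ValueError(f"unknown regime: {regime!r}")
```

Keeping PCA inside the pipeline means it is refitted on the training portion of every cross-validation split, so the held-out fold never influences the projection.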

Experiments are performed on:

  1. A publicly available Kaggle dataset (1985–2021), with some features pruned because of missing values.
  2. A custom dataset (~84 000 records) merged from WHO, IHME, World Bank, and national sources to provide richer socio-demographic indicators.

Features

  • Wrapper-based feature selection via a Genetic Algorithm (GAFeatureSelectionCV)
  • Dimensionality reduction via PCA, retaining 95% of the variance
  • Nine regression algorithms implemented in scikit-learn
  • Nested cross-validation to prevent data leakage
  • Performance reported as MAE, MSE, and RMSE
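To make the nested cross-validation idea concrete, here is a minimal sketch on synthetic data. It is not the repository's evaluation script: the function name `nested_cv_scores`, the Ridge model, the `alpha` grid, the fold counts, and the scaling step are assumptions chosen for illustration. The inner loop tunes hyperparameters, the outer loop estimates error on folds that the tuning never saw, and PCA sits inside the pipeline so it is fitted on training data only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def nested_cv_scores(X, y, n_outer=5, n_inner=3, seed=0):
    """Return per-outer-fold MAE, MSE, and RMSE from nested cross-validation."""
    pipeline = Pipeline([
        ("scale", StandardScaler()),  # assumption: scaling before PCA
        ("pca", PCA(n_components=0.95, svd_solver="full")),  # keep 95% of variance
        ("model", Ridge()),
    ])
    # Inner loop: hyperparameter search, run only on each outer training split.
    inner = GridSearchCV(
        pipeline,
        param_grid={"model__alpha": [0.1, 1.0, 10.0]},  # hypothetical grid
        scoring="neg_mean_absolute_error",
        cv=KFold(n_splits=n_inner, shuffle=True, random_state=seed),
    )
    # Outer loop: error estimate on data the search never touched.
    outer = KFold(n_splits=n_outer, shuffle=True, random_state=seed + 1)
    results = cross_validate(
        inner, X, y, cv=outer,
        scoring={"mae": "neg_mean_absolute_error",
                 "mse": "neg_mean_squared_error"},
    )
    # scikit-learn reports negated errors; flip the sign to recover them.
    mae = -results["test_mae"]
    mse = -results["test_mse"]
    return {"MAE": mae, "MSE": mse, "RMSE": np.sqrt(mse)}


# Synthetic stand-in for the real datasets.
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)
scores = nested_cv_scores(X, y)
for name, per_fold in scores.items():
    print(f"{name}: {per_fold.mean():.3f} +/- {per_fold.std():.3f}")
```

RMSE is computed per fold as the square root of that fold's MSE, so the averaged RMSE is generally not equal to the square root of the averaged MSE.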

Datasets

  1. Public Kaggle Dataset (first_dataset/)
    • 31 756 records, 12 original columns
    • HDI, population, and raw counts dropped because of missing values or target leakage
  2. Custom-Curated Dataset (second_dataset/)
    • ~84 000 records, 10 curated features
    • Integrated from WHO, IHME (GBD 2021), World Bank, and Statbank Greenland

Note: The raw and preprocessed CSVs for each dataset are stored in its folder (first_dataset/ or second_dataset/).

