SRRayhan066/Titanic-Survival-Prediction

Titanic Survival Prediction

A machine learning project that predicts passenger survival on the Titanic using Logistic Regression and Random Forest classifiers.

Project Structure

titanic-survival-prediction/
├── data/
│   └── Titanic-Dataset.csv
├── src/
│   ├── data_loader.py          # Loads CSV data
│   ├── feature_engineering.py  # Preprocessing and feature creation
│   └── model.py                # Model training and evaluation
├── images/
│   └── roc_curve.png           # ROC curve output
├── main.py
└── requirements.txt

Pipeline

  1. Load Data — reads the Titanic CSV dataset
  2. Feature Engineering — handles missing values, encodes categorical columns, scales numerical features, and creates FamilySize as SibSp + Parch + 1
  3. Split Data — 80% training / 20% testing
  4. Train & Evaluate — trains two models and compares them
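
The 80/20 split in step 3 can be sketched without any third-party dependency. This is a minimal illustration, not the repository's code: the function name `split_80_20`, the seed, and the plain shuffle (no stratification) are all assumptions, since the README does not say how the split is performed.

```python
import random


def split_80_20(rows, seed=42):
    """Shuffle a copy of rows and split it 80% train / 20% test.

    The seed and the absence of stratification are assumptions
    made for this sketch.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 100 integer ids instead of real passenger rows.
passenger_ids = list(range(100))
train, test = split_80_20(passenger_ids)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible between runs, which matters when two models are compared on the same test set.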

Features Used

Feature      Description
Pclass       Passenger class (1, 2, 3)
Sex          Gender (encoded: male=1, female=0)
Age          Age (scaled)
Fare         Ticket fare (scaled)
Embarked     Port of embarkation (encoded)
FamilySize   SibSp + Parch + 1
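
The feature list above could be produced roughly as follows. This is a dependency-free sketch under several assumptions that the README does not confirm: missing ages are filled with the median, a missing port is filled with "S", Embarked is mapped as S=0, C=1, Q=2, and "scaled" means standardization to zero mean and unit variance. The function names and the sample rows are hypothetical.

```python
from statistics import mean, median, pstdev

# Assumed integer codes; the README only says Embarked is "encoded".
EMBARKED_CODES = {"S": 0, "C": 1, "Q": 2}


def standardize(values):
    """Scale values to zero mean and unit variance (assumed scaling method)."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid dividing by zero for constant columns
    return [(v - mu) / sigma for v in values]


def engineer_features(rows):
    """Build the six model features from raw passenger dictionaries."""
    known_ages = [r["Age"] for r in rows if r["Age"] is not None]
    age_fill = median(known_ages)  # assumed imputation strategy
    ages = [r["Age"] if r["Age"] is not None else age_fill for r in rows]
    scaled_ages = standardize(ages)
    scaled_fares = standardize([r["Fare"] for r in rows])

    features = []
    for r, age, fare in zip(rows, scaled_ages, scaled_fares):
        features.append({
            "Pclass": r["Pclass"],
            "Sex": 1 if r["Sex"] == "male" else 0,
            "Age": age,
            "Fare": fare,
            "Embarked": EMBARKED_CODES[r["Embarked"] or "S"],  # assumed fill value
            "FamilySize": r["SibSp"] + r["Parch"] + 1,
        })
    return features


# Hand-made rows for illustration only, not taken from the dataset.
rows = [
    {"Pclass": 3, "Sex": "male", "Age": 22.0, "Fare": 7.25,
     "Embarked": "S", "SibSp": 1, "Parch": 0},
    {"Pclass": 1, "Sex": "female", "Age": 38.0, "Fare": 71.0,
     "Embarked": "C", "SibSp": 1, "Parch": 0},
    {"Pclass": 3, "Sex": "female", "Age": None, "Fare": 8.0,
     "Embarked": None, "SibSp": 0, "Parch": 0},
]
features = engineer_features(rows)
print(features[0])
```

The `+ 1` in FamilySize counts the passenger, so someone travelling alone has a family size of 1 rather than 0.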

Models

  • Logistic Regression — linear classifier, uses best threshold from ROC curve
  • Random Forest — ensemble of 100 decision trees, uses best threshold from ROC curve
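
The README does not state how the "best threshold" is chosen from the ROC curve. One common criterion is Youden's J statistic, which picks the threshold that maximizes TPR minus FPR. The sketch below assumes that criterion and computes it from scratch; `roc_points`, `best_threshold`, and the toy labels and scores are hypothetical and are not taken from the repository.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) for each distinct score, highest first."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, y_true) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, y_true) if p and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def best_threshold(y_true, scores):
    """Pick the threshold maximizing Youden's J = TPR - FPR (assumed criterion)."""
    threshold, _fpr, _tpr = max(roc_points(y_true, scores),
                                key=lambda point: point[2] - point[1])
    return threshold


# Toy labels (1 = survived) and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5]

threshold = best_threshold(y_true, scores)
predictions = [int(s >= threshold) for s in scores]
print(threshold, predictions)
```

The same function applies to either model, because both Logistic Regression and Random Forest can output a survival probability per passenger; only the scores differ.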

ROC Curve

[Figure: ROC curve, saved as images/roc_curve.png]

Setup & Run

python3 -m venv ai-env
source ai-env/bin/activate
pip install -r requirements.txt
python3 main.py

About

A classification model built using Logistic Regression to predict survival outcomes on the Titanic Dataset. The project includes feature encoding, feature scaling, model evaluation, and ROC-AUC analysis using Python ML tools.
