laijupjoy/MySQL-kafka-S3-Redshift-pipeline

MySQL Kafka S3 Redshift Pipeline

The MySQL Kafka S3 Redshift Pipeline transfers data from MySQL to Redshift in real time. It uses Kafka for streaming, S3 for storage, and Redshift for analytics, giving efficient, scalable data integration.

Architecture

Export data from a MySQL database to Redshift using Kafka

Data flow diagram

Problem Statement

We need to build an ETL pipeline that loads MySQL database records into Redshift using Kafka.

MySQL database

Redshift data warehouse

Redshift

Approach

  1. Read data from MySQL, publish it to a Kafka topic, and write the topic's records to an S3 bucket (mysql-kafka-s3)

  2. Read the data from the S3 bucket and load it into Redshift (s3-redshift)
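The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's code: the function names `rows_to_json_lines`, `s3_object_key`, and `build_copy_statement`, the date-partitioned key layout, and the JSON Lines file format are all hypothetical choices. The sketch only builds the payload, the object key, and the Redshift `COPY` statement; the Kafka consumer, the S3 upload, and the database connection are left out.

```python
import json
from datetime import datetime, timezone


def rows_to_json_lines(rows):
    """Serialize row dicts as newline-delimited JSON, one object per line."""
    # default=str covers values such as dates and decimals that json cannot encode.
    return "\n".join(json.dumps(row, sort_keys=True, default=str) for row in rows)


def s3_object_key(table, batch_time, batch_id):
    """Build a date-partitioned object key (hypothetical layout)."""
    return f"{table}/{batch_time:%Y/%m/%d}/batch-{batch_id:06d}.json"


def build_copy_statement(table, bucket, prefix, iam_role):
    """Return a Redshift COPY statement that loads JSON objects under an S3 prefix."""
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS JSON 'auto';"
    )


if __name__ == "__main__":
    # Hypothetical rows, bucket, and role, used only for illustration.
    rows = [{"order_id": 1, "amount": 25.5}, {"order_id": 2, "amount": 10.0}]
    print(rows_to_json_lines(rows))
    print(s3_object_key("orders", datetime.now(timezone.utc), 1))
    print(build_copy_statement(
        "orders", "example-bucket", "orders/",
        "arn:aws:iam::123456789012:role/example-redshift-role",
    ))
```

Writing one JSON object per line keeps each Kafka record independent in the S3 file, and `FORMAT AS JSON 'auto'` lets Redshift map object keys to column names, so the target table's columns would need to match the keys in the records.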

Launch the entire server setup

docker-compose up

Load the sample dump into the MySQL database

docker exec -i mysql sh -c 'exec mysql -uroot -p"$MYSQL_ROOT_PASSWORD"' < "./database-dump/mysqlsampledatabase.sql"

We will design a star schema so that the OLTP data shown above can be exported to an OLAP model.
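As a rough illustration of what the OLTP-to-OLAP step involves, the sketch below splits flat order rows into one dimension table and one fact table. The table and column names (`customer_id`, `customer_name`, `order_id`, `amount`, `customer_key`) and the function `build_star_schema` are hypothetical and are not taken from the project's actual schema.

```python
def build_star_schema(order_rows):
    """Split flat order rows into a customer dimension and an order fact table."""
    customer_keys = {}  # natural customer id -> surrogate key
    dim_customer = []
    fact_orders = []
    for row in order_rows:
        natural_key = row["customer_id"]
        # Add each customer to the dimension only once, with a new surrogate key.
        if natural_key not in customer_keys:
            customer_keys[natural_key] = len(customer_keys) + 1
            dim_customer.append({
                "customer_key": customer_keys[natural_key],
                "customer_id": natural_key,
                "customer_name": row["customer_name"],
            })
        # The fact row keeps the measure and points at the dimension row.
        fact_orders.append({
            "order_id": row["order_id"],
            "customer_key": customer_keys[natural_key],
            "amount": row["amount"],
        })
    return dim_customer, fact_orders


if __name__ == "__main__":
    # Hypothetical sample rows, used only for illustration.
    rows = [
        {"order_id": 1, "customer_id": 103, "customer_name": "A", "amount": 25.5},
        {"order_id": 2, "customer_id": 112, "customer_name": "B", "amount": 10.0},
        {"order_id": 3, "customer_id": 103, "customer_name": "A", "amount": 7.25},
    ]
    dims, facts = build_star_schema(rows)
    print(dims)
    print(facts)
```

Descriptive attributes live once in the dimension table, while the fact table holds only measures and surrogate keys, which is the shape analytic queries in a warehouse typically join and aggregate over.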

  1. Redshift setup

  2. Kafka setup

  3. MySQL Kafka S3 project description

  4. S3 Redshift project description
