MySQL Kafka S3 Redshift Pipeline

The MySQL Kafka S3 Redshift Pipeline transfers data from MySQL to Redshift in real time. It uses Kafka for streaming, S3 for storage, and Redshift for analytics, which keeps the data integration efficient and scalable.

Architecture

Export data from a MySQL database to Redshift using Kafka.

Data flow diagram

Problem Statement:

We need to build an ETL pipeline that loads MySQL database records into Redshift using Kafka.

MySQL Database

Redshift Data Warehouse

Redshift

Approach

  1. Read data from MySQL and publish it to a Kafka topic, then write the records from that topic to an S3 bucket: mysql-kafka-s3

  2. Read the data from the S3 bucket and load it into Redshift: s3-redshift
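The two stages above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the function names, the JSON Lines batch format, the S3 key layout, and the bucket, table, and IAM role names are all hypothetical. Producing to Kafka and uploading to S3 are deliberately left out so the sketch runs with the standard library only; in the real pipeline a Kafka client and an S3 client would perform those steps.

```python
import json
from datetime import datetime, timezone


def row_to_message(table, row):
    """Serialize one MySQL row (a dict) as the bytes payload of a Kafka message."""
    # default=str covers values such as dates and decimals that JSON cannot encode natively.
    payload = {"table": table, "data": row}
    return json.dumps(payload, default=str, sort_keys=True).encode("utf-8")


def batch_to_s3_object(table, messages, batch_time):
    """Turn a batch of consumed messages into a (key, body) pair for one S3 object.

    The body is JSON Lines (one record per line). The key layout is hypothetical.
    """
    key = f"{table}/{batch_time:%Y/%m/%d}/{table}-{batch_time:%H%M%S}.json"
    lines = [json.dumps(json.loads(m)["data"], sort_keys=True) for m in messages]
    return key, "\n".join(lines) + "\n"


def build_copy_statement(target_table, bucket, key, iam_role):
    """Build a Redshift COPY statement that loads one S3 object into a table."""
    return (
        f"COPY {target_table} "
        f"FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS JSON 'auto';"
    )


# Stage 1: MySQL row -> Kafka message -> S3 object (illustrative values only).
message = row_to_message("customers", {"id": 1, "name": "example"})
key, body = batch_to_s3_object(
    "customers", [message], datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
)

# Stage 2: S3 object -> Redshift, via a COPY statement run against the cluster.
copy_sql = build_copy_statement(
    "staging.customers", "my-bucket", key, "arn:aws:iam::123456789012:role/example"
)
```

Keeping serialization and statement building as pure functions makes each stage easy to test without a running broker, bucket, or cluster.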

Launch the entire server setup

docker-compose up
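Containers can take a while to accept connections after they start, so it may help to confirm the services are reachable before loading any data. The helper below is a small sketch of that check, not part of this repository. The port numbers are the common defaults for MySQL and Kafka and are assumptions; use whatever ports the compose file actually publishes.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Opening and immediately closing a connection is enough to prove the port is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Assumed default ports (hypothetical for this project); adjust to match the compose file.
SERVICES = {"mysql": 3306, "kafka": 9092}
```

Calling `wait_for_port("localhost", port)` for each entry in `SERVICES` reports whether that service is ready. An open port only shows that the process is listening, not that it has finished initializing.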

Load the sample dump into the MySQL database

docker exec -i mysql sh -c 'exec mysql -uroot -p"$MYSQL_ROOT_PASSWORD"' < "./database-dump/mysqlsampledatabase.sql"

We will design a star schema so that the OLTP database shown above can be exported to an OLAP model.
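As a rough illustration of the OLTP-to-OLAP idea, the sketch below builds a tiny star schema in an in-memory SQLite database: one fact table of order lines joined to customer and product dimensions. The table names, column names, and sample rows are assumptions made for this example, not the schema this project defines, and SQLite stands in for Redshift only so that the sketch runs anywhere.

```python
import sqlite3

# Hypothetical star schema: dimensions describe "who" and "what", the fact table holds measures.
STAR_SCHEMA = """
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    country       TEXT
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    product_line TEXT
);
CREATE TABLE fact_order_line (
    order_number INTEGER NOT NULL,
    customer_key INTEGER NOT NULL REFERENCES dim_customer (customer_key),
    product_key  INTEGER NOT NULL REFERENCES dim_product (product_key),
    order_date   TEXT NOT NULL,
    quantity     INTEGER NOT NULL,
    unit_price   REAL NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(STAR_SCHEMA)

# Made-up rows, only to show how facts reference dimensions.
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "Customer A", "France"), (2, "Customer B", "USA")],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(10, "Product X", "Classic Cars"), (11, "Product Y", "Planes")],
)
conn.executemany(
    "INSERT INTO fact_order_line VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, 1, 10, "2024-01-05", 2, 50.0),
        (100, 1, 11, "2024-01-05", 1, 30.0),
        (101, 2, 10, "2024-01-06", 3, 50.0),
    ],
)

# A typical analytical query: aggregate the fact table, grouped by a dimension attribute.
revenue_by_country = conn.execute(
    """
    SELECT c.country, SUM(f.quantity * f.unit_price) AS revenue
    FROM fact_order_line AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.country
    ORDER BY c.country
    """
).fetchall()
```

The point of the layout is that analytical queries need only one join per dimension, whereas the normalized OLTP tables would require chaining through several intermediate tables.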

  1. Redshift setup

  2. Kafka setup

  3. MySQL Kafka S3 Project Description

  4. S3 Redshift Project Description