riovic918data/README.md

Russell Viccaro

Backend Engineer · Data Engineer

FastAPI · Django · ETL Pipelines · Python

Python · FastAPI · Django · Apache Airflow · Apache Spark · Kafka · PostgreSQL · Snowflake · AWS · Terraform


About Me

I'm a Backend & Data Engineer based in the United States, specializing in high-performance REST APIs, asynchronous microservices, and large-scale ETL/ELT data pipelines. I design systems that handle millions of events per day — from ingestion through transformation to analytical delivery — with a focus on reliability, observability, and clean architecture.

  • 🔧 Building production APIs with FastAPI and Django REST Framework
  • 🔄 Designing and operating ETL / ELT pipelines at scale (batch & streaming)
  • ☁️ Cloud-native infrastructure on AWS and GCP
  • 📊 Enabling data teams with warehouse-first architectures using dbt, Airflow, and Spark
  • 🧪 Advocate for test-driven development, typed Python, and CI/CD automation

Architecture Diagrams

1 · FastAPI Microservice Architecture

A production-grade FastAPI service with async request handling, JWT auth, Redis caching, Celery background workers, and PostgreSQL persistence — deployed behind CloudFront and AWS ALB.

[Diagram: FastAPI Microservice Architecture]
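To illustrate the JWT auth step in this service, here is a minimal standard-library sketch of HS256 signing and verification. It is not the service's actual code: a real deployment would more likely rely on a library such as python-jose (listed in the stack below), and the function names `sign_token` and `verify_token` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and claims["exp"] <= current:
        raise ValueError("token expired")
    return claims
```

In a FastAPI app, a function like `verify_token` would typically sit behind a dependency that reads the `Authorization` header and rejects the request when verification raises.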


2 · Django REST Framework — Multi-Tenant SaaS Backend

Schema-per-tenant PostgreSQL isolation, RBAC middleware, pluggable installed apps split across Core / Business / API layers, Celery task queues, and external service integrations.

[Diagram: Django REST Framework - Multi-Tenant SaaS Backend]
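The RBAC layer described above can be sketched as a role-to-permission lookup. This is an assumed, framework-free illustration: the role names, permission strings, and the `User` type are hypothetical, and in a Django project the check would normally live in middleware or a DRF permission class.

```python
from dataclasses import dataclass

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": frozenset({"invoice:read"}),
    "editor": frozenset({"invoice:read", "invoice:write"}),
    "admin": frozenset({"invoice:read", "invoice:write", "tenant:manage"}),
}


@dataclass(frozen=True)
class User:
    username: str
    roles: tuple


def permissions_for(user: User) -> frozenset:
    """Union of the permissions granted by each of the user's roles."""
    granted = set()
    for role in user.roles:
        # Unknown roles grant nothing rather than raising.
        granted |= ROLE_PERMISSIONS.get(role, frozenset())
    return frozenset(granted)


def has_permission(user: User, permission: str) -> bool:
    """True if any of the user's roles grants the permission."""
    return permission in permissions_for(user)
```

Denying by default (unknown roles and unknown permissions both resolve to "no") keeps a misconfigured role from silently widening access.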


3 · ETL / ELT Data Pipeline Architecture

End-to-end batch and streaming pipeline: six source types → Kafka/Airbyte ingestion → Airflow DAG orchestration → Spark + dbt transformation → Great Expectations quality checks → Snowflake warehouse → BI / reverse-ETL serving.

[Diagram: ETL / ELT Data Pipeline Architecture]
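The orchestration idea behind this pipeline (run each stage only after its upstream stages succeed) can be sketched with the standard library's `graphlib`. This is not Airflow code; the task names and the `run_pipeline` helper are hypothetical stand-ins for the stages listed above.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names; each maps to the set of stages it depends on.
DEPENDENCIES = {
    "ingest": set(),
    "transform": {"ingest"},
    "quality_checks": {"transform"},
    "load_warehouse": {"quality_checks"},
    "serve_bi": {"load_warehouse"},
    "reverse_etl": {"load_warehouse"},
}


def run_pipeline(dependencies: dict, tasks: dict) -> list:
    """Run each task after all of its dependencies; return the names that completed."""
    completed = []
    for name in TopologicalSorter(dependencies).static_order():
        # An exception here aborts the run, so downstream tasks never start.
        tasks[name]()
        completed.append(name)
    return completed
```

A real orchestrator adds scheduling, retries, and parallel execution of independent branches; the sketch only shows the dependency ordering.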


4 · Streaming Event Pipeline — Kafka + Flink + ClickHouse

Sub-second real-time pipeline processing 500k+ events/minute: producers → Kafka topics with Schema Registry → Flink jobs (enrich, dedup, aggregate, anomaly detect) with RocksDB state → ClickHouse OLAP + Redis counters → Grafana dashboards.

[Diagram: Streaming Data Pipeline Architecture]
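The aggregation step in this pipeline can be illustrated with a tumbling-window count in plain Python. This is a simplified sketch rather than a Flink job: it assumes each event is a dict with hypothetical `ts` (event time in seconds) and `key` fields, and it ignores watermarks and late-arriving events.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds: int = 60) -> dict:
    """Count events per key in fixed, non-overlapping event-time windows.

    Returns {window_start: {key: count}} ordered by window start.
    """
    counts = defaultdict(Counter)
    for event in events:
        # Align the timestamp down to the start of its window.
        window_start = int(event["ts"] // window_seconds) * window_seconds
        counts[window_start][event["key"]] += 1
    return {start: dict(per_key) for start, per_key in sorted(counts.items())}
```

In a streaming engine the same logic runs incrementally, with per-window state kept in a state backend and windows emitted once the watermark passes their end.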


5 · Full Tech Stack Map

[Diagram: Tech Stack]


Technology Stack

Languages & Runtimes

| Category | Technologies |
| --- | --- |
| Primary | Python 3.12, SQL (PostgreSQL dialect, BigQuery SQL, Snowflake SQL) |
| Secondary | Bash / Shell, Go (basics), TypeScript (basics) |
| Python Ecosystem | asyncio, typing, pydantic, dataclasses, mypy, black, ruff |

Backend Frameworks

| Framework | Use Case | Key Libraries |
| --- | --- | --- |
| FastAPI | Async REST APIs, microservices, ML serving endpoints | Uvicorn, Gunicorn, SQLAlchemy async, Alembic, Pydantic v2, python-jose, passlib, slowapi |
| Django | Full-stack SaaS apps, admin-heavy platforms, multi-tenant systems | DRF, drf-spectacular, SimpleJWT, Celery, django-tenants, django-storages |
| Flask | Lightweight internal tools, simple webhook receivers | Flask-RESTful, Flask-SQLAlchemy |

Data Engineering

| Layer | Technologies |
| --- | --- |
| Orchestration | Apache Airflow 2.x, Prefect, Dagster |
| Batch Processing | Apache Spark (PySpark), dbt Core, Pandas, Polars |
| Stream Processing | Apache Flink, Kafka Streams, Spark Structured Streaming |
| Message Bus | Apache Kafka, AWS SQS/SNS, Redis Streams |
| CDC / Replication | Debezium, AWS DMS, Airbyte |
| Transformation | dbt (staging → intermediate → marts), custom PySpark jobs |
| Data Quality | Great Expectations, dbt Tests, Soda Core |

Databases & Storage

| Type | Technologies |
| --- | --- |
| Relational | PostgreSQL (primary), MySQL, SQLite (testing) |
| OLAP / Warehouse | Snowflake, Google BigQuery, ClickHouse, AWS Redshift |
| Lakehouse | Delta Lake, Apache Iceberg, AWS Glue Data Catalog |
| Cache / KV | Redis, Memcached |
| Search | Elasticsearch, OpenSearch |
| Document | MongoDB, DynamoDB |
| Object Store | AWS S3, GCS, MinIO |

Cloud & Infrastructure

| Category | Technologies |
| --- | --- |
| AWS | EC2, ECS, EKS, RDS/Aurora, S3, Glue, EMR, Lambda, SQS, SNS, CloudWatch, IAM |
| GCP | BigQuery, Cloud Composer (Airflow), Dataflow, Cloud Run, GCS |
| Containers | Docker, Docker Compose, Kubernetes (EKS/GKE), Helm |
| IaC | Terraform, AWS CDK, CloudFormation |
| CI/CD | GitHub Actions, GitLab CI, ArgoCD, Jenkins |

Observability & Monitoring

| Tool | Purpose |
| --- | --- |
| Prometheus | Metrics collection and alerting rules |
| Grafana | Dashboards for infrastructure, pipeline health, and API latency |
| Loki | Log aggregation and querying |
| Sentry | Error tracking and performance monitoring |
| OpenTelemetry | Distributed tracing across microservices |
| Datadog | APM and full-stack observability (enterprise projects) |
| Airflow UI / Flower | Pipeline and task monitoring |

API Design & Tooling

| Concern | Approach / Tool |
| --- | --- |
| API Style | REST (primary), GraphQL (selective), WebSocket (real-time) |
| Documentation | OpenAPI / Swagger (drf-spectacular, FastAPI native) |
| Authentication | JWT (SimpleJWT, python-jose), OAuth2, API Keys |
| Rate Limiting | Redis-backed token bucket, slowapi (FastAPI) |
| Serialization | Pydantic v2, DRF Serializers, marshmallow |
| Testing | pytest, pytest-asyncio, factory_boy, Hypothesis, httpx |
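As an illustration of the token-bucket rate limiting mentioned in the table, here is a minimal in-memory sketch. It is an assumption-laden simplification: the production approach named above is Redis-backed, whereas this `TokenBucket` class (a hypothetical name) keeps state in a single process and takes an injectable clock so it can be tested deterministically.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_rate = float(refill_rate)
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return whether the request may proceed."""
        now = self._clock()
        elapsed = now - self._updated
        # Refill lazily based on elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A shared limiter would keep one bucket per client key in Redis and perform the refill-and-consume step atomically, for example in a Lua script.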

Featured Projects

🚀 High-Throughput Event Ingestion Pipeline

Kafka → Flink → ClickHouse pipeline processing 500k+ events/minute with real-time anomaly detection and sub-second dashboard latency. Built deduplication using bloom filters and stateful windowing with RocksDB-backed Flink state.

Stack: Python · Apache Kafka · Apache Flink · ClickHouse · Redis · Grafana
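The Bloom-filter deduplication mentioned above can be sketched in plain Python. This is an illustrative stand-in, not the project's Flink implementation: the `BloomFilter` class, its default sizes, and the `deduplicate` helper are all assumptions.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, a small rate of false positives."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 5):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def deduplicate(event_ids, seen: BloomFilter):
    """Yield each event ID the filter has not (probably) seen before."""
    for event_id in event_ids:
        if event_id in seen:
            continue  # treated as a duplicate; a false positive drops a unique event
        seen.add(event_id)
        yield event_id
```

The trade-off is memory for accuracy: the filter uses a fixed number of bits regardless of how many IDs it has seen, at the cost of occasionally discarding an event that was actually new.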


🔄 Multi-Source ETL Platform

Airflow-orchestrated ETL platform ingesting from 12 source systems into Snowflake, serving a 15-person analytics team. Introduced dbt for SQL transformation governance and Great Expectations for automated data quality SLAs.

Stack: Python · Apache Airflow · dbt · Snowflake · Great Expectations · AWS S3 · Airbyte
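The idea of an automated data quality gate can be sketched without any library. The checks below are hypothetical helpers written for illustration; they do not use or mirror the Great Expectations API, and the column names in the usage example are made up.

```python
from functools import partial


def check_not_null(rows, column):
    """Fail on rows where the column is missing or None."""
    failures = [i for i, row in enumerate(rows) if row.get(column) is None]
    return {"check": f"not_null({column})", "passed": not failures, "failed_rows": failures}


def check_between(rows, column, low, high):
    """Fail on rows whose value is None or outside [low, high]."""
    failures = [
        i
        for i, row in enumerate(rows)
        if row.get(column) is None or not (low <= row[column] <= high)
    ]
    return {
        "check": f"between({column}, {low}, {high})",
        "passed": not failures,
        "failed_rows": failures,
    }


def run_quality_gate(rows, checks):
    """Run every check; raise ValueError naming the failed checks, else return results."""
    results = [check(rows) for check in checks]
    failed = [result["check"] for result in results if not result["passed"]]
    if failed:
        raise ValueError("data quality gate failed: " + ", ".join(failed))
    return results
```

A possible usage, with assumed column names:

```python
checks = [
    partial(check_not_null, column="order_id"),
    partial(check_between, column="amount", low=0, high=10_000),
]
run_quality_gate([{"order_id": 1, "amount": 25.0}], checks)
```

Raising on failure lets an orchestrator mark the task as failed so that downstream loads do not run on bad data.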


⚡ FastAPI SaaS Backend

Async REST API backend for a B2B SaaS product with 100k+ daily active users. Implemented async SQLAlchemy with connection pooling, a Redis cache-aside pattern, and Celery for background report generation. P99 latency stays under 120 ms.

Stack: FastAPI · PostgreSQL · Redis · Celery · AWS ECS · Terraform · GitHub Actions
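The cache-aside pattern mentioned above works as follows: read from the cache first, fall back to the database on a miss, then populate the cache. Below is a minimal sketch in which a small in-memory `TTLCache` stands in for Redis; the class, the `get_report` function, and the key format are hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for Redis with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries are removed lazily
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._store.pop(key, None)


def get_report(report_id, cache, load_from_db, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the loader, then cache the result."""
    key = f"report:{report_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(report_id)
    cache.set(key, value, ttl_seconds)
    return value
```

On writes, the usual companion step is to update the database and then delete the cached key, so the next read repopulates it with fresh data.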


🏢 Django Multi-Tenant Platform

PostgreSQL schema-per-tenant Django application supporting 80+ enterprise tenants. Designed a custom TenantMiddleware for schema routing, a role-based permission system, and a Stripe billing integration with webhook handling.

Stack: Django · DRF · PostgreSQL · Celery · Stripe · Docker · AWS RDS
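The schema-routing step that a tenant middleware performs can be sketched as two small functions: resolve the request host to a schema, then build the `SET search_path` statement. This is an assumed illustration, not the project's TenantMiddleware; the tenant registry, the `example.com` base domain, and the function names are hypothetical.

```python
import re

# Hypothetical tenant registry: subdomain -> PostgreSQL schema name.
TENANT_SCHEMAS = {"acme": "tenant_acme", "globex": "tenant_globex"}

_SCHEMA_NAME = re.compile(r"^[a-z_][a-z0-9_]*$")


def schema_for_host(host: str, base_domain: str = "example.com") -> str:
    """Map a request Host header to a schema; raise LookupError for unknown hosts."""
    hostname = host.split(":")[0].lower()  # drop any port
    if hostname == base_domain:
        return "public"
    suffix = "." + base_domain
    if hostname.endswith(suffix):
        subdomain = hostname[: -len(suffix)]
        if subdomain in TENANT_SCHEMAS:
            return TENANT_SCHEMAS[subdomain]
    raise LookupError(f"unknown tenant host: {hostname}")


def search_path_sql(schema: str) -> str:
    """Build the statement that scopes a connection to one tenant schema."""
    # Identifiers cannot be bound as query parameters, so validate strictly.
    if not _SCHEMA_NAME.match(schema):
        raise ValueError(f"invalid schema name: {schema!r}")
    return f'SET search_path TO "{schema}", public'
```

A middleware would call these at the start of each request and execute the resulting statement on the request's database connection. Rejecting unknown hosts, rather than falling back to a default schema, avoids serving one tenant's data under another's hostname.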


Popular repositories

  1. frontend-challenges Public

    Forked from felipefialho/frontend-challenges

    💥 Listing some playful open-source's challenges of jobs to test your knowledge

  2. portfolio-estudo Public

    Study project taken from the channel: Online Mentor

    CSS

  3. curso-python-3 Public

    Python 3 - Quick Course (cod3r)

    Python

  4. cnoturno Public

  5. rafaballerini Public

    Forked from rafaballerini/rafaballerini

  6. flutter-coder Public

    Dart