Skip to content

HowRuck/GTFSynq

Repository files navigation

GTFSynq

Spring Java Gradle TimescaleDB Apache Kafka Zed

🎓 Educational Project: GTFSynq is an educational project to explore modern Java, Spring Boot, time-series databases, and real-time data processing. It's functional, but not all features are implemented yet, and the code may not always follow production-grade patterns or performance optimizations.

GTFSynq is a modern Spring Boot transit data platform for ingesting, processing, storing, and analyzing GTFS-RT (General Transit Feed Specification - Real-Time) feeds and GTFS CSV static feed data

It is built with Gradle, runs on Java 26, and is designed to work with Kafka and TimescaleDB for real-time transit data processing and time-series storage

Features

  • Real-time processing: Ingest and process GTFS-RT feeds with low latency
  • Static feed support: Model GTFS CSV data alongside realtime transit updates
  • Time-series storage: Store transit data efficiently in TimescaleDB
  • Kafka integration: Stream transit payloads through Apache Kafka
  • Protobuf support: Encode and decode GTFS-RT messages efficiently
  • Observability: Spring Boot Actuator with Prometheus metrics
  • Docker-based runtime: Run the full stack with a single Compose command

Tech Stack

Component Technology Purpose
Backend Spring Boot 4 Application framework
Build tool Gradle Build, test, and packaging
Language Java 26 Application language
Streaming Apache Kafka Event streaming and transport
Database TimescaleDB Time-series PostgreSQL storage
Serialization Protobuf 4 GTFS-RT message encoding
Monitoring Spring Actuator + Prometheus Health and metrics

Prerequisites

To run the project locally, you need:

  • Java 26
  • Gradle Wrapper
    The repository includes a Gradle wrapper, so use ./gradlew for all build and run commands.
  • Docker and Docker Compose if you want to run Kafka and TimescaleDB in containers.

Quick Start

1. Clone the repository

git clone https://github.com/evogel/GTFSynq.git
cd GTFSynq

2. Build the application with Gradle

./gradlew clean bootJar

This creates the executable application JAR at:

build/libs/app.jar

3. Run the application locally

If Kafka and TimescaleDB are available on your machine, start the app with:

./gradlew bootRun

The application listens on:

  • Application: http://localhost:8888
  • Actuator: http://localhost:8888/actuator
  • Prometheus metrics: http://localhost:8888/actuator/prometheus

Running the Full Stack with Docker

The repository includes a Docker Compose setup that starts:

  • the application
  • Kafka
  • TimescaleDB

To build and start everything:

docker compose -f docker/docker-compose.yaml up -d --build

The app is exposed on:

http://localhost:8888

Stopping the stack

docker compose -f docker/docker-compose.yaml down

Viewing logs

docker compose -f docker/docker-compose.yaml logs -f app

Local Development Flow

A typical local development setup looks like this:

  1. Start infrastructure services with Docker Compose.
  2. Run the app with ./gradlew bootRun.
  3. Make changes in src/main/java.
  4. Re-run ./gradlew test and ./gradlew bootJar as needed.

If you want to run only the infrastructure containers and keep the app on your host machine, start Kafka and TimescaleDB separately using the same Compose file, then run the app locally with ./gradlew.

Project Structure

GTFSynq/
├── build.gradle
├── settings.gradle
├── docker/
│   ├── Dockerfile
│   ├── docker-compose.yaml
│   └── entrypoint.sh
├── modules/
│   ├── shared/
│   ├── ingest-app/
│   ├── store-app/
│   └── api-app/
├── docs/
│   └── architecture.md
├── src/
│   ├── main/
│   │   ├── java/
│   │   ├── proto/
│   │   └── resources/
│   │       ├── application.yaml
│   │       └── db/migration/
│   └── test/
└── README.md

Modules

modules/shared

Contains code reused by multiple apps:

  • GTFS domain models and DTOs
  • protobuf-generated types
  • message envelope encoding/decoding
  • hashing utilities
  • off-heap state store utilities
  • GTFS formatting helpers

modules/ingest-app

Responsible for getting data into Kafka:

  • scheduled GTFS-RT polling
  • native GTFS-RT parsing
  • Kafka publishing
  • feed/source configuration

modules/store-app

Responsible for getting data out of Kafka and into PostgreSQL/TimescaleDB:

  • Kafka Streams consumer
  • deduplication
  • batch persistence
  • Flyway migrations
  • JDBC-based repository writes

modules/api-app

Reserved for the REST API layer:

  • Spring Boot application entry point
  • PostgreSQL-backed read access
  • HTTP endpoints and query models

Runtime Flow

GTFS-RT / GTFS CSV sources
        ↓
   ingest-app
        ↓
      Kafka
        ↓
    store-app
        ↓
 PostgreSQL / TimescaleDB
        ↓
      api-app

Configuration

Key runtime configuration lives in src/main/resources/application.yaml.

Important defaults:

  • Application port: 8888
  • Kafka bootstrap server: localhost:9092
  • PostgreSQL / TimescaleDB: localhost:5432
  • Database name: gtfsynq

When running through Docker Compose, these values are overridden with container hostnames.

Database Migrations

Database schema migrations are managed with Flyway and are located in:

src/main/resources/db/migration

Migrations are applied automatically on startup when Flyway is enabled.

Build and Test Commands

Run tests

./gradlew test

Build the application JAR

./gradlew bootJar

Clean and build from scratch

./gradlew clean build

Run the app from Gradle

./gradlew bootRun

Docker Image Build

The Docker image is built in two stages:

  1. Build the application with the Gradle wrapper inside a Gradle/JDK 26 container.
  2. Copy the generated JAR into a lightweight Java 26 runtime image.

This means the container image always reflects the current Gradle build output from build/libs/app.jar.

Monitoring

The application exposes Actuator endpoints for health and metrics.

Useful endpoints:

  • GET /actuator
  • GET /actuator/health
  • GET /actuator/prometheus

Educational Project

GTFSynq is primarily intended for educational and exploratory use. It is a good place to experiment with:

  • modern Spring Boot development
  • real-time transit feed ingestion
  • Kafka streaming
  • Flyway database migrations
  • TimescaleDB time-series modeling
  • Protobuf-based transport formats

License

This project is licensed under the terms of the repository's LICENSE file.

About

A modern application built with Spring Boot and Angular that processes GTFS (General Transit Feed Specification) data

Resources

License

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors