🎓 Educational Project: GTFSynq is an educational project to explore modern Java, Spring Boot, time-series databases, and real-time data processing. It's functional, but not all features are implemented yet, and the code may not always follow production-grade patterns or performance optimizations.
GTFSynq is a modern Spring Boot transit data platform for ingesting, processing, storing, and analyzing GTFS-RT (General Transit Feed Specification - Real-Time) feeds and GTFS CSV static feed data.
It is built with Gradle, runs on Java 26, and is designed to work with Kafka and TimescaleDB for real-time transit data processing and time-series storage.
- Real-time processing: Ingest and process GTFS-RT feeds with low latency
- Static feed support: Model GTFS CSV data alongside real-time transit updates
- Time-series storage: Store transit data efficiently in TimescaleDB
- Kafka integration: Stream transit payloads through Apache Kafka
- Protobuf support: Encode and decode GTFS-RT messages efficiently
- Observability: Spring Boot Actuator with Prometheus metrics
- Docker-based runtime: Run the full stack with a single Compose command
| Component | Technology | Purpose |
|---|---|---|
| Backend | Spring Boot 4 | Application framework |
| Build tool | Gradle | Build, test, and packaging |
| Language | Java 26 | Application language |
| Streaming | Apache Kafka | Event streaming and transport |
| Database | TimescaleDB | Time-series PostgreSQL storage |
| Serialization | Protobuf 4 | GTFS-RT message encoding |
| Monitoring | Spring Actuator + Prometheus | Health and metrics |
To run the project locally, you need:
- Java 26
- Gradle Wrapper: the repository includes a Gradle wrapper, so use ./gradlew for all build and run commands.
- Docker and Docker Compose, if you want to run Kafka and TimescaleDB in containers.
git clone https://github.com/evogel/GTFSynq.git
cd GTFSynq
./gradlew clean bootJar
This creates the executable application JAR at:
build/libs/app.jar
If Kafka and TimescaleDB are available on your machine, start the app with:
./gradlew bootRun
The application listens on:
- Application: http://localhost:8888
- Actuator: http://localhost:8888/actuator
- Prometheus metrics: http://localhost:8888/actuator/prometheus
The repository includes a Docker Compose setup that starts:
- the application
- Kafka
- TimescaleDB
To build and start everything:
docker compose -f docker/docker-compose.yaml up -d --build
The app is exposed on:
http://localhost:8888
To stop the stack:
docker compose -f docker/docker-compose.yaml down
To follow the application logs:
docker compose -f docker/docker-compose.yaml logs -f app
A typical local development setup looks like this:
- Start infrastructure services with Docker Compose.
- Run the app with ./gradlew bootRun.
- Make changes in src/main/java.
- Re-run ./gradlew test and ./gradlew bootJar as needed.
If you want to run only the infrastructure containers and keep the app on your host machine, start Kafka and TimescaleDB separately using the same Compose file, then run the app locally with ./gradlew.
GTFSynq/
├── build.gradle
├── settings.gradle
├── docker/
│   ├── Dockerfile
│   ├── docker-compose.yaml
│   └── entrypoint.sh
├── modules/
│   ├── shared/
│   ├── ingest-app/
│   ├── store-app/
│   └── api-app/
├── docs/
│   └── architecture.md
├── src/
│   ├── main/
│   │   ├── java/
│   │   ├── proto/
│   │   └── resources/
│   │       ├── application.yaml
│   │       └── db/migration/
│   └── test/
└── README.md
The shared module (modules/shared) contains code reused by multiple apps:
- GTFS domain models and DTOs
- protobuf-generated types
- message envelope encoding/decoding
- hashing utilities
- off-heap state store utilities
- GTFS formatting helpers
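As an illustration of what a hashing utility of this kind might look like, here is a minimal sketch that computes a SHA-256 content hash of a raw payload using only the JDK. The class name PayloadHasher and the method sha256Hex are hypothetical, and the choice of SHA-256 is an assumption; the repository may use a different algorithm or API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper for illustration; not the project's actual hashing utility.
final class PayloadHasher {

    private PayloadHasher() {}

    /** Returns the lowercase hex SHA-256 digest of the given payload bytes. */
    static String sha256Hex(byte[] payload) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(digest.digest(payload));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        // Made-up payload standing in for an encoded GTFS-RT message.
        byte[] payload = "vehicle-42 position update".getBytes(StandardCharsets.UTF_8);
        System.out.println(sha256Hex(payload));
    }
}
```

A stable content hash like this can serve as a key for spotting payloads that were already seen, which is one plausible use given that the pipeline also performs deduplication.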
The ingest-app module is responsible for getting data into Kafka:
- scheduled GTFS-RT polling
- native GTFS-RT parsing
- Kafka publishing
- feed/source configuration
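The poll-then-publish loop described above can be sketched with plain JDK scheduling. This is only an outline under stated assumptions: FeedPoller is a hypothetical name, the fetcher stands in for an HTTP download of a GTFS-RT feed, and the publisher stands in for a Kafka producer. The real module may well use Spring's scheduling support instead.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch: FeedPoller is an illustrative name, not a class from the repository.
final class FeedPoller implements AutoCloseable {

    private final Supplier<byte[]> fetcher;    // assumption: e.g. an HTTP GET on a GTFS-RT URL
    private final Consumer<byte[]> publisher;  // assumption: e.g. a Kafka producer send
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    FeedPoller(Supplier<byte[]> fetcher, Consumer<byte[]> publisher) {
        this.fetcher = fetcher;
        this.publisher = publisher;
    }

    /** Fetches one payload and publishes it. Returns false on an empty payload or a failure. */
    boolean pollOnce() {
        try {
            byte[] payload = fetcher.get();
            if (payload == null || payload.length == 0) {
                return false;
            }
            publisher.accept(payload);
            return true;
        } catch (RuntimeException e) {
            // Catch here: an exception escaping a scheduled task cancels all later runs.
            System.err.println("Poll failed: " + e.getMessage());
            return false;
        }
    }

    /** Polls immediately, then waits the given interval after each poll finishes. */
    void start(Duration interval) {
        scheduler.scheduleWithFixedDelay(
                this::pollOnce, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        try (FeedPoller poller = new FeedPoller(
                () -> new byte[] {1, 2, 3},  // stand-in for a real feed download
                payload -> System.out.println("Published " + payload.length + " bytes"))) {
            poller.start(Duration.ofMillis(200));
            Thread.sleep(500);
        }
    }
}
```

Keeping the fetch and publish steps behind small functional interfaces makes the loop easy to exercise without a network connection or a running broker.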
The store-app module is responsible for getting data out of Kafka and into PostgreSQL/TimescaleDB:
- Kafka Streams consumer
- deduplication
- batch persistence
- Flyway migrations
- JDBC-based repository writes
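To make the deduplication and batch persistence steps concrete, here is a small in-memory sketch. It assumes each record carries a key (for example a content hash) and that a batch is handed to some writer once it reaches a fixed size. DedupBatcher and its methods are hypothetical names, and the writer callback is a stand-in for a JDBC batch insert, not the project's repository code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch; not the store-app's actual implementation.
final class DedupBatcher {

    // Unbounded for simplicity; a long-running service would need to expire old keys.
    private final Set<String> seenKeys = new HashSet<>();
    private final List<String> buffer = new ArrayList<>();
    private final int batchSize;
    private final Consumer<List<String>> batchWriter;  // assumption: e.g. a JDBC batch insert

    DedupBatcher(int batchSize, Consumer<List<String>> batchWriter) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.batchWriter = batchWriter;
    }

    /** Buffers the row unless its key was seen before. Returns false for duplicates. */
    boolean offer(String key, String row) {
        if (!seenKeys.add(key)) {
            return false;
        }
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
        return true;
    }

    /** Writes any buffered rows as one batch and clears the buffer. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        batchWriter.accept(List.copyOf(buffer));
        buffer.clear();
    }

    public static void main(String[] args) {
        DedupBatcher batcher =
                new DedupBatcher(2, batch -> System.out.println("Writing " + batch));
        batcher.offer("hash-a", "vehicle 1");
        batcher.offer("hash-a", "vehicle 1");  // duplicate key, dropped
        batcher.offer("hash-b", "vehicle 2");  // fills the first batch
        batcher.offer("hash-c", "vehicle 3");
        batcher.flush();                       // writes the remaining row
    }
}
```

Writing rows in batches trades a little latency for fewer database round trips, which is the usual motivation for batching writes to a time-series table.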
The api-app module is reserved for the REST API layer:
- Spring Boot application entry point
- PostgreSQL-backed read access
- HTTP endpoints and query models
GTFS-RT / GTFS CSV sources
↓
ingest-app
↓
Kafka
↓
store-app
↓
PostgreSQL / TimescaleDB
↓
api-app
Key runtime configuration lives in src/main/resources/application.yaml.
Important defaults:
- Application port: 8888
- Kafka bootstrap server: localhost:9092
- PostgreSQL / TimescaleDB: localhost:5432
- Database name: gtfsynq
When running through Docker Compose, these values are overridden with container hostnames.
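The idea of a local default that a container environment can override can be illustrated in plain Java. Spring Boot performs this resolution itself through its externalized configuration support, so the sketch below only shows the fallback pattern. The class RuntimeDefaults and the environment variable names are hypothetical and are not taken from the repository's Compose file.

```java
import java.util.Map;

// Hypothetical illustration of "default unless overridden"; Spring Boot does this internally.
final class RuntimeDefaults {

    private RuntimeDefaults() {}

    /** Returns the value of the named variable, or the fallback if it is missing or blank. */
    static String resolve(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return (value == null || value.isBlank()) ? fallback : value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        // Variable names below are assumptions made for this example.
        System.out.println("Kafka bootstrap server: "
                + resolve(env, "KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"));
        System.out.println("Database address: "
                + resolve(env, "DB_ADDRESS", "localhost:5432"));
    }
}
```

Run on a host with neither variable set, the program prints the localhost defaults listed above; inside a container the same code would pick up whatever the environment supplies.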
Database schema migrations are managed with Flyway and are located in:
src/main/resources/db/migration
Migrations are applied automatically on startup when Flyway is enabled.
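With Flyway's default settings, versioned SQL migrations are discovered by file name, using the form V<version>__<description>.sql with two underscores before the description. The sketch below checks names against that default pattern. It is a simplified illustration that covers only versioned migrations, not repeatable ones, and MigrationNames and the sample file names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified check for Flyway's default versioned-migration naming; names are illustrative.
final class MigrationNames {

    // V, a version made of digit groups separated by dots or underscores,
    // a double underscore, a description, and the .sql suffix.
    private static final Pattern VERSIONED =
            Pattern.compile("^V(\\d+(?:[._]\\d+)*)__(.+)\\.sql$");

    private MigrationNames() {}

    /** Returns true if the file name follows the default versioned-migration pattern. */
    static boolean isVersionedMigration(String fileName) {
        return VERSIONED.matcher(fileName).matches();
    }

    /** Extracts the version with underscores normalized to dots, e.g. "2_1" becomes "2.1". */
    static String versionOf(String fileName) {
        Matcher matcher = VERSIONED.matcher(fileName);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a versioned migration: " + fileName);
        }
        return matcher.group(1).replace('_', '.');
    }

    public static void main(String[] args) {
        // Hypothetical file names, not the repository's actual migrations.
        String[] names = {
            "V1__create_tables.sql", "V2_1__add_index.sql", "V3_create_views.sql"
        };
        for (String name : names) {
            System.out.println(name + " valid=" + isVersionedMigration(name));
        }
    }
}
```

A file with a single underscore after the version, like the third sample, does not match the default pattern, which is a common reason a migration is not picked up.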
- Run the tests: ./gradlew test
- Build the executable JAR: ./gradlew bootJar
- Clean and rebuild: ./gradlew clean build
- Run the application: ./gradlew bootRun
The Docker image is built in two stages:
- Build the application with the Gradle wrapper inside a Gradle/JDK 26 container.
- Copy the generated JAR into a lightweight Java 26 runtime image.
This means the container image always reflects the current Gradle build output from build/libs/app.jar.
The application exposes Actuator endpoints for health and metrics.
Useful endpoints:
- GET /actuator
- GET /actuator/health
- GET /actuator/prometheus
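A quick way to probe the health endpoint from code is the JDK's built-in HTTP client. The sketch below assumes the application is running locally on port 8888 and relies on Actuator's default behavior of answering 200 when the status is UP and 503 when it is DOWN or OUT_OF_SERVICE. HealthCheck is a hypothetical helper, not part of the project.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for probing the Actuator health endpoint.
final class HealthCheck {

    private HealthCheck() {}

    /** Builds the health endpoint URI from a base URL, tolerating a trailing slash. */
    static URI healthUri(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(trimmed + "/actuator/health");
    }

    /** Treats HTTP 200 as healthy, matching Actuator's default status mapping. */
    static boolean isHealthy(int statusCode) {
        return statusCode == 200;
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(healthUri("http://localhost:8888"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            String verdict = isHealthy(response.statusCode()) ? "healthy" : "not healthy";
            System.out.println(verdict + ": " + response.body());
        } catch (IOException e) {
            // Connection refused, timeout, and similar failures end up here.
            System.out.println("Application not reachable: " + e.getMessage());
        }
    }
}
```

Checking the status code rather than parsing the JSON body keeps the probe independent of how much health detail the application is configured to expose.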
GTFSynq is primarily intended for educational and exploratory use. It is a good place to experiment with:
- modern Spring Boot development
- real-time transit feed ingestion
- Kafka streaming
- Flyway database migrations
- TimescaleDB time-series modeling
- Protobuf-based transport formats
This project is licensed under the terms of the repository's LICENSE file.