Merged
171 changes: 171 additions & 0 deletions .github/fixes-docs/FIX_SCHEMA_FILES_FORMAT.md
@@ -0,0 +1,171 @@
# Fix Summary: Schema Files Format Issue

## Issue Reference
**Original Issue:** [Bug]: psql/schema.sql is actually MySQL
**Issue Type:** API/Export Issue

## Executive Summary

The file `psql/schema.sql` was generated with MySQL's `mysqldump`, so it contained MySQL syntax rather than PostgreSQL syntax. `sqlserver/schema.sql` had the same problem. This fix generates `psql/schema.sql` with `pg_dump --schema-only` and removes the SQL Server schema file, which is not needed.

---

## Problem Statement

### What was wrong?
1. `psql/schema.sql` was initially generated using `mysqldump` (MySQL format)
2. `sqlserver/schema.sql` was also generated using `mysqldump` (MySQL format)
3. The file headers identified both files as MySQL dumps:
```sql
-- MySQL dump 10.13 Distrib 8.0.44, for Linux (x86_64)
```

### Why was this a problem?
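- Users who downloaded `psql/schema.sql` expecting PostgreSQL SQL received a file that PostgreSQL cannot import
- MySQL-specific syntax such as `AUTO_INCREMENT` and backtick-quoted identifiers is rejected by PostgreSQL, and MySQL conditional comments (`/*!40101 ... */`) carry session settings that PostgreSQL skips as ordinary comments
- SQL Server has its own dialect (T-SQL), so a MySQL dump is equally unusable there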
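
One way to catch this class of mistake early is to scan a schema file for MySQL-only markers. The sketch below is an illustration, not part of the fix: the function name `mysql_markers` is hypothetical, and the three patterns are an assumed, non-exhaustive sample of what a `mysqldump` file contains.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the workflow): print each MySQL-only
# marker found in the SQL file given as the first argument.
mysql_markers() {
  local file="$1"
  # mysqldump banner, e.g. "-- MySQL dump 10.13 ..."
  grep -q '^-- MySQL dump' "$file" && echo "mysqldump header"
  # AUTO_INCREMENT is MySQL-only; PostgreSQL uses identity columns or sequences
  grep -qi 'AUTO_INCREMENT' "$file" && echo "AUTO_INCREMENT"
  # Backtick-quoted identifiers; PostgreSQL quotes identifiers with double quotes
  grep -q '`' "$file" && echo "backtick identifiers"
  return 0
}
```

Running `mysql_markers psql/schema.sql` and getting no output suggests the file is not a MySQL dump, although it does not prove the file is valid PostgreSQL.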

---

## Changes Made (Final Solution)

### 1. Added proper PostgreSQL schema export
**File:** `.github/workflows/export.yml`
**Change:** Added `pg_dump --schema-only` command to generate proper PostgreSQL schema

```yaml
- name: Export PostgreSQL SQL
  env:
    PGPASSWORD: postgres
  run: |
    mkdir -p psql
    # Export PostgreSQL schema only (no data)
    pg_dump --dbname=postgresql://postgres:postgres@localhost/world -Fp --schema-only --clean --if-exists --no-owner --no-acl > psql/schema.sql
    # Export individual tables with data
    pg_dump --dbname=postgresql://postgres:postgres@localhost/world -Fp --inserts --clean --if-exists --no-owner --no-acl -t regions > psql/regions.sql
    # ... (other tables)
```

### 2. Removed SQL Server schema generation
**File:** `.github/workflows/export.yml`
**Change:** Removed the line that wrote a MySQL dump to `sqlserver/schema.sql`

```diff
 - name: Generate Schema Files
   run: |
     echo "📋 Generating schema files..."
     # Export MySQL schema only (no data)
     mysqldump -uroot -proot --no-data --single-transaction --add-drop-table world > sql/schema.sql
-    # Also export for other formats
-    mysqldump -uroot -proot --no-data --single-transaction --add-drop-table world > sqlserver/schema.sql
     echo "✅ Schema files generated"
```

### 3. Deleted SQL Server schema file
**File:** `sqlserver/schema.sql`
**Action:** Removed from repository (MySQL format was incorrect for SQL Server)

### 4. Updated .gitignore
**File:** `.gitignore`
**Change:** Updated to track `psql/schema.sql` and to stop tracking `sqlserver/schema.sql`

```diff
 # Keep schema files - they are small and useful
 !sql/schema.sql
-!sqlserver/schema.sql
-# Note: psql/schema.sql excluded - PostgreSQL schema is properly exported via pg_dump commands
+!psql/schema.sql
+# Note: sqlserver/schema.sql excluded - not needed for SQL Server exports
```
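
To confirm that the negation rules behave as intended, `git check-ignore` can be run against each path. The sketch below builds a throwaway repository to do so. The broad `*.sql` rule is an assumption standing in for whatever ignore rule precedes these lines in the real `.gitignore`, and `ignore_status` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the negation rules in a scratch repository.
# ASSUMPTION: a broad "*.sql" rule precedes the "!" lines; the real .gitignore may differ.
repo="$(mktemp -d)"
git init -q "$repo"
cat > "$repo/.gitignore" <<'EOF'
*.sql
!sql/schema.sql
!psql/schema.sql
EOF

# Report whether git would ignore a path.
# `git check-ignore -q` exits 0 when the path is ignored and non-zero otherwise.
ignore_status() {
  if git -C "$repo" check-ignore -q "$1"; then
    echo "ignored"
  else
    echo "not ignored"
  fi
}

ignore_status psql/schema.sql
ignore_status sqlserver/schema.sql
```

Under these assumed rules, `psql/schema.sql` should be reported as not ignored and `sqlserver/schema.sql` as ignored, which matches the intent of the change.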

---

## Rationale

### PostgreSQL Schema Export
Using `pg_dump --schema-only` provides:
- Proper PostgreSQL-compatible SQL syntax
- CREATE TABLE statements with PostgreSQL data types
- Correct constraints and foreign keys for PostgreSQL
- No data, only schema definitions

### SQL Server
The individual SQL Server table exports already include their schema. A separate MySQL-formatted schema file would be incompatible and misleading.

---

## Directory Structure (After Fix)

### psql/ (PostgreSQL)
```
psql/
├── schema.sql (PostgreSQL format, schema only - NEW)
├── regions.sql (PostgreSQL format, with data)
├── subregions.sql (PostgreSQL format, with data)
├── countries.sql (PostgreSQL format, with data)
├── states.sql (PostgreSQL format, with data)
├── cities.sql.gz (PostgreSQL format, compressed)
└── world.sql.gz (PostgreSQL format, complete DB)
```

### sqlserver/ (SQL Server)
```
sqlserver/
├── regions.sql (SQL Server format, with data)
├── subregions.sql (SQL Server format, with data)
├── countries.sql (SQL Server format, with data)
├── states.sql (SQL Server format, with data)
├── cities.sql.gz (SQL Server format, compressed)
└── world.sql.gz (SQL Server format, complete DB)
```
**Note:** No `schema.sql` file - schema is included in the individual table exports.

### sql/ (MySQL)
```
sql/
├── schema.sql (MySQL format, schema only)
├── regions.sql (MySQL format, with data)
└── ... (other files)
```

---

## Validation

### Files Checked
- ✅ YAML syntax validation passed for `export.yml`
- ✅ `.gitignore` properly configured
- ✅ PostgreSQL schema export command uses correct flags
- ✅ MySQL-formatted SQL Server schema file removed

### Command Flags Explanation
- `--schema-only`: Export only schema (CREATE statements), no data
- `-Fp`: Plain text format
- `--clean`: Include DROP statements before CREATE
- `--if-exists`: Use IF EXISTS with DROP statements
- `--no-owner`: Don't include ownership commands
- `--no-acl`: Don't include access privileges

---

## Testing Recommendations

When the workflow runs:
1. Verify `psql/schema.sql` is generated with PostgreSQL syntax
2. Verify `sqlserver/schema.sql` is NOT generated
3. Confirm schema file can be imported into PostgreSQL: `psql -U postgres -d testdb -f psql/schema.sql`
4. Check that CREATE TABLE statements use PostgreSQL data types (e.g., `integer`, `character varying`, `timestamp`)
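
The first two checks above could be scripted along these lines. This is a sketch under stated assumptions, not an existing workflow step: `verify_exports` is a hypothetical name, and the check relies on the `-- PostgreSQL database dump` banner that `pg_dump` writes at the top of plain-text output. It does not cover the import test or the data-type check.

```shell
#!/usr/bin/env bash
# Sketch of post-export checks; verify_exports is a hypothetical helper.
# Argument: the directory that contains the psql/ and sqlserver/ folders.
verify_exports() {
  local root="${1:-.}"
  # Check 1: the PostgreSQL schema exists and carries pg_dump's banner.
  if ! grep -q '^-- PostgreSQL database dump' "$root/psql/schema.sql" 2>/dev/null; then
    echo "FAIL: psql/schema.sql missing or not a pg_dump file"
    return 1
  fi
  # Check 2: no SQL Server schema file is produced any more.
  if [ -e "$root/sqlserver/schema.sql" ]; then
    echo "FAIL: sqlserver/schema.sql should not exist"
    return 1
  fi
  echo "OK"
}
```

Calling `verify_exports .` from the repository root after the export step would print `OK` or the first failing check, and the non-zero return value could be used to fail the job.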

---

## Data Sources & References

- GitHub Issue: [Bug]: psql/schema.sql is actually MySQL
- Maintainer feedback: Add proper PostgreSQL schema export, remove SQL Server schema
- PostgreSQL pg_dump documentation: https://www.postgresql.org/docs/current/app-pgdump.html
- MySQL mysqldump documentation: https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html

---

**Fix completed:** January 11, 2026
**Updated based on maintainer feedback:** Added proper PostgreSQL schema export with `pg_dump --schema-only`, removed SQL Server schema file
6 changes: 3 additions & 3 deletions .github/workflows/export.yml
@@ -224,16 +224,16 @@ jobs:
 echo "📋 Generating schema files..."
 # Export MySQL schema only (no data)
 mysqldump -uroot -proot --no-data --single-transaction --add-drop-table world > sql/schema.sql
-# Also export for other formats
-mysqldump -uroot -proot --no-data --single-transaction --add-drop-table world > psql/schema.sql
-mysqldump -uroot -proot --no-data --single-transaction --add-drop-table world > sqlserver/schema.sql
 echo "✅ Schema files generated"

 - name: Export PostgreSQL SQL
 env:
 PGPASSWORD: postgres
 run: |
 mkdir -p psql
+# Export PostgreSQL schema only (no data)
+pg_dump --dbname=postgresql://postgres:postgres@localhost/world -Fp --schema-only --clean --if-exists --no-owner --no-acl > psql/schema.sql
+# Export individual tables with data
 pg_dump --dbname=postgresql://postgres:postgres@localhost/world -Fp --inserts --clean --if-exists --no-owner --no-acl -t regions > psql/regions.sql
 pg_dump --dbname=postgresql://postgres:postgres@localhost/world -Fp --inserts --clean --if-exists --no-owner --no-acl -t subregions > psql/subregions.sql
 pg_dump --dbname=postgresql://postgres:postgres@localhost/world -Fp --inserts --clean --if-exists --no-owner --no-acl -t countries > psql/countries.sql
2 changes: 1 addition & 1 deletion .gitignore
@@ -52,7 +52,7 @@ csv/translations.csv
 # Keep schema files - they are small and useful
 !sql/schema.sql
 !psql/schema.sql
-!sqlserver/schema.sql
+# Note: sqlserver/schema.sql excluded - not needed for SQL Server exports

 # Always track ALL compressed files (.gz)
 !*.sql.gz
2 changes: 1 addition & 1 deletion bin/db/schema.sql
@@ -198,6 +198,6 @@ CREATE TABLE `cities` (
 /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
 /*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

--- Dump completed on 2025-12-13 9:08:52
+-- Dump completed on 2026-01-11 19:44:09

 SET FOREIGN_KEY_CHECKS=1;
2 changes: 1 addition & 1 deletion contributions/cities/BQ.json
@@ -327,4 +327,4 @@
 "flag": 1,
 "wikiDataId": "Q2511504"
 }
-]
+]
2 changes: 1 addition & 1 deletion contributions/countries/countries.json
@@ -16747,4 +16747,4 @@
 "flag": 1,
 "wikiDataId": "Q26273"
 }
-]
+]