GenAI S2T Energy Demo

Demo: Using GenAI to create Source-to-Target documentation for SQL, Python, DAX, and M code

⚠️ Please note: This repo is tailored for Business Intelligence Analysts who work with Power BI and SQL, and less frequently with Python and Git.

📋 Table of Contents

Purpose
Data Model
Prerequisites
Quick Start
Project Structure
Setup Instructions
Azure Data Studio Setup
Running Scripts
Power BI Report
Troubleshooting
License
Contributing

🎯 Purpose

Purpose 1: GenAI Documentation Demo

The main purpose of this simple data model is to serve as a backdrop for demonstrating how to build Source-to-Target (S2T) documentation with GenAI for:

🐍 Python code
🗃️ SQL code
Ⓜ️ M (Power Query) code
📊 DAX code

⚠️ Important: Due to this demo purpose, some data flow and development rules were bent. Please focus on the use case rather than the data model and Power BI report—they are background only.

Purpose 2: Portable BI Prototyping

This repo may be useful for you if:

✅ You need a quick way to prototype with Power BI
✅ You need a SQL database without server installation
✅ You need to ingest various types of data (CSV, PDF, etc.)
✅ Everything needs to be portable (no database server required)

Solution: DuckDB, a lightweight, in-process SQL database that runs inside your Python process. No server to install: the whole database lives in a single file.

📊 Data Model

💟 Credits

The demo model uses transformed data from:

https://github.com/owid/energy-data

https://ourworldindata.org/energy

📦 Prerequisites

Tool	Required	Download
Python 3.11+	✅ Yes	python.org
Power BI Desktop	✅ Yes	Microsoft Store
Azure Data Studio	Optional	Microsoft
VS Code	Optional	code.visualstudio.com
Git	Optional	git-scm.com

🚀 Quick Start

To recreate the prototype setup, follow the steps below:

1. Clone repo

git clone https://github.com/datameisterpl/genai-s2t-energy-demo-public.git genai-s2t-energy-demo
cd genai-s2t-energy-demo

2. Install Python packages

pip install duckdb pandas matplotlib

3. Setup database

python scripts/python/00_setup_database.py

4. Run all ingestion scripts

python scripts/python/01_ingest_iso_to_region.py
python scripts/python/01_ingest_GDP.py
python scripts/python/01_ingest_population.py
python scripts/python/01_ingest_energy_data.py

5. SQL Transformation

A) Use Jupyter Notebook

pip install jupyterlab jupysql sqlalchemy duckdb duckdb-engine pandas
Ensure the Jupyter extension is enabled in VS Code
Run:

%load_ext sql
%sql duckdb:///../data/dev_worldtrend.duckdb

Now start each new code cell with %%sql to run SQL against the database directly from VS Code
See notebooks/sql_data_analytics.ipynb for examples

B) Use Azure Data Studio

Create a new notebook, select the Python 3 kernel, and connect to DuckDB (see Azure Data Studio Setup below)
Microsoft is retiring this tool soon, so consider option A instead

6. Open Power BI report

Open powerbi/Energy Report.pbip in Power BI Desktop
Data is loaded through a Python connection
Update <REPO_PATH> in every table to the location of your repo

7. (Optional) Export Samples for GenAI Prompt

Update config.py to define your OUTPUT_PATH
Run sql_draft_scripts/print_all_tables.py

🗂️ Project Structure

genai-s2t-energy-demo/
│
├── 📄 README.md                    # This file
├── 📄 config_template.py           # Template for output path config
├── 📄 config.py                    # Your personal config (gitignored)
├── 📄 .gitignore                   # Git ignore rules
│
├── 📁 data/
│   ├── 📁 raw/                     # Source CSV files
│   │   ├── energy_data.csv
│   │   ├── GDP.csv
│   │   ├── iso_to_region.csv
│   │   └── population.csv
│   └── 📄 dev_worldtrend.duckdb    # DuckDB database file
│
├── 📁 images/                      # Documentation images
│   └── data_model.jpg
│
├── 📁 notebooks/                   # Jupyter notebooks
│   └── sql_data_analytics.ipynb    # SQL analytics in notebook
│
├── 📁 scripts/
│   ├── 📁 python/                  # Python ingestion scripts
│   │   ├── DEV_setup_database.py
│   │   ├── 00_explore_*.py         # Data exploration scripts
│   │   ├── 01_ingest_*.py          # Data ingestion scripts
│   │   └── 📁 pbi_ingestion/       # Power BI data source scripts
│   │       ├── 01_silver_countries_and_regions.py
│   │       └── 02_silver_countries_all_data.py
│   └── 📁 sql/                     # SQL transformation scripts
│       ├── 01_silver_countries_and_regions.sql
│       └── 02_silver_countries_all_data.sql
│
├── 📁 powerbi/                     # Power BI Project
│   ├── Energy Report.pbip
│   ├── 📁 Energy Report.Report/    # Report visuals & pages
│   │   └── 📁 definition/pages/
│   │       └── 📁 visuals/
│   │           └── *.json          # Visual configurations
│   └── 📁 Energy Report.SemanticModel/
│       └── 📁 definition/tables/   # Data model tables (TMDL)
│           ├── Calendar.tmdl
│           ├── DIM_Countries_and_Regions.tmdl
│           ├── FCT__Countries_Gold_Data.tmdl
│           ├── bridge_year.tmdl
│           └── _Measures.tmdl
│
└── 📁 sql_draft_scripts/           # Working SQL scripts
    ├── copy_tmdl_files.py
    ├── print_all_tables.py
    ├── print_sample.py
    └── tables_display.py

🛠️ Setup Instructions

Update Path Placeholders

After cloning, find and replace the placeholder <REPO_PATH> with your local path in these files:

SQL files — scripts/sql/*.sql → connection strings
TMDL files — powerbi/Energy Report.SemanticModel/definition/tables/*.tmdl → Python script paths

Example:

Find: <REPO_PATH>
Replace: C:\\Users\\YourName\\genai-s2t-energy-demo

💡 Tip: In VS Code, use Ctrl+Shift+H to find and replace across all files.
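
If you prefer a script over the editor's find and replace, the same substitution can be sketched in a few lines of Python. The function name replace_placeholder is hypothetical. The two glob patterns mirror the file locations listed above; adjust them if your layout differs.

```python
from pathlib import Path

PLACEHOLDER = b"<REPO_PATH>"

# Locations named in this README that contain the placeholder.
DEFAULT_PATTERNS = (
    "scripts/sql/*.sql",
    "powerbi/Energy Report.SemanticModel/definition/tables/*.tmdl",
)


def replace_placeholder(
    repo_root: Path, new_path: str, patterns: tuple[str, ...] = DEFAULT_PATTERNS
) -> list[Path]:
    """Replace <REPO_PATH> in all matching files and return the files changed.

    Files are handled as bytes so line endings and encoding stay untouched.
    """
    changed = []
    for pattern in patterns:
        for file in sorted(repo_root.glob(pattern)):
            data = file.read_bytes()
            if PLACEHOLDER in data:
                file.write_bytes(data.replace(PLACEHOLDER, new_path.encode("utf-8")))
                changed.append(file)
    return changed


# Usage from the repo root; pass the path exactly as you would type it
# into the Replace box:
# replace_placeholder(Path("."), "<your local path>")
```

Run it once after cloning. A second run finds nothing to change, because the placeholder is already gone.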

(Optional) Configure Output Path

If you want to export sample data to a custom folder:

Copy config_template.py → config.py
Edit config.py:

OUTPUT_PATH = r'C:\Your\Custom\Output\Folder'
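
Because config.py is gitignored, a fresh clone will not have it. A script that reads OUTPUT_PATH can fall back to a default in that case. The sketch below shows one way to do this. The function load_output_path and the fallback folder are assumptions, not how the repo's scripts are written.

```python
import importlib.util
from pathlib import Path


def load_output_path(config_file: Path, default: Path) -> Path:
    """Return OUTPUT_PATH from config_file, or `default` if it is not defined."""
    if not config_file.exists():
        return default
    # Load the file as a throwaway module without touching sys.path.
    spec = importlib.util.spec_from_file_location("user_config", config_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return Path(getattr(module, "OUTPUT_PATH", default))


# Usage from the repo root ("output" is a hypothetical default folder):
# out_dir = load_output_path(Path("config.py"), Path("output"))
```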

🪟 Azure Data Studio Setup

Azure Data Studio can connect to DuckDB using Python notebooks.

Step 1: Create New Notebook

File → New Notebook → Kernel: Python

Step 2: Run Setup Cell

Copy this code into the first cell and run it:

# === SETUP: Run this cell first each session ===
import duckdb
import pandas as pd
from IPython.display import display, HTML

# Display settings
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

# Connection (read + write)
# ⚠️ UPDATE THIS PATH to your repo location
DB_PATH = r'<REPO_PATH>\data\DEV_WorldTrend.duckdb'
con = duckdb.connect(DB_PATH)

# Helper function for nice SQL display
def sql(query: str):
    result = con.execute(query).df()
    html = f"""
    <div style="overflow-x: auto; width: 100%;">
        {result.to_html(index=False)}
    </div>
    <p><b>{len(result)} rows</b></p>
    """
    display(HTML(html))
    return result

print("✅ Connected to DEV_WorldTrend.duckdb")

Step 3: Query Your Data

Now you can run SQL queries in new cells:

sql("SHOW ALL TABLES")

sql("SELECT * FROM bronze.iso_to_region LIMIT 10")

sql(
    """
    SELECT country, year, coal_consumption 
    FROM bronze.energy_data 
    WHERE iso_code = 'DEU'
    ORDER BY year DESC
    LIMIT 20
    """
)

Step 4: Close Connection (End of Session)

con.close()
print("🔒 Connection closed")

Important: Close the connection before committing to Git or refreshing Power BI, otherwise you'll get database locked errors.

▶️ Running Scripts

All scripts should be run from the repository root folder:

cd genai-s2t-energy-demo

# ✅ Correct
python scripts/python/01_ingest_population.py

# ❌ Wrong (don't run from inside scripts folder)
cd scripts/python
python 01_ingest_population.py
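
The scripts depend on the working directory because they use paths relative to the repo root. If you write your own scripts, one way to remove that dependency is to locate the root from the script's own location. In the sketch below, the function find_repo_root is hypothetical, and config_template.py serves as the marker because the project structure lists it at the top level.

```python
from pathlib import Path


def find_repo_root(start: Path, marker: str = "config_template.py") -> Path:
    """Walk upwards from `start` until a folder containing `marker` is found."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No folder containing {marker!r} above {start}")


# Usage inside a script under scripts/python/:
# REPO_ROOT = find_repo_root(Path(__file__).parent)
# DB_PATH = REPO_ROOT / "data" / "dev_worldtrend.duckdb"
```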

📈 Power BI Report

Opening the Report

Navigate to powerbi/Energy Report.pbip and open it in Power BI Desktop.

Data Refresh

Close any DuckDB connections (notebooks, scripts).
In Power BI: Home → Refresh.

Connection Issues?

If Power BI shows connection errors:

Check that <REPO_PATH> is replaced in TMDL files.
Ensure the DuckDB database exists: data/DEV_WorldTrend.duckdb.
Verify Python path in Power BI: File → Options → Python scripting.

🧰 Troubleshooting

"Database is locked" error

Cause: Another process has the DuckDB file open.
Fix:

Close Azure Data Studio notebooks
Close any Python scripts
Restart Power BI Desktop

"No module named 'duckdb'" error

Fix:

pip install duckdb

"No module named 'matplotlib'" error in Power BI

Fix:

pip install matplotlib

Power BI requires matplotlib even if your script doesn't use it.

Scripts fail with "file not found"

Cause: Running script from the wrong folder.
Fix: Always run from repo root:

cd genai-s2t-energy-demo
python scripts/python/your_script.py

Path issues after cloning

Fix: Search and replace <REPO_PATH> in all files:

VS Code: Ctrl+Shift+H
Find: <REPO_PATH>
Replace: C:\\Your\\Actual\\Path\\genai-s2t-energy-demo

📝 License

MIT License — see LICENSE file.

🤝 Contributing

This is a demo repository. Feel free to fork and adapt for your own use cases.
