🎧 Generate Transcript CLI

A .NET 8 file-based console application that transcribes audio files locally using Azure AI Foundry Local and formats the output using GitHub Copilot.

🚀 Features

- Local AI Transcription: Uses Foundry Local with Whisper models (no cloud required)
- Multiple Audio Formats: Supports MP3, WAV, M4A, AAC, FLAC, OGG, WMA
- Language Selection: Transcribe in Spanish, English, French, German, Portuguese, or Italian, or let the app auto-detect the language
- Smart Audio Chunking: Automatically splits long audio files for reliable transcription
- Multiple Output Formats:
  - Time-Range Format (default)
  - Zencastr Format
  - SRT (SubRip) Subtitles
- Interactive CLI: Terminal UI built with Spectre.Console
- Auto-detection: Finds audio files in the current directory

📦 Prerequisites

1. Install .NET 8 SDK

Download from: https://dotnet.microsoft.com/download/dotnet/8.0

dotnet --version

2. Install Azure AI Foundry Local

Follow the official Foundry Local documentation.

Verify installation:

foundry --version
foundry model list

3. Install FFmpeg (Required)

FFmpeg is used for audio format conversion and chunking:

Windows (winget):

winget install ffmpeg

Windows (Chocolatey):

choco install ffmpeg

macOS:

brew install ffmpeg

4. Install GitHub Copilot CLI (Optional)

For transcript formatting with AI:

# Install from: https://docs.github.com/en/copilot/github-copilot-in-the-cli
copilot auth login

▶️ Running the App

dotnet generatetranscript.cs

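To transcribe a specific audio file, place it in the same directory first so the app can auto-detect it, then run the same command: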

# Place your audio file in the same directory
dotnet generatetranscript.cs

🖥️ App Flow

1. Startup - Displays banner with Spectre.Console
2. Audio Selection - Auto-detects or prompts for audio file
3. Audio Processing - Converts to WAV and splits into 5-minute chunks
4. Language Selection - Choose audio language (Spanish, English, French, etc.)
5. Format Selection - Choose output format (Time-Range, Zencastr, SRT)
6. Transcription - Uses Foundry Local Whisper models via OpenAI SDK with language parameter
7. Formatting - Optionally formats with GitHub Copilot
8. Output - Saves transcript and shows preview
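
The chunking step in the flow can be pictured with a small sketch. This is illustrative Python, not the app's C# code; the function name `plan_chunks` and the constant `CHUNK_SECONDS` are hypothetical. It only computes where each 5-minute chunk would start and how long it would be, assuming the total duration is already known (for example from `ffprobe`).

```python
from typing import List, Tuple

CHUNK_SECONDS = 5 * 60  # 5-minute chunks, as described in the app flow


def plan_chunks(total_seconds: float,
                chunk_seconds: float = CHUNK_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, duration) pairs that cover the file without overlap."""
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("durations must be non-negative and chunk size positive")
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        # The last chunk is shorter when the file is not a multiple of 5 minutes.
        duration = min(chunk_seconds, total_seconds - start)
        chunks.append((start, duration))
        start += chunk_seconds
    return chunks
```

A 12-minute file, for instance, yields two full chunks and one 2-minute remainder.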

📝 Output Formats

Time-Range Format

00:00:00 - 00:00:15
Hello and welcome to the show.

00:00:15 - 00:00:30
Today we'll be discussing AI transcription.
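
As a rough sketch of how the layout above could be rendered, the following Python (not the app's C# code) turns `(start, end, text)` segments into Time-Range blocks. The helper names `hms` and `render_time_range` are hypothetical, and the segment tuples are an assumed input shape.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def hms(seconds: float) -> str:
    """Format seconds as HH:MM:SS, dropping any fractional part."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_time_range(segments: Iterable[Segment]) -> str:
    """Join segments as 'start - end' header lines followed by their text."""
    blocks = [f"{hms(start)} - {hms(end)}\n{text.strip()}"
              for start, end, text in segments]
    return "\n\n".join(blocks)
```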

Zencastr Format

00:00.00 Speaker 1: Hello and welcome to the show.
00:15.00 Speaker 1: Today we'll be discussing AI transcription.

SRT Format

1
00:00:00,000 --> 00:00:15,000
Hello and welcome to the show.

2
00:00:15,000 --> 00:00:30,000
Today we'll be discussing AI transcription.
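
The SRT layout above follows the usual SubRip convention: a 1-based cue index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text. A minimal Python sketch (not the app's C# code; `srt_timestamp` and `render_srt` are hypothetical names, and the `(start, end, text)` tuples are an assumed input shape) might look like this:

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT uses a comma before milliseconds)."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def render_srt(segments: Iterable[Segment]) -> str:
    """Render numbered cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}"
        )
    return "\n\n".join(cues) + "\n"
```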

🗂️ Supported Audio Formats

| Format | Extension |
| --- | --- |
| MP3 | `.mp3` |
| WAV | `.wav` |
| M4A | `.m4a` |
| AAC | `.aac` |
| FLAC | `.flac` |
| OGG | `.ogg` |
| WMA | `.wma` |
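
Auto-detection of audio files can be approximated by filtering the current directory on the extensions listed above. The sketch below is illustrative Python rather than the app's C# code; `is_supported_audio` and `find_audio_files` are hypothetical names, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path
from typing import List

# Extensions taken from the supported-formats table.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".flac", ".ogg", ".wma"}


def is_supported_audio(filename: str) -> bool:
    """True when the file extension is in the supported list (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def find_audio_files(directory: str = ".") -> List[Path]:
    """Return supported audio files directly inside `directory`, sorted by name."""
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and is_supported_audio(path.name)
    )
```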

🔧 Available Whisper Models

The app automatically selects the best available model:

| Model | Size | Best For |
| --- | --- | --- |
| whisper-large-v3-turbo | ~3 GB | Best quality, multilingual |
| whisper-medium | ~1.5 GB | Good balance |
| whisper-small | ~500 MB | Faster processing |
| whisper-base | ~150 MB | Quick transcription |
| whisper-tiny | ~75 MB | Testing only |

Download a model:

foundry model download whisper-medium
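
One plausible way to "select the best available model" is to walk a preference list and take the first entry that is installed. The Python below is a sketch under that assumption, not the app's C# logic: the order simply mirrors the table from top to bottom, `pick_model` is a hypothetical name, and the model identifiers reported by Foundry Local may differ from the short names used here.

```python
from typing import Iterable, Optional

# Assumed preference order: the table above, best quality first.
MODEL_PREFERENCE = [
    "whisper-large-v3-turbo",
    "whisper-medium",
    "whisper-small",
    "whisper-base",
    "whisper-tiny",
]


def pick_model(available: Iterable[str]) -> Optional[str]:
    """Return the most preferred model that is available, or None if none match."""
    available_set = set(available)
    for name in MODEL_PREFERENCE:
        if name in available_set:
            return name
    return None
```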

⚠️ Known Limitations

- Long Audio Files: Foundry Local Whisper may stop early on very long files. The app works around this by automatically splitting audio into 5-minute chunks.
- Music Detection: Files that start with music may cause issues. Chunking helps mitigate this.
- Language Support: When a language is specified, the app calls Foundry Local's web service API through the OpenAI SDK so that the audio is transcribed in its original language rather than translated. Larger models (medium, large-v3-turbo) work better for non-English audio.
- No Timestamps: The current Foundry Local SDK does not return word-level timestamps.
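
To make the language handling concrete, here is a hedged Python sketch (the app itself is C#) of how a language choice might be mapped to the ISO 639-1 code that OpenAI-compatible transcription endpoints accept in their `language` parameter. The names `LANGUAGE_CODES` and `build_transcription_kwargs` are hypothetical, and omitting the parameter for auto-detect is an assumption about the intended behavior.

```python
from typing import Any, Dict, Optional

# ISO 639-1 codes for the languages listed in this README; None means auto-detect.
LANGUAGE_CODES: Dict[str, Optional[str]] = {
    "Spanish": "es",
    "English": "en",
    "French": "fr",
    "German": "de",
    "Portuguese": "pt",
    "Italian": "it",
    "Auto-detect": None,
}


def build_transcription_kwargs(model: str, audio_file: Any,
                               language_name: str) -> Dict[str, Any]:
    """Build keyword arguments for a transcription request.

    The `language` key is left out entirely for auto-detect so the model
    decides on its own.
    """
    code = LANGUAGE_CODES[language_name]
    kwargs: Dict[str, Any] = {"model": model, "file": audio_file}
    if code is not None:
        kwargs["language"] = code
    return kwargs
```

With the OpenAI Python SDK, these arguments could be passed as `client.audio.transcriptions.create(**kwargs)` to a client whose `base_url` points at the local Foundry endpoint; the endpoint address depends on the local installation and is not specified here.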

🐛 Troubleshooting

"Transcription returned empty result"

- Ensure FFmpeg is installed and on your PATH
- Try a different Whisper model: foundry model download whisper-medium
- Check that the audio file is valid: ffprobe yourfile.mp3
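
A quick preflight check for the tools mentioned above can rule out PATH problems before digging further. This is a small Python sketch, separate from the app; `missing_tools` is a hypothetical helper, and the default tool list is an assumption based on the commands used in this README.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("ffmpeg", "ffprobe", "foundry")) -> List[str]:
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```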

"Cannot detect audio stream format"

- Convert audio to WAV manually: ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
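
The same conversion can be scripted. The Python sketch below (not part of the app) builds the FFmpeg argument list for a 16 kHz mono WAV, optionally restricted to a time window with the standard `-ss` and `-t` options. The function name `ffmpeg_wav_command` is hypothetical, and adding `-y` to overwrite existing output is a choice made for this example.

```python
from typing import List, Optional


def ffmpeg_wav_command(src: str, dst: str,
                       start: Optional[float] = None,
                       duration: Optional[float] = None) -> List[str]:
    """Build an ffmpeg command that writes 16 kHz mono audio to `dst`."""
    cmd = ["ffmpeg", "-y"]  # -y: overwrite the output file without asking
    if start is not None:
        cmd += ["-ss", str(start)]    # seek to the start offset (seconds)
    if duration is not None:
        cmd += ["-t", str(duration)]  # limit how much audio is read (seconds)
    cmd += ["-i", src, "-ar", "16000", "-ac", "1", dst]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list avoids shell quoting problems with file names that contain spaces.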

Copilot formatting not working

- Verify authentication: copilot auth status
- The app falls back to the raw transcript if Copilot is unavailable
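
The fallback behavior can be sketched as a wrapper that returns the raw transcript whenever formatting fails or produces nothing. This is illustrative Python, not the app's C# code; `format_with_fallback` is a hypothetical name, and the formatter is modeled as a plain callable rather than an actual Copilot call.

```python
from typing import Callable, Optional


def format_with_fallback(raw: str,
                         formatter: Optional[Callable[[str], str]] = None) -> str:
    """Return formatted text, or the raw transcript if formatting is unavailable."""
    if formatter is None:
        return raw  # no formatter configured
    try:
        formatted = formatter(raw)
    except Exception:
        return raw  # formatter failed, keep the raw transcript
    # An empty or whitespace-only result is treated as a failure too.
    return formatted if formatted and formatted.strip() else raw
```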

📄 License

MIT License - See LICENSE file for details.

👤 Author

Bruno Capuano

Happy transcribing 🎙️
