refactor(app): normalize all recorded audio to 16kHz at capture time #50
Merged
What: App and mic audio are now resampled to 16kHz when saved, instead of preserving the native device sample rate. AudioConstants.targetSampleRate (app) and speechSampleRate (audiotap) each give their target a single named constant for that rate. resampleFile() fast-paths with a file copy when the source is already at 16kHz.
Reasoning:
- Problem: Mickey Mouse / pitch artifacts when the app output device runs at a rate other than 48kHz (e.g. 44100, 96000), or when the app and mic tracks had mismatched rates that were silently mixed at the wrong ratio.
- Decision: resample to 16kHz at the source: MicCaptureHandler via the existing AVAudioConverter, DualSourceRecorder via AudioMixer.resample() before saveWAV(). All WAVs on disk are now always 16kHz, so the pipeline's resampleFile() calls become no-ops.
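The "named constant plus fast path" idea above can be sketched in a few lines. This is an illustration only, not the code from this PR: AudioConstantsSketch and resampleSketch are hypothetical names, the sketch works on an in-memory sample array rather than a WAV file, and it uses plain linear interpolation with no anti-aliasing filter, whereas the PR relies on AVAudioConverter and AudioMixer.resample().

```swift
import Foundation

// Hypothetical stand-in for the PR's named constant (not the real AudioConstants type).
enum AudioConstantsSketch {
    static let targetSampleRate: Double = 16_000
}

/// Resamples mono PCM samples by linear interpolation.
/// Assumption: this is a simplified illustration; it does no low-pass filtering,
/// so downsampling real audio this way would alias.
func resampleSketch(_ samples: [Float],
                    from sourceRate: Double,
                    to targetRate: Double = AudioConstantsSketch.targetSampleRate) -> [Float] {
    // Fast path: already at the target rate, so return the input untouched
    // (the in-memory analogue of copying the file instead of converting it).
    if sourceRate == targetRate || samples.isEmpty { return samples }

    let ratio = sourceRate / targetRate                     // source frames per output frame
    let outCount = Int((Double(samples.count) / ratio).rounded(.down))
    var out = [Float](repeating: 0, count: outCount)

    for i in 0..<outCount {
        let pos = Double(i) * ratio                         // fractional position in the source
        let lo = min(Int(pos), samples.count - 1)
        let hi = min(lo + 1, samples.count - 1)
        let frac = Float(pos - Double(lo))
        out[i] = samples[lo] * (1 - frac) + samples[hi] * frac
    }
    return out
}
```

With one second of 48kHz input, resampleSketch(input, from: 48_000) yields 16,000 samples, while input that is already at 16kHz comes back unchanged.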
Summary
- AudioConstants.targetSampleRate (app target) and speechSampleRate (audiotap package) replace all hardcoded 16000 literals
- resampleFile() fast-paths with a file copy when the source is already at 16kHz (pipeline calls become no-ops)
Problem
Mickey Mouse / pitch artifacts occurred when the system output device ran at a rate other than 48kHz (e.g. 44100Hz on some Mac/AirPods setups), or when the app and mic tracks had mismatched sample rates that were silently mixed at the wrong ratio.
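To make the size of the artifact concrete, the arithmetic behind a sample-rate mismatch can be worked through directly. The helper names below are hypothetical and the rate pair is only an example; it is not a claim about exactly where the app treated audio as being at the wrong rate.

```swift
import Foundation

/// Pitch shift, in semitones, heard when audio captured at `actualRate`
/// is interpreted as if it were at `assumedRate`. Positive means higher pitch.
func pitchShiftSemitones(actualRate: Double, assumedRate: Double) -> Double {
    12 * log2(assumedRate / actualRate)
}

/// Duration, in seconds, that `frames` samples occupy when played at `assumedRate`.
func playbackDuration(frames: Int, assumedRate: Double) -> Double {
    Double(frames) / assumedRate
}

// Example: 10 seconds captured at 44.1kHz (441,000 frames) but treated as 48kHz.
let shift = pitchShiftSemitones(actualRate: 44_100, assumedRate: 48_000)  // about +1.47 semitones
let seconds = playbackDuration(frames: 441_000, assumedRate: 48_000)      // 9.1875 s instead of 10 s
```

The same frames play back faster and higher whenever the assumed rate exceeds the real one, which is why fixing every saved file to a single known rate removes this class of bug.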
Test plan
- swift test
- swift test --filter E2E
- ./scripts/run_app.sh