A novel implementation of a denoising autoencoder trained within a generative adversarial network (GAN) framework, specifically designed for audio signal denoising tasks.
This repository presents an ultra-lightweight hybrid denoising architecture combining autoencoder and GAN methodologies. The implementation features a generator serving as a primary denoising component, with highly efficient parameter utilization: 282,177 parameters for the generator and 10,577 parameters for the discriminator. This efficient design enables training on standard consumer hardware without requiring specialized GPU capabilities.
Due to computational constraints, the current implementation is optimized for signals sampled at 256 samples/second. It processes 1-second signals (256 samples each) composed of three randomly generated sine waves within the 10Hz - 64Hz frequency range. The model addresses Gaussian white noise contamination across the full 0Hz - 128Hz range, that is, up to the Nyquist frequency of the sampled signal.
Comparative analysis against traditional lowpass filtering demonstrates complementary strengths. Lowpass filters excel at eliminating high-frequency noise, but they cannot remove noise that falls inside the 10Hz - 64Hz band occupied by the generated sines. Conversely, our lightweight denoiser performs better at reducing noise within the signal frequency range, although it does not fully eliminate high-frequency noise.
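To make the lowpass limitation concrete, here is a minimal sketch of an ideal (brick-wall) FFT lowpass filter in NumPy. The 64 Hz cutoff, the function name `lowpass_fft`, and the FFT-based design are assumptions for illustration. The filter used for the comparison in this repository may be designed differently.

```python
import numpy as np

FS = 256  # samples per second, as stated above


def lowpass_fft(signal, cutoff_hz=64.0, fs=FS):
    """Zero every frequency bin above cutoff_hz (cutoff is an assumed value)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


t = np.arange(FS) / FS
in_band = np.sin(2 * np.pi * 20 * t)       # stands in for signal or in-band noise
out_of_band = np.sin(2 * np.pi * 100 * t)  # stands in for high-frequency noise
filtered = lowpass_fft(in_band + out_of_band)
```

The 100 Hz component is removed, while anything at or below the cutoff, including noise that overlaps the signal band, passes through unchanged.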
The training data consists of:
- Clean Signals:
  - Three-component sinusoidal compositions
  - Frequency range: 10Hz - 64Hz (uniform random distribution)
  - Design ensures minimal training-validation set overlap
- Noise Generation:
  - Gaussian white noise (GWN)
  - Parameters: μ = 0, σ = 1.0
  - 256 samples per signal
- Signal Combination:
  - Mixed signals generated at SNR = -2 dB
  - Paired clean-mixed signal combinations for reconstruction loss computation
- Dataset Distribution:
  - Total samples: 4500
  - Training set: 4000 samples
  - Validation set: 500 samples
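The data recipe above can be sketched in a few lines of NumPy. This is not the code in create_data.py. The function name `make_pair`, the unit amplitudes, the random phases, and the choice to rescale the noise (rather than the signal) to reach the target SNR are all assumptions, and the SNR of -2 is interpreted as decibels.

```python
import numpy as np

FS = 256          # samples per second; one example lasts 1 second
N_COMPONENTS = 3  # sine waves per clean signal
SNR_DB = -2.0     # target SNR of the mixed signal, read as dB


def make_pair(rng, snr_db=SNR_DB):
    """Return (clean, mixed) arrays for one 1-second example."""
    t = np.arange(FS) / FS
    freqs = rng.uniform(10.0, 64.0, size=N_COMPONENTS)
    # Unit amplitudes and random phases are assumptions of this sketch.
    phases = rng.uniform(0.0, 2 * np.pi, size=N_COMPONENTS)
    clean = np.sin(2 * np.pi * freqs[:, None] * t + phases[:, None]).sum(axis=0)
    noise = rng.normal(0.0, 1.0, size=FS)  # GWN with mu = 0, sigma = 1.0
    # Scale the noise so that 10*log10(P_clean / P_noise) equals snr_db.
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean, clean + scale * noise


rng = np.random.default_rng(0)
clean, mixed = make_pair(rng)
achieved_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean((mixed - clean) ** 2))
```

Calling `make_pair` 4000 times for training and 500 times for validation would reproduce the dataset sizes listed above.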
Sample visualization of generated data across different sets:
The generator implements an autoencoder architecture adapted from established signal processing methodologies, optimized for the denoising task.
The discriminator employs a compact architecture designed to differentiate between clean signals and generator outputs effectively.
The training protocol implements a hybrid approach:
- Discriminator Training:
  - Standard GAN methodology
  - BCE loss on the discriminator's classification of clean signals (real) versus denoised generator outputs (fake)
- Generator Training:
  - Hybrid loss function combining:
    - Adversarial loss (BCE-based discriminator deception metric)
    - Reconstruction loss (L1 norm between clean and generated signals)
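The two objectives can be written out framework-independently. The sketch below uses plain NumPy on already-computed discriminator probabilities. The function names and especially `recon_weight` are hypothetical, since the weighting between the adversarial and reconstruction terms is not stated here.

```python
import numpy as np


def bce(pred, target, eps=1e-7):
    """Binary cross-entropy averaged over the batch; pred holds probabilities."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))


def discriminator_loss(d_clean, d_denoised):
    """Clean signals are labelled 1 (real), generator outputs 0 (fake)."""
    return bce(d_clean, np.ones_like(d_clean)) + bce(d_denoised, np.zeros_like(d_denoised))


def generator_loss(d_denoised, denoised, clean, recon_weight=100.0):
    """Adversarial term plus weighted L1 term; recon_weight is an assumed value."""
    adversarial = bce(d_denoised, np.ones_like(d_denoised))  # reward fooling D
    reconstruction = float(np.mean(np.abs(denoised - clean)))
    return adversarial + recon_weight * reconstruction


# An undecided discriminator (p = 0.5) and a perfect reconstruction:
g_loss = generator_loss(np.array([0.5]), np.zeros(4), np.zeros(4))
d_loss = discriminator_loss(np.array([0.5]), np.array([0.5]))
```

With a perfect reconstruction the L1 term vanishes and only the adversarial term, ln 2 for an undecided discriminator, remains.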
Training visualization:
Training parameters:
- Initial generator training: 40 epochs
- Combined training: Maximum 200 epochs
- Early stopping patience: 50 epochs
- Early stopping triggered: Epoch 151
- Optimal model weights: Epoch 101 (based on validation metrics)
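The relationship between the patience value, the best epoch, and the stopping epoch follows from the usual early-stopping rule: training halts once 50 epochs pass without a new best validation result (101 + 50 = 151). A minimal sketch, assuming validation loss is the monitored quantity and using a synthetic loss curve rather than logged values:

```python
def run_early_stopping(val_losses, patience=50):
    """Return (best_epoch, stop_epoch) using 1-based epoch numbers."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # A real training loop would save the model weights here.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses)


# Synthetic curve: improves through epoch 101, then plateaus at a higher value.
losses = [1.0 - 0.005 * e for e in range(1, 102)] + [0.6] * 99
best_epoch, stop_epoch = run_early_stopping(losses)
```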
The model demonstrates strong denoising capabilities:
- Reconstruction Accuracy:
  - Reconstruction Loss: 0.0304 (MSE)
  - Indicates excellent signal reconstruction (scale 0-1, lower is better)
  - Demonstrates the model's ability to preserve signal integrity
- Noise Reduction:
  - SNR Improvement: 7.2255 dB
  - Represents approximately 5.3x reduction in noise power
  - Significant improvement in signal clarity
- Signal Preservation:
  - Signal Distortion: 7.8965
  - Indicates some modification of original signal components
  - Represents the trade-off between noise reduction and signal preservation
The model balances noise reduction against signal reconstruction, at the cost of some modification of the original signal. This trade-off is typical in denoising applications, where aggressive noise removal can alter original signal characteristics.
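The "approximately 5.3x" figure follows directly from converting the 7.2255 dB SNR improvement to a linear power ratio, under the assumption that the signal power itself is unchanged. The sketch below shows that conversion together with one common way of computing SNR. The helper names are illustrative, and the repository's own metric code may differ.

```python
import math


def db_to_power_ratio(db):
    """Convert a decibel value to a linear power ratio."""
    return 10 ** (db / 10)


def snr_db(clean, estimate):
    """SNR in dB of `estimate`, treating its deviation from `clean` as noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((e - c) ** 2 for c, e in zip(clean, estimate))
    return 10 * math.log10(signal_power / noise_power)


noise_power_reduction = db_to_power_ratio(7.2255)  # roughly 5.3

# SNR improvement would then be computed per example as:
#   snr_db(clean, denoised) - snr_db(clean, mixed)
example_snr = snr_db([1.0, 1.0], [1.1, 1.1])
```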
Validation set performance examples:
Representative output comparisons (clean, mixed, denoised, and low-pass filtered signals):
Performance observations:
- Optimal Performance (Examples 4, 5, 6, 8):
  - Near-perfect signal reconstruction
  - Minimal residual noise
  - Preserved component magnitudes
- Moderate Success (Examples 1, 2, 7):
  - Component magnitude variations
  - Residual frequency artifacts
- Suboptimal Cases (Example 3):
  - Signal collapse to single-frequency component
The implementation is optimized for standard laptop configurations, with a typical training duration of 10-15 minutes for 300 epochs using the current architectures.
- Directory structure:
GAN-Denoiser
├── data
│   ├── resampled
│   ├── train
│   │   ├── clean
│   │   ├── mixed
│   │   └── noise
│   └── validation
│       ├── clean
│       ├── mixed
│       └── noise
└── ...
- Installation:
  - Install dependencies from requirements.txt
- Data Generation:
  - Execute create_data.py
  - Verify data generation through displayed random signal plots
- Dataset Validation:
  - Run dataset.py for system compatibility verification
- Model Training:
  - Execute train_gan.py
  - Adjust hyperparameters as needed:
    - Generator architecture
    - Discriminator architecture
    - Data creation parameters
    - Training parameters
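If the `data` folders from the directory structure above need to be prepared by hand, a short `pathlib` sketch is enough. Whether create_data.py already creates these folders is not stated here, so treat `make_data_dirs` as a hypothetical helper rather than part of the repository. The example writes into a temporary directory to avoid touching a real checkout.

```python
import tempfile
from pathlib import Path

# Sub-folders of data/ taken from the directory structure listed above.
SUBDIRS = ["resampled"] + [
    f"{split}/{kind}"
    for split in ("train", "validation")
    for kind in ("clean", "mixed", "noise")
]


def make_data_dirs(root):
    """Create the data/ layout under root and return the created paths."""
    created = []
    for sub in SUBDIRS:
        path = Path(root) / "data" / sub
        path.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
        created.append(path)
    return created


with tempfile.TemporaryDirectory() as tmp:
    created = make_data_dirs(tmp)
    names = sorted(p.relative_to(tmp).as_posix() for p in created)
    all_exist = all(p.is_dir() for p in created)
```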
- MathWorks. "Denoise Signals with Generative Adversarial Networks." Signal Processing Toolbox Documentation. This implementation draws significant inspiration from their architectural and training methodologies.
- Zhang et al. (2024). "Deep Feature Loss for Signal Denoising." Advanced Engineering Informatics. This work influenced our approach to CNN-based denoising strategies.












