
Structure Aware Neural Architecture Search for Mixture of Experts


Author Petr Babkin
Advisor Oleg Bakhteev, PhD

Assets

Abstract

The Mixture-of-Experts (MoE) layer, a sparsely activated neural architecture controlled by a routing mechanism, has recently achieved remarkable success across large-scale deep learning tasks. In parallel, Neural Architecture Search (NAS) has emerged as a powerful methodology for automatically discovering high-performing neural networks. However, the application of NAS methods to MoE architectures remains an underexplored research area. In this work, we propose an architecture search framework for MoE models that explicitly leverages the underlying cluster structure of the data. We evaluate the proposed approach on computer vision benchmarks and demonstrate that it outperforms baseline MoE architectures trained on the same datasets in terms of accuracy and computational efficiency.
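To illustrate the sparse activation that the abstract refers to, the sketch below shows top-k expert routing in plain NumPy. This is a minimal illustration, not code from this repository; all names (`moe_forward`, `gate_w`, `expert_ws`) and the choice of k=2 are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, expert_ws, k=2):
    """Sparsely activated MoE layer: each token is processed only by
    its top-k experts, weighted by a softmax over the routing scores."""
    logits = x @ gate_w                          # (tokens, n_experts) routing scores
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts per token
    sel = np.take_along_axis(logits, topk, axis=-1)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)           # softmax over the selected experts only
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                  # combine the k expert outputs per token
        for i in range(k):
            e = topk[t, i]
            out[t] += w[t, i] * (x[t] @ expert_ws[e])
    return out

d, n_experts, tokens = 8, 4, 5
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))              # router weights
expert_ws = rng.normal(size=(n_experts, d, d))        # one linear expert per slot
y = moe_forward(x, gate_w, expert_ws)
print(y.shape)  # (5, 8)
```

Only k of the n_experts weight matrices touch each token, which is what makes MoE compute scale with k rather than with the total number of experts.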

Citation

If you find our work helpful, please cite us.

@article{babkin2025structure,
    title={Structure Aware Neural Architecture Search for Mixture of Experts},
    author={Petr Babkin and Oleg Bakhteev},
    year={2025}
}

Licence

Our project is MIT licensed. See LICENSE for details.