
Structure Aware Neural Architecture Search for Mixture of Experts


Author Petr Babkin
Advisor Oleg Bakhteev, PhD

Assets

Abstract

The Mixture-of-Experts (MoE) layer, a sparsely activated neural architecture controlled by a routing mechanism, has recently achieved remarkable success across large-scale deep learning tasks. In parallel, Neural Architecture Search (NAS) has emerged as a powerful methodology for automatically discovering high-performing neural networks. However, the application of NAS methods to MoE architectures remains an underexplored research area. In this work, we propose an architecture search framework for MoE models that explicitly leverages the underlying cluster structure of the data. We evaluate the proposed approach on computer vision benchmarks and demonstrate that it outperforms baseline MoE architectures trained on the same datasets in terms of accuracy and computational efficiency.
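To illustrate the sparse activation that the abstract refers to, the sketch below shows top-k expert routing in plain NumPy. This is a minimal illustration, not code from this repository; all names (`moe_forward`, `gate_w`, `expert_ws`) and the choice of k=2 are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, expert_ws, k=2):
    """Sparsely activated MoE layer: each token is processed only by
    its top-k experts, weighted by a softmax over the routing scores."""
    logits = x @ gate_w                          # (tokens, n_experts) routing scores
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts per token
    sel = np.take_along_axis(logits, topk, axis=-1)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)           # softmax over the selected experts only
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                  # combine the k expert outputs per token
        for i in range(k):
            e = topk[t, i]
            out[t] += w[t, i] * (x[t] @ expert_ws[e])
    return out

d, n_experts, tokens = 8, 4, 5
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))              # router weights
expert_ws = rng.normal(size=(n_experts, d, d))        # one linear expert per slot
y = moe_forward(x, gate_w, expert_ws)
print(y.shape)  # (5, 8)
```

Only k of the n_experts weight matrices touch each token, which is what makes MoE compute scale with k rather than with the total number of experts.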

Citation

If you find our work helpful, please cite us.

@article{babkin2025structure,
    title={Structure Aware Neural Architecture Search for Mixture of Experts},
    author={Petr Babkin and Oleg Bakhteev},
    year={2025}
}

Licence

Our project is MIT licensed. See LICENSE for details.