Implicit neural representations (INRs) have proven effective in various tasks including image, shape, audio, and video reconstruction. These INRs typically learn the implicit field from sampled input points, often using a single network for the entire domain, which imposes many global constraints on a single function. In this paper, we propose a mixture of experts (MoE) implicit neural representation approach that learns local piece-wise continuous functions, simultaneously learning to subdivide the domain and to fit each subdomain locally. We show that incorporating a mixture of experts architecture into existing INR formulations provides a boost in speed, accuracy, and memory requirements. Additionally, we introduce novel conditioning and pretraining methods for the gating network that improve convergence to the desired solution. We evaluate the effectiveness of our approach on multiple reconstruction tasks, including surface reconstruction, image reconstruction, and audio signal reconstruction, and show improved performance compared to non-MoE methods.
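To make the MoE INR idea concrete, here is a minimal sketch, assuming a manager (gating) network that maps each input coordinate to a distribution over a small set of expert MLPs, whose outputs are then combined. This is an illustrative toy implementation, not the paper's exact architecture; the names `Expert`, `Manager`, and `MoEINR` are assumptions for this example.

```python
# Minimal PyTorch sketch of a mixture-of-experts INR (illustrative only).
import torch
import torch.nn as nn


class Expert(nn.Module):
    """Small coordinate MLP; one expert fits one region of the domain."""
    def __init__(self, in_dim=2, hidden=128, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)


class Manager(nn.Module):
    """Gating network: maps a coordinate to a distribution over experts."""
    def __init__(self, in_dim=2, hidden=64, num_experts=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_experts),
        )

    def forward(self, x):
        return torch.softmax(self.net(x), dim=-1)  # (N, num_experts)


class MoEINR(nn.Module):
    def __init__(self, in_dim=2, out_dim=1, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            [Expert(in_dim, out_dim=out_dim) for _ in range(num_experts)]
        )
        self.manager = Manager(in_dim, num_experts=num_experts)

    def forward(self, x):
        gate = self.manager(x)                                  # (N, E)
        preds = torch.stack([e(x) for e in self.experts], -1)   # (N, out, E)
        # Soft combination during training; at inference one could instead
        # pick the argmax expert for a hard, piece-wise assignment.
        return (preds * gate.unsqueeze(1)).sum(-1)              # (N, out)


# Toy usage: fit a 2D coordinate field f(x, y) -> scalar value.
coords = torch.rand(1024, 2) * 2 - 1                       # coords in [-1, 1]^2
targets = torch.sin(3.14 * coords).sum(-1, keepdim=True)   # toy signal
model = MoEINR(in_dim=2, out_dim=1, num_experts=4)
loss = nn.functional.mse_loss(model(coords), targets)
loss.backward()
```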
Our method noticeably captures fine details (e.g. in the toes, nostrils, and eye of the Thai statue). The expert selection (provided by the manager) provides some level of segmentation by subdividing space. The error colormap shows that our reconstruction produces small errors (lighter indicates higher distance to the ground truth surface).
Qualitative (left) and quantitative (right) results, showing the image reconstruction (top), gradients (middle), and Laplacian (bottom) for (a) GT, (b) SoftPlus, (c) SoftPlus Wider, (d) Our SoftPlus MoE, (e) SIREN, (f) SIREN Wider, (g) Naive MoE, and (h) Our SIREN Neural Experts. The quantitative results (right) report PSNR as training progresses and show that our Neural Experts architecture with Sine activations outperforms all baselines.
Two-speaker audio reconstruction. Within each waveform block, the rows represent the ground truth, reconstruction, and error visualization from top to bottom. For our Neural Experts we color code the different experts on the reconstructed waveform.
The manager network provides control over the learned representation. It essentially assigns different parts of the coordinate space to different sub-networks (experts). A change in the manager's assignment can accommodate sharp discontinuities in the data. The manager can also be pretrained using some prior (e.g. segmentation). Our experiments show that a random manager initialization performs best, as the underlying segmentation emerges naturally from the training process. Manager pretraining. Visualizing the experts selected by the manager after the pretraining stage (top row) and after the full network training (bottom row) for different pretraining ablations.
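As a rough illustration of what pretraining the manager toward a segmentation prior could look like, the sketch below fits the gating network to per-coordinate segment labels before joint training. This is a hypothetical example under the assumptions of the earlier sketch (the `Manager` outputs softmax probabilities), not the paper's pretraining procedure, and the paper finds that random initialization already works well.

```python
# Hypothetical manager pretraining against a segmentation prior (sketch only).
import torch
import torch.nn as nn


def pretrain_manager(manager, coords, seg_labels, steps=500, lr=1e-3):
    """Fit the gating network so expert assignments match seg_labels.

    coords:     (N, in_dim) sample coordinates
    seg_labels: (N,) integer segment/expert index per coordinate
    """
    opt = torch.optim.Adam(manager.parameters(), lr=lr)
    nll = nn.NLLLoss()  # expects log-probabilities and class indices
    for _ in range(steps):
        opt.zero_grad()
        probs = manager(coords)                     # (N, num_experts), softmax output
        loss = nll(torch.log(probs + 1e-8), seg_labels)
        loss.backward()
        opt.step()
    return manager
```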
@inproceedings{ben2024neuralexperts,
title={Neural Experts: Mixture of Experts for Implicit Neural Representations},
author={Ben-Shabat, Yizhak and Hewa Koneputugodage, Chamin and Ramasinghe, Sameera and Gould, Stephen},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2024}
}