Lightweight Implicit Neural Network
for Binaural Audio Synthesis

Xikun Lu, Fang Liu, Weizhi Shi, Jinqiu Sang*
Shanghai Institute of Artificial Intelligence for Education, East China Normal University, China
School of Computer Science and Technology, East China Normal University, China
Department of Big Data and Information Engineering, Guizhou Industry Polytechnic College, China
* Corresponding author: jqsang@mail.ecnu.edu.cn
ICASSP 2026 (under review)
Please wear headphones when listening.

Abstract

High-fidelity binaural audio synthesis is crucial for immersive experiences, but existing methods require extensive computational resources, which limits their on-device application. To address this, we propose the Lightweight Implicit Neural Network (LINN), a novel two-stage framework. LINN first generates initial estimates using time-domain warping; these estimates are then refined by an Implicit Binaural Corrector (IBC) module. IBC is an implicit neural network that predicts amplitude and phase corrections directly in the coordinate system, resulting in a highly compact model architecture. Experimental results show that LINN achieves perceptual quality statistically comparable to that of the best-performing baseline model while significantly improving computational efficiency. Compared to the most efficient existing method, our model has 3.6 times fewer parameters and requires significantly fewer multiply-accumulate operations (MACs). This demonstrates that our approach effectively addresses the trade-off between synthesis quality and computational efficiency, providing a new solution for high-fidelity on-device spatial audio applications.
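
The two-stage idea described in the abstract (a time-domain warp followed by a learned amplitude and phase correction) can be illustrated with a minimal NumPy sketch. This is not the LINN implementation: the abstract does not specify the warping method, the network inputs, or the layer sizes. The function names `warp`, `tiny_corrector`, and `synthesize`, the geometric delay model, the choice of (source position, normalized frequency) as coordinates, and the single-hidden-layer MLP are all assumptions made for illustration only.

```python
import numpy as np

def warp(mono, delay_samples):
    """Stage 1 (assumed form): delay a mono signal by a possibly
    fractional number of samples using linear interpolation."""
    n = np.arange(len(mono))
    return np.interp(n - delay_samples, n, mono, left=0.0, right=0.0)

def tiny_corrector(coords, w1, b1, w2, b2):
    """Stage 2 (hypothetical): a one-hidden-layer coordinate MLP that maps
    each coordinate row to a (log-gain, phase-shift) pair."""
    hidden = np.tanh(coords @ w1 + b1)
    return hidden @ w2 + b2

def synthesize(mono, src_pos, ear_pos, params, sr=48000, c=343.0):
    """Warp the mono input for one ear, then apply per-frequency
    amplitude and phase corrections predicted from coordinates."""
    # Assumed delay model: straight-line propagation time in samples.
    delay = np.linalg.norm(src_pos - ear_pos) / c * sr
    estimate = warp(mono, delay)

    # Build one coordinate row per frequency bin:
    # (x, y, z of the source, normalized frequency in [0, 1]).
    spec = np.fft.rfft(estimate)
    freqs = np.linspace(0.0, 1.0, len(spec))
    coords = np.column_stack([np.tile(src_pos, (len(spec), 1)), freqs])

    # Apply the predicted corrections as a complex multiplicative mask.
    out = tiny_corrector(coords, *params)
    log_gain, phase_shift = out[:, 0], out[:, 1]
    corrected = spec * np.exp(log_gain + 1j * phase_shift)
    return np.fft.irfft(corrected, n=len(mono))
```

With all weights set to zero, the corrector predicts zero log-gain and zero phase shift, so `synthesize` returns the warped estimate unchanged. In this sketch the second stage only ever adjusts the first-stage output, which is one way a corrector can stay small.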

Samples from Binaural Speech Datasets