Mathur, A., Isopoussu, A., Kawsar, F., Berthouze, N. and Lane, N. D. (2019) Mic2Mic: using cycle-consistent generative adversarial networks to overcome microphone variability in speech systems. In: Eskicioglu, R., Mottola, L. and Priyantha, B. (eds.) IPSN '19: Proceedings of the 18th International Conference on Information Processing in Sensor Networks, pp. 169-180. New York, NY, USA: Association for Computing Machinery (ACM).
Abstract
Mobile and embedded devices are increasingly using microphones and audio-based computational models to infer user context. A major challenge in building systems that combine audio models with commodity microphones is to guarantee their accuracy and robustness in the real world. Besides many environmental dynamics, a primary factor that impacts the robustness of audio models is microphone variability. In this work, we propose Mic2Mic, a machine-learned system component that resides in the inference pipeline of audio models and reduces, in real time, the variability in audio data caused by microphone-specific factors. Two key considerations in the design of Mic2Mic were: a) to decouple the problem of microphone variability from the audio task, and b) to place a minimal burden on end users to provide training data. With these in mind, we apply the principles of cycle-consistent generative adversarial networks (CycleGANs) to learn Mic2Mic using unlabeled and unpaired data collected from different microphones. Our experiments show that Mic2Mic can recover between 66% and 89% of the accuracy lost due to microphone variability for two common audio tasks.
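The core idea described in the abstract, learning a translation between microphone domains from unpaired recordings via a cycle-consistency constraint, can be sketched in a few lines of PyTorch. This is a minimal illustrative sketch, not the authors' implementation: the MicTranslator architecture, the layer sizes, the log-mel spectrogram input shape, and the generator names g_ab / g_ba are all assumptions introduced here for clarity.

import torch
import torch.nn as nn

# Hypothetical generator: maps spectrogram frames recorded on one
# microphone toward the spectral characteristics of a reference
# microphone. Layer sizes and the number of mel bins are illustrative.
class MicTranslator(nn.Module):
    def __init__(self, n_mels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(128, n_mels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_mels, time) spectrogram from the source microphone
        return self.net(x)

# Two generators (A -> B and B -> A) are trained on unpaired data; the
# cycle-consistency term penalizes the A -> B -> A reconstruction error,
# which is what removes the need for paired recordings of the same audio.
g_ab, g_ba = MicTranslator(), MicTranslator()
x_a = torch.randn(8, 64, 100)  # unpaired batch from microphone A
cycle_loss = nn.functional.l1_loss(g_ba(g_ab(x_a)), x_a)

At inference time only one direction is needed: a trained g_ab sits in front of the downstream audio model and translates incoming audio from the device's microphone into the domain the model was trained on.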