Abstract:
A novel Statistical Algorithm for F0 Estimation (SAFE) is proposed to improve the accuracy of F0 estimation under both clean and noisy conditions. Prominent signal-to-noise ratio (SNR) peaks in speech spectra constitute a robust information source from which F0 can be inferred. A probabilistic framework is proposed to model the effect of noise on voiced speech spectra. Prominent SNR peaks in the low-frequency band (0-1000 Hz) are important for F0 estimation, and prominent SNR peaks in the middle- and high-frequency bands (1000-3000 Hz) provide useful supplemental information under noisy conditions, especially babble noise. Experiments show that SAFE has the lowest gross pitch error (GPE) among prevailing F0 trackers in white and babble noise at low SNRs. Experimental results also show that SAFE maintains a low mean and standard deviation of the fine pitch errors (MFPE and SDFPE) in noise. The code of SAFE is available at http://www.ee.ucla.edu/~weichu/safe.
Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume: 20, Issue: 3, March 2012)
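
The abstract reports results in terms of GPE, MFPE, and SDFPE without defining them. The following is a minimal sketch of how such metrics are commonly computed. It is not taken from the SAFE code. The function name `pitch_error_metrics`, the 20% gross-error threshold, the use of signed errors in Hz, and the convention that unvoiced frames are marked with F0 = 0 are all assumptions made for illustration; the paper may define these differently.

```python
import numpy as np


def pitch_error_metrics(f0_ref, f0_est, gross_threshold=0.2):
    """Return (gpe_percent, mfpe_hz, sdfpe_hz) for two frame-aligned F0 tracks.

    Assumptions (not from the source): unvoiced frames are coded as 0,
    a gross error is a deviation above gross_threshold * reference F0,
    and fine errors are signed differences in Hz.
    """
    f0_ref = np.asarray(f0_ref, dtype=float)
    f0_est = np.asarray(f0_est, dtype=float)

    # Score only frames that both tracks mark as voiced.
    voiced = (f0_ref > 0) & (f0_est > 0)
    if not voiced.any():
        raise ValueError("no frames are voiced in both tracks")

    err = f0_est[voiced] - f0_ref[voiced]
    gross = np.abs(err) > gross_threshold * f0_ref[voiced]

    # GPE: share of scored frames with a gross error, in percent.
    gpe = 100.0 * gross.mean()

    # Fine pitch errors: the remaining, non-gross frames.
    fine = err[~gross]
    if fine.size == 0:
        return gpe, float("nan"), float("nan")
    return gpe, fine.mean(), fine.std()
```

For example, with a reference track of `[100, 100, 100, 100, 0]` Hz and an estimate of `[102, 98, 150, 100, 120]` Hz, the last frame is skipped as unvoiced in the reference, the 150 Hz frame counts as a gross error, and the remaining three frames contribute to the fine-error mean and standard deviation.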