An Analysis of Transfer Learning for Domain Mismatched Text-independent Speaker Verification

Zhang, Chunlei; Ranjan, Shivesh; Hansen, John

doi:10.21437/Odyssey.2018-26

In this paper, we present transfer learning for deep neural network based text-independent speaker verification in the presence of a severe mismatch between the enrollment and the test data. Given a pre-trained speaker embedding network developed with out-of-domain data, we explore and analyze how this pre-trained model can benefit the in-domain speaker verification task. Two alternative strategies are investigated to perform transfer learning: vanilla transfer learning (V-TL) and curriculum learning based transfer learning (CL-TL). The proposed methods are validated on the UT-SCOPE-physical speech corpus, where we create a setup that introduces mismatched evaluation conditions between neutral speech and physical task stressed speech. Experimental results confirm the effectiveness of both the V-TL and CL-TL techniques. Employing transfer learning based on the pre-trained model, we achieve a +47.7% relative improvement over a conventional i-vector/PLDA system and a +30.6% relative improvement over a recently proposed end-to-end system.
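As a reading aid for the relative improvement figures quoted above, the following is a minimal sketch of how such a percentage is conventionally computed. It assumes a lower-is-better error metric (for example, an equal error rate); the abstract itself does not name the metric. The function name `relative_improvement` and the numeric values are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_error: float, system_error: float) -> float:
    """Percent reduction of a lower-is-better error metric relative to a baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - system_error) / baseline_error


# Hypothetical error rates for illustration only, NOT results from the paper.
baseline = 10.0  # assumed error rate of a baseline system, in percent
improved = 6.0   # assumed error rate of an improved system, in percent
print(f"{relative_improvement(baseline, improved):.1f}%")  # prints "40.0%"
```

Under this convention, a +47.7% relative improvement would mean the error of the proposed system is roughly half that of the baseline, whatever the absolute values are.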

Cite as: Zhang, C., Ranjan, S., Hansen, J. (2018) An Analysis of Transfer Learning for Domain Mismatched Text-independent Speaker Verification. Proc. The Speaker and Language Recognition Workshop (Odyssey 2018), 181-186, doi: 10.21437/Odyssey.2018-26

@inproceedings{zhang18_odyssey,
  author={Chunlei Zhang and Shivesh Ranjan and John Hansen},
  title={{An Analysis of Transfer Learning for Domain Mismatched Text-independent Speaker Verification}},
  year=2018,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2018)},
  pages={181--186},
  doi={10.21437/Odyssey.2018-26}
}