Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling

Variani, Ehsan; Sainath, Tara N.; Shafran, Izhak; Bacchiani, Michiel

doi:10.21437/Interspeech.2016-1459

State-of-the-art automatic speech recognition (ASR) systems typically rely on pre-processed features. This paper studies the time-frequency duality in ASR feature extraction methods and proposes extending the standard acoustic model with a complex-valued linear projection layer, so that the features are learned jointly with the acoustic model by minimizing standard cost functions such as cross-entropy. The proposed Complex Linear Projection (CLP) features outperform pre-processed Log Mel features.
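As a rough illustration of what a complex-valued linear projection layer might compute, the NumPy sketch below projects the complex spectrum of one audio frame through a complex weight matrix and compresses the result to real-valued features. The abstract does not give the layer's details, so the log-magnitude compression, the function name `clp_features`, the frame length, the number of output features, and the random weights are all assumptions made for this example; in the paper's setting the weights would be trained with the acoustic model rather than drawn at random.

```python
import numpy as np

def clp_features(frame, W, eps=1e-7):
    """Hypothetical CLP-style feature extractor for a single frame.

    frame: real-valued waveform window of length n.
    W:     complex weight matrix of shape (p, n // 2 + 1).
    """
    X = np.fft.rfft(frame)            # complex spectrum, shape (n // 2 + 1,)
    Y = W @ X                         # complex linear projection, shape (p,)
    # Assumption: compress with log magnitude; the abstract does not specify this.
    return np.log(np.abs(Y) + eps)

rng = np.random.default_rng(0)
n, p = 512, 40                        # hypothetical frame length and feature count
frame = rng.standard_normal(n)        # stand-in for a windowed audio frame
W = (rng.standard_normal((p, n // 2 + 1))
     + 1j * rng.standard_normal((p, n // 2 + 1))) / np.sqrt(n)

feats = clp_features(frame, W)        # real-valued vector with p entries
```

The design point this is meant to show is that the projection operates on the complex spectrum itself, so phase information reaches the learned weights before any magnitude is taken, in contrast to a fixed Mel filterbank applied to a power spectrum.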

Cite as: Variani, E., Sainath, T.N., Shafran, I., Bacchiani, M. (2016) Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling. Proc. Interspeech 2016, 808-812, doi: 10.21437/Interspeech.2016-1459

@inproceedings{variani16_interspeech,
  author={Ehsan Variani and Tara N. Sainath and Izhak Shafran and Michiel Bacchiani},
  title={{Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={808--812},
  doi={10.21437/Interspeech.2016-1459}
}