Accurate and Practical Query-by-Example Using Multiple Deep Learning Models and Frame Compression Methods | IEEE Conference Publication | IEEE Xplore

Abstract:

Spoken term detection (STD) and spoken-query STD (SQ-STD), also known as query-by-example (QbE), have been studied actively in recent years. A representative QbE method is posteriorgram matching using the outputs of deep neural networks, but it requires long retrieval times and a large amount of memory. To reduce retrieval time, we previously proposed a maximum likelihood state sequence (MLSS) method. This paper proposes two methods, "blank-cut (b-cut)" and "frame de-duplication (FDD)", that compress posteriorgram frames and thereby reduce both retrieval time and memory size. In the proposed methods, multiple matching scores are obtained using multiple deep learning models and architectures and are then integrated. Evaluation experiments on two open test sets of about 30 hr of speech data showed state-of-the-art retrieval accuracy. Furthermore, the proposed method achieved a retrieval time of less than 1 s and a memory requirement of about 1 GB. These results demonstrate the effectiveness of the proposed method.
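
The abstract names the two frame compression methods but does not define them. As a rough illustration only, the sketch below shows one plausible reading: b-cut is assumed to drop frames dominated by a CTC-style blank label, and FDD is assumed to keep only the first frame of each run of consecutive frames that share the same top label. The function names, the blank index, and the 0.9 threshold are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional

Frame = List[float]  # one posterior distribution over labels for a single frame


def blank_cut(frames: List[Frame], blank: int = 0, threshold: float = 0.9) -> List[Frame]:
    """Assumed rule: drop frames whose blank posterior exceeds the threshold."""
    return [f for f in frames if f[blank] <= threshold]


def frame_dedup(frames: List[Frame]) -> List[Frame]:
    """Assumed rule: keep the first frame of each run with the same top label."""
    kept: List[Frame] = []
    prev_top: Optional[int] = None
    for f in frames:
        top = max(range(len(f)), key=f.__getitem__)  # index of the largest posterior
        if top != prev_top:
            kept.append(f)
            prev_top = top
    return kept


# Toy posteriorgram: label 0 is the (assumed) blank.
posteriorgram = [
    [0.95, 0.03, 0.02],  # blank-dominated
    [0.10, 0.80, 0.10],  # label 1
    [0.05, 0.85, 0.10],  # label 1 again
    [0.97, 0.02, 0.01],  # blank-dominated
    [0.10, 0.10, 0.80],  # label 2
]
compressed = frame_dedup(blank_cut(posteriorgram))
print(len(posteriorgram), "->", len(compressed))  # 5 -> 2
```

Fewer frames mean a smaller matching matrix between query and document posteriorgrams, which is where the savings in retrieval time and memory would come from. Note that in this sketch, running de-duplication after blank removal also merges two runs of the same label that were separated only by blanks; whether the paper's methods behave this way is not stated in the abstract.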
Date of Conference: 31 October 2023 - 03 November 2023
Date Added to IEEE Xplore: 20 November 2023
Conference Location: Taipei, Taiwan
