Abstract:
The hidden Markov model (HMM) is widely popular as the de facto tool for representing temporal data; in this paper, we add to its utility in the sequence clustering domain - we describe a novel approach that allows us to directly control purity in HMM-based clustering algorithms. We show that encouraging sparsity in the observation probabilities increases cluster purity and derive an algorithm based on l_p regularization; as a corollary, we also provide a different and useful interpretation of the value of p in Rényi p-entropy. We test our method on the problem of clustering non-speech audio events from the BBC sound effects corpus. Experimental results confirm that our approach does learn purer clusters, with (unweighted) average purity as high as 0.88 - a considerable improvement over both the baseline HMM (0.72) and k-means clustering (0.69).
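The abstract reports "(unweighted) average purity" without defining it. As a rough illustration only, the sketch below assumes the common definition: the purity of a cluster is the fraction of its members that carry the cluster's most frequent ground-truth label, and the unweighted average is the plain mean of those per-cluster values, so every cluster counts equally regardless of size. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cluster_purities(labels, assignments):
    """Map each cluster id to its purity (share of the majority label).

    Assumed definition, not confirmed by the abstract.
    """
    members = {}
    for label, cluster in zip(labels, assignments):
        members.setdefault(cluster, []).append(label)
    return {
        cluster: Counter(ys).most_common(1)[0][1] / len(ys)
        for cluster, ys in members.items()
    }


def unweighted_average_purity(labels, assignments):
    """Mean of per-cluster purities; clusters are not weighted by size."""
    purities = cluster_purities(labels, assignments)
    return sum(purities.values()) / len(purities)


# Toy data (hypothetical): six items, three true classes, two clusters.
true_labels = ["door", "door", "rain", "rain", "rain", "bell"]
cluster_ids = [0, 0, 0, 1, 1, 1]
score = unweighted_average_purity(true_labels, cluster_ids)
```

Under this assumed definition a small, perfectly pure cluster raises the score as much as a large one, which is one reason an unweighted figure can differ from a size-weighted purity.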
Date of Conference: 26-31 May 2013
Date Added to IEEE Xplore: 21 October 2013
Electronic ISBN: 978-1-4799-0356-6