Abstract
Human Activity Recognition (HAR) is attracting growing research interest due to the widespread presence of motion sensors on users' personal devices. The performance of a HAR system deployed at large scale is often significantly lower than reported because of sensor-, device-, and person-specific heterogeneities. In this work, we develop a new approach for clustering such heterogeneous data, represented as time series, which incorporates the different levels of heterogeneity in the data within the model. Our method represents the heterogeneities as a hierarchy in which each level addresses a specific heterogeneity (e.g., a sensor-specific heterogeneity). Experimental evaluation on an Electromyography (EMG) sensor dataset with heterogeneities shows that our method performs favourably compared to other time series clustering approaches.
References
Graupe, D., Cline, W.K.: Functional separation of EMG signals via ARMA identification methods for prosthesis control purposes. IEEE Trans. Syst. Man Cybern. 5(2), 252–259 (1975)
Griffin, L.Y., Albohm, M.J., Arendt, E.A., Bahr, R., Beynnon, B.D., DeMaio, M., Dick, R.W., Engebretsen, L., Garrett, W.E., Hannafin, J.A., et al.: Understanding and preventing noncontact anterior cruciate ligament injuries: a review of the Hunt Valley II meeting, January 2005. Am. J. Sports Med. 34(9), 1512–1532 (2006)
Gupta, P., Dallas, T.: Feature selection and activity recognition system using a single triaxial accelerometer. IEEE Trans. Biomed. Eng. 61(6), 1780–1786 (2014)
Huang, H., Kuiken, T., Lipschutz, R.D., et al.: A strategy for identifying locomotion modes using surface electromyography. IEEE Trans. Biomed. Eng. 56(1), 65–73 (2009)
Huang, H., Zhang, F., Hargrove, L.J., Dou, Z., Rogers, D.R., Englehart, K.B.: Continuous locomotion-mode identification for prosthetic legs based on neuromuscular-mechanical fusion. IEEE Trans. Biomed. Eng. 58(10), 2867–2875 (2011)
Ishwaran, H., James, L.F.: Gibbs sampling methods for stick-breaking priors. J. Am. Stat. Assoc. 96(453), 161–173 (2001)
Ishwaran, H., Zarepour, M.: Markov chain Monte Carlo in approximate Dirichlet and beta two-parameter process hierarchical models. Biometrika 87(2), 371–390 (2000)
Kumar, D.K., Pah, N.D., Bradley, A.: Wavelet analysis of surface electromyography. IEEE Trans. Neural Syst. Rehabil. Eng. 11(4), 400–406 (2003)
Liao, T.W.: Clustering of time series data–a survey. Pattern Recogn. 38(11), 1857–1874 (2005)
Lindley, D.V., Smith, A.F.: Bayes estimates for the linear model. J. Roy. Stat. Soc.: Ser. B (Methodol.) 34, 1–41 (1972)
Montero, P., Vilar, J.A.: TSclust: An R package for time series clustering. J. Stat. Softw. 62(1), 1–43 (2014). http://www.jstatsoft.org/v62/i01/
Nieto-Barajas, L.E., Contreras-Cristan, A.: A Bayesian nonparametric approach for time series clustering. Bayesian Anal. 9(1), 147–170 (2014)
Peeraer, L., Aeyels, B., Van der Perre, G.: Development of EMG-based mode and intent recognition algorithms for a computer-controlled above-knee prosthesis. J. Biomed. Eng. 12(3), 178–182 (1990)
Reaz, M., Hussain, M., Mohd-Yasin, F.: Techniques of EMG signal analysis: detection, processing, classification and applications. Biolog. Proc. Online 8(1), 11–35 (2006)
Stisen, A., Blunck, H., Bhattacharya, S., Prentow, T.S., Kjærgaard, M.B., Dey, A., Sonne, T., Jensen, M.M.: Smart devices are different: assessing and mitigating mobile sensing heterogeneities for activity recognition. In: Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, pp. 127–140 (2015)
Acknowledgements
This work is partially supported by NIH grant R01GM103309. We acknowledge Deepak Joshi and Michel Kinsy for their input.
Appendix A Posterior Characterization
- \(\alpha _i\): The posterior for \(\alpha _i\) is:
$$\begin{aligned}
f(\alpha _i \mid \text{rest}) &\propto N_P(\mu _a, \varSigma _a)\\
\varSigma _a &= (\varSigma _{\alpha }^{-1} + Z^T\varSigma _{y}^{-1}Z)^{-1}\\
\mu _a &= \varSigma _a Z^T \varSigma _y^{-1} (y_i - X \beta _i - \theta _i)\\
f(\sigma _{\alpha _j}^2 \mid \text{rest}) &= IGa\left(c_0^{\alpha } + \frac{n}{2},\; c_1^{\alpha } + \frac{1}{2} \sum _{i=1}^n \alpha _{ij}^2\right), \quad j = 1,\ldots ,p
\end{aligned} \tag{8}$$
- \(\beta _i\): The posterior for \(\beta _i\) (or \(\beta _{s,r,k}\)) is:
$$\begin{aligned}
f(\beta _i \mid \text{rest}) &\propto N_D (\mu _b, \varSigma _b)\\
\varSigma _b &= (\varSigma _{\beta ,s,r,k}^{-1} + X^T \varSigma _{y}^{-1}X )^{-1}\\
\mu _b &= \varSigma _b \left[ X^T \varSigma _{y}^{-1} (y_i - Z \alpha _i - \theta _i) + \varSigma _{\beta ,s,r,k}^{-1}\, \overline{\beta _{s,r,k}} \right]\\
f(\sigma _{\beta _{s,r,k,i}}^2 \mid \text{rest}) &= IGa\left(c_0^{\beta _{s,r,k,i}} + \frac{m}{2},\; c_1^{\beta _{s,r,k,i}} + \frac{1}{2} \sum _{j=1}^m \beta _{s,r,k,i}^2\right), \quad i = 1,\ldots ,p
\end{aligned} \tag{9}$$
where \(m\) is the number of data points belonging to that cluster.
- \(\theta _i\): The posterior for \(\theta _i\) (or \(\theta _{s,r,k}\)) is:
$$\begin{aligned}
f(\theta _i \mid \text{rest}) &\propto N_T (\mu _c, \varSigma _c)\\
\varSigma _c &= (\varSigma _{\theta ,s,r,k}^{-1} + \varSigma _{y}^{-1})^{-1}\\
\mu _c &= \varSigma _c \left[ \varSigma _{y}^{-1} (y_i - Z \alpha _i - X \beta _i) + \varSigma _{\theta ,s,r,k}^{-1}\, \overline{\theta _{s,r,k}} \right]\\
f(\sigma _{\theta _{s,r,k,i}}^2 \mid \text{rest}) &= IGa\left(c_0^{\theta _{s,r,k,i}} + \frac{m}{2},\; c_1^{\theta _{s,r,k,i}} + \frac{1}{2} \sum _{j=1}^m \theta _{s,r,k,i}^2\right), \quad i = 1,\ldots ,T
\end{aligned} \tag{10}$$
where \(m\) is the number of data points belonging to that cluster.
- \(\sigma _{\epsilon _i}^2\): The posterior for \(\sigma _{\epsilon _i}^2\) is:
$$\begin{aligned}
f(\sigma _{\epsilon _i}^2 \mid \text{rest}) &\propto IGa\left(c_0^{\epsilon } + \frac{T}{2},\; c_1^{\epsilon } + \frac{1}{2} M_i^{\prime }M_i\right)\\
M_i &= y_i - Z \alpha _i - X \beta _i - \theta _i
\end{aligned} \tag{11}$$
- Level \(k\) posterior: The posterior for any level of the hierarchy except the top-most level consists of the following updates:
$$\begin{aligned}
f(\beta _k \mid \text{rest}) &\propto N_D(\mu _g, \varSigma _g)\\
\varSigma _g &= (\varSigma _{\beta ,r,k}^{-1} + \varSigma _{\beta ,k}^{-1} )^{-1}\\
\mu _g &= \varSigma _g \left(\varSigma _{\beta ,k}^{-1}\,\overline{\beta _{k}} + \varSigma _{\beta ,r,k}^{-1} \beta _{r,k}\right)\\
f(\sigma _{\beta _{k,i}}^2 \mid \text{rest}) &= IGa\left(c_0^{\beta _{k,i}} + \frac{R}{2},\; c_1^{\beta _{k,i}} + \frac{1}{2} \sum _{j=1}^R \beta _{k,i}^2\right)\\
f(\theta _k \mid \text{rest}) &\propto N_D(\mu _h, \varSigma _h)\\
\varSigma _h &= (\varSigma _{\theta ,r,k}^{-1} + \varSigma _{\theta ,k}^{-1} )^{-1}\\
\mu _h &= \varSigma _h \left(\varSigma _{\theta ,k}^{-1}\,\overline{\theta _{k}} + \varSigma _{\theta ,r,k}^{-1} \theta _{r,k}\right)\\
f(\sigma _{\theta _{k,i}}^2 \mid \text{rest}) &= IGa\left(c_0^{\theta _{k,i}} + \frac{R}{2},\; c_1^{\theta _{k,i}} + \frac{1}{2} \sum _{j=1}^R \theta _{k,i}^2\right)
\end{aligned} \tag{12}$$
- Top level posterior: The posterior at the top-most level is:
$$\begin{aligned}
f(\beta \mid \text{rest}) &\propto N_D(\mu _e, \varSigma _e)\\
\varSigma _e &= (\varSigma _{\beta ,k}^{-1} + \varSigma _{\beta }^{-1} )^{-1}\\
\mu _e &= \varSigma _e (\varSigma _{\beta ,k}^{-1} \beta _{k})\\
f(\sigma _{\beta _{i}}^2 \mid \text{rest}) &= IGa\left(c_0^{\beta _{i}} + \frac{K}{2},\; c_1^{\beta _{i}} + \frac{1}{2} \sum _{j=1}^K \beta _{i}^2\right)\\
f(\theta \mid \text{rest}) &\propto N_D(\mu _f, \varSigma _f)\\
\varSigma _f &= (\varSigma _{\theta ,k}^{-1} + \varSigma _{\theta }^{-1} )^{-1}\\
\mu _f &= \varSigma _f (\varSigma _{\theta ,k}^{-1} \theta _{k})\\
f(\sigma _{\theta }^2 \mid \text{rest}) &= IGa\left(\frac{KT}{2},\; \frac{1}{2} \sum _{j=1}^K \theta _j^{\prime } Q^{-1}\theta _j\right)\\
f(\rho \mid \text{rest}) &\propto |Q|^{-K/2} \exp \left\{ -\frac{1}{2 \sigma _{\theta }^2} \sum _{j=1}^K \theta _j^{\prime } Q^{-1}\theta _j \right\} \frac{\sqrt{1 + \rho ^2}}{1 - \rho ^2}
\end{aligned} \tag{13}$$
where \(Q_{ij} = \rho ^{|i-j|}\) for \(i,j = 1,\ldots ,T\).
- Posterior for GDD and p: The posterior for the GDD is conjugate with multinomial sampling. The probability p is updated based on the fit of the data to each cluster's lowest-level mean, using the likelihood function. The complete details of the GDD posterior characterization can be found in [6].
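To illustrate the likelihood-based fit mentioned in the last item, the following NumPy sketch computes normalised cluster probabilities for one series from its squared distance to each cluster mean. It is only a sketch under stated assumptions: isotropic Gaussian noise with a single variance `sigma_eps2`, prior log-weights supplied by the caller, and a hypothetical function name `cluster_probabilities`. It does not reproduce the GDD update described in [6].

```python
import numpy as np


def cluster_probabilities(y, cluster_means, sigma_eps2, log_weights):
    """Probability of series y under each cluster mean (assumed Gaussian noise).

    y             : (T,) observed series
    cluster_means : (K, T) lowest-level mean of each cluster
    sigma_eps2    : scalar noise variance (assumption: shared, isotropic)
    log_weights   : (K,) log prior weight of each cluster
    """
    resid = y[None, :] - cluster_means
    # Gaussian log-likelihood up to a constant that is equal for all clusters.
    log_lik = -0.5 * np.sum(resid**2, axis=1) / sigma_eps2
    log_post = log_weights + log_lik
    log_post = log_post - log_post.max()   # stabilise the exponentials
    p = np.exp(log_post)
    return p / p.sum()


# Toy usage with two hypothetical clusters and equal prior weights.
y = np.array([0.0, 0.0])
means = np.array([[0.0, 0.0], [1.0, 1.0]])
p = cluster_probabilities(y, means, 1.0, np.log([0.5, 0.5]))
```

Working in log space and subtracting the maximum before exponentiating avoids underflow when the series are long and the likelihoods are very small.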
This completes the posterior characterization of our approach.
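As a rough illustration of how some of these conditionals could be drawn inside a Gibbs sweep, the NumPy sketch below implements the Gaussian update of Eq. (8), the inverse-gamma update of Eq. (11), and the unnormalised log-density of \(\rho \) from Eq. (13), which could feed a Metropolis step. It assumes that \(\varSigma _{\alpha }\) is diagonal, that \(\varSigma _y = \sigma _{\epsilon }^2 I\), and that IGa uses a shape/rate parameterisation. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_alpha(y, X, Z, beta, theta, sigma_alpha2, sigma_eps2, rng):
    """Draw alpha_i ~ N_P(mu_a, Sigma_a) as in Eq. (8); also return mu_a, Sigma_a."""
    prior_prec = np.diag(1.0 / sigma_alpha2)        # Sigma_alpha^{-1}, assumed diagonal
    post_prec = prior_prec + Z.T @ Z / sigma_eps2   # assumes Sigma_y = sigma_eps2 * I
    Sigma_a = np.linalg.inv(post_prec)
    Sigma_a = 0.5 * (Sigma_a + Sigma_a.T)           # remove round-off asymmetry
    mu_a = Sigma_a @ Z.T @ (y - X @ beta - theta) / sigma_eps2
    return rng.multivariate_normal(mu_a, Sigma_a), mu_a, Sigma_a


def sample_noise_variance(y, X, Z, alpha, beta, theta, c0, c1, rng):
    """Draw sigma_eps_i^2 ~ IGa(c0 + T/2, c1 + M'M/2) as in Eq. (11)."""
    M = y - Z @ alpha - X @ beta - theta
    shape = c0 + y.size / 2.0
    rate = c1 + 0.5 * (M @ M)
    # If G ~ Gamma(shape, scale=1/rate), then 1/G ~ IGa(shape, rate).
    return 1.0 / rng.gamma(shape, 1.0 / rate)


def log_post_rho(rho, thetas, sigma_theta2):
    """Unnormalised log f(rho | rest) from Eq. (13); thetas has shape (K, T)."""
    K, T = thetas.shape
    t = np.arange(T)
    Q = rho ** np.abs(t[:, None] - t[None, :])      # Q_ij = rho^|i-j|
    _, logdet = np.linalg.slogdet(Q)
    quad = sum(th @ np.linalg.solve(Q, th) for th in thetas)
    return (-0.5 * K * logdet - quad / (2.0 * sigma_theta2)
            + 0.5 * np.log1p(rho**2) - np.log(1.0 - rho**2))


# Toy problem with hypothetical sizes: T = 50 time points, P = 2, D = 3.
rng = np.random.default_rng(0)
T, P, D = 50, 2, 3
Z, X = rng.normal(size=(T, P)), rng.normal(size=(T, D))
beta, theta = rng.normal(size=D), np.zeros(T)
y = Z @ np.array([1.0, -1.0]) + X @ beta + rng.normal(scale=0.1, size=T)

alpha, mu_a, Sigma_a = sample_alpha(y, X, Z, beta, theta, np.ones(P), 0.01, rng)
sigma_eps2 = sample_noise_variance(y, X, Z, alpha, beta, theta, 2.0, 1.0, rng)
```

The remaining Gaussian updates in Eqs. (9), (10), (12), and (13) have the same precision-weighted form, so a full sampler could reuse the same pattern with the appropriate prior precision and prior mean at each level.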
© 2016 Springer International Publishing Switzerland
Kafle, S., Dou, D. (2016). A Heterogeneous Clustering Approach for Human Activity Recognition. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science(), vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_5
DOI: https://doi.org/10.1007/978-3-319-43946-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer Science (R0)