Hidden Fatigue Detection for a Desk Worker Using Clustering of Successive Tasks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9425)

Abstract

To detect the fatigue of a desk worker, this paper focuses on fatigue hidden behind smiling and neutral faces and employs a periodic short-time monitoring setting. In contrast to continual monitoring, this setting assumes that each short-time monitoring session (called a task in this paper) is conducted only during a break. The setting raises two problems: the small amount of data in each task and the ever-increasing number of tasks. To detect fatigue, the authors propose a method that combines multi-task learning, clustering, and anomaly detection. For the first problem, the authors employ multi-task learning, which efficiently builds a classifier specific to each task by exploiting information shared among tasks. Since clustering gathers similar tasks into a cluster, it mitigates the second problem. Experiments show that the proposed method exhibits high performance in long-term monitoring.

Part of this research was supported by JSPS KAKENHI grants 25280085 and 15K12100.


Notes

  1. http://turtlebot.com/.

  2. The robot costs 649 euro and the docking station for the robot costs 45 euro (http://www.robotnikstore.com/robotnik/5121532/turtlebot-2.html, Aug. 24th, 2015).

  3. In this phase, the robot stays at a docking station and recharges its battery.

  4. In reality, we might need to modify our system so that the robot transitions to the monitoring phase when ordered to do so by the desk worker.

  5. http://www.microsoft.com/en-us/kinectforwindows/.

  6. We use Kinect for Windows SDK v2.0 1409 (http://www.microsoft.com/en-us/kinectforwindows/develop/downloads-docs.aspx) to build the animation units.

  7. We use the notebook PC Panasonic CF-SX3BDCBP (Core i7 4650U 2.29 GHz, RAM 16.0 GB).

References

  1. Brach, J.S., VanSwearingen, J.: Measuring fatigue related to facial muscle function. Arch. Phys. Med. Rehabil. 76(10), 905–908 (1995)

  2. Bradley, A.P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn. 30(7), 1145–1159 (1997)

  3. Braver, T.S., Cohen, J.D., Nystrom, L.E., Jonides, J., Smith, E.E., Noll, D.C.: A parametric study of prefrontal cortex involvement in human working memory. Neuroimage 5(1), 49–62 (1997)

  4. Chen, J., Tang, L., Liu, J., Ye, J.: A convex formulation for learning shared structures from multiple tasks. In: Proceedings of the 26th Annual International Conference on Machine Learning (ICML 2009), pp. 137–144 (2009)

  5. Comer, D.: The ubiquitous B-tree. ACM Comput. Surv. 11(2), 121–137 (1979)

  6. Deguchi, Y., Suzuki, E.: Skeleton clustering by autonomous mobile robots for subtle fall risk discovery. In: Andreasen, T., Christiansen, H., Cubero, J.-C., Raś, Z.W. (eds.) ISMIS 2014. LNCS, vol. 8502, pp. 500–505. Springer, Heidelberg (2014)

  7. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32(2), 407–499 (2004)

  8. Hua, C., Zhang, Y.: Driver fatigue detection based on active facial features locating. J. Simul. 2(6), 335 (2014)

  9. Jacob, L., Bach, F., Vert, J.-P.: Clustered multi-task learning: a convex formulation. Adv. Neural Inf. Process. Syst. 21, 745–752 (2009)

  10. Ji, Q., Zhu, Z., Lan, P.: Real-time nonintrusive monitoring and prediction of driver fatigue. IEEE Trans. Veh. Technol. 53(4), 1052–1068 (2004)

  11. Kapp, M.N., Sabourin, R., Maupin, P.: A dynamic model selection strategy for support vector machine classifiers. Appl. Soft Comput. 12(8), 2550–2565 (2012)

  12. Karnick, M.T., Muhlbaier, M.D., Polikar, R.: Incremental learning in non-stationary environments with concept drift using a multiple classifier based approach. In: 19th International Conference on Pattern Recognition (ICPR 2008), pp. 1–4 (2008)

  13. Kondo, R., Deguchi, Y., Suzuki, E.: Developing a face monitoring robot for a desk worker. In: Aarts, E., et al. (eds.) AmI 2014. LNCS, vol. 8850, pp. 226–241. Springer, Heidelberg (2014)

  14. Kumar, A., Daumé III, H.: Learning task grouping and overlap in multi-task learning. In: Proceedings of the 29th International Conference on Machine Learning (ICML 2012), pp. 1383–1390 (2012)

  15. Ruvolo, P., Eaton, E.: ELLA: an efficient lifelong learning algorithm. In: Proceedings of the 30th International Conference on Machine Learning (ICML 2013), pp. 507–515 (2013)

  16. Takayama, D., Deguchi, Y., Takano, S., Scuturici, V.-M., Petit, J.-M., Suzuki, E.: Multi-view onboard clustering of skeleton data for fall risk discovery. In: Aarts, E., et al. (eds.) AmI 2014. LNCS, vol. 8850, pp. 258–273. Springer, Heidelberg (2014)

  17. Tanaka, M., Mizuno, K., Yamaguti, K., Kuratsune, H., Fujii, A., Baba, H., Matsuda, K., Nishimae, A., Takesaka, T., Watanabe, Y.: Autonomic nervous alterations associated with daily level of fatigue. Behav. Brain Funct. 7, 46 (2011)

  18. Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: a new data clustering algorithm and its applications. Data Min. Knowl. Discov. 1(2), 141–182 (1997)

  19. Zhou, J., Chen, J., Ye, J.: Clustered multi-task learning via alternating structure optimization. Adv. Neural Inf. Process. Syst. 24, 702–710 (2011)


Author information

Correspondence to Yutaka Deguchi.

Appendix

A ELLA

ELLA [15] is a multi-task learning method that takes a parametric approach to the lifelong learning setting. Learning tasks \(Z^{(1)}, Z^{(2)}, ..., Z^{(T_\mathrm{max})}\) are observed sequentially, where the total number of tasks \(T_\mathrm{max}\) is not known a priori. Each task \(Z^{(t)}\) is represented as \((\hat{f}^{(t)}, \mathbf{X}^{(t)}, \mathbf{y}^{(t)})\), where \(\mathbf{X}^{(t)}\) is a set of examples, \(\mathbf{y}^{(t)}\) is the corresponding set of labels, and \(\hat{f}^{(t)}\) is a hidden mapping from \(\mathbf{X}^{(t)}\) to \(\mathbf{y}^{(t)}\). The goal is to construct classifiers \(f^{(1)}, f^{(2)}, ..., f^{(T_\mathrm{max})}\) that approximate the \(\hat{f}^{(t)}\)'s. The prediction function \(f^{(t)}(\mathbf{x}) = f(\mathbf{x};{\varvec{\theta }}^{(t)})\) is specific to the task \(Z^{(t)}\).

The model of ELLA is based on the GO-MTL model [14]. The parameter vector \({\varvec{\theta }}^{(t)}\) is represented as the product of a matrix \(\mathbf{L}\), whose k columns are latent model components, and a task-specific weight vector \(\mathbf{s}^{(t)}\). Minimizing the predictive loss over all tasks then amounts to minimizing the following objective function:

$$\begin{aligned} e_T(\mathbf{L})=\frac{1}{T}\sum ^{T}_{t=1}\min _{\mathbf{s}^{(t)}}\left\{ \frac{1}{n_t}\sum ^{n_t}_{i=1}\mathcal{L}\left( f\left( \mathbf{x}^{(t)}_i; \mathbf{L}\mathbf{s}^{(t)}\right) , y^{(t)}_i\right) +\mu \left\| \mathbf{s}^{(t)}\right\| _1\right\} +\lambda \Vert \mathbf{L}\Vert ^2_\mathrm{F},\end{aligned}$$
(A.1)

where \(n_t\) is the number of examples in the task \(Z^{(t)}\) and \(\mathcal{L}\) is a loss function.
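
To make the objective concrete, here is a minimal Python sketch that evaluates \(e_T(\mathbf{L})\) for linear models with squared loss. The function names are ours, and using scikit-learn's LassoLars for the inner minimization over \(\mathbf{s}^{(t)}\) is an illustrative choice, not part of [15].

import numpy as np
from sklearn.linear_model import LassoLars

def ella_objective(L, tasks, mu, lam):
    """Evaluate e_T(L) of Eq. (A.1) for linear models with squared loss.

    `tasks` is a list of (X, y) pairs; names are illustrative, not from [15].
    """
    T = len(tasks)
    total = 0.0
    for X, y in tasks:
        # Inner minimization over s^(t): with f(x; L s) = x . (L s), the
        # problem is a lasso on the transformed features X L.
        # LassoLars minimizes (1/(2 n)) ||y - Z w||^2 + alpha ||w||_1,
        # so alpha = mu / 2 matches a mean-squared-error loss.
        lasso = LassoLars(alpha=mu / 2.0, fit_intercept=False)
        lasso.fit(X @ L, y)
        s = lasso.coef_
        residual = y - X @ (L @ s)
        total += np.mean(residual ** 2) + mu * np.abs(s).sum()
    return total / T + lam * np.linalg.norm(L, "fro") ** 2

# toy usage: d = 5 features, k = 2 latent components, 3 tasks
rng = np.random.default_rng(0)
L0 = rng.normal(size=(5, 2))
tasks = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
print(ella_objective(L0, tasks, mu=0.1, lam=0.01))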

[Algorithm 3: ELLA (pseudocode figure)]

ELLA makes two approximations to optimize the objective function \(e_T(\mathbf{L})\) efficiently and to proceed incrementally with respect to tasks. The first approximation replaces \(\frac{1}{n_t}\sum ^{n_t}_{i=1}\mathcal{L}(f(\mathbf{x}^{(t)}_i; \mathbf{L}\mathbf{s}^{(t)}), y^{(t)}_i)\) with its second-order Taylor expansion around \({\varvec{\theta }} = {\varvec{\theta }}^{(t)}_\mathrm{STL}\), where \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) is an optimal classifier learned on the training data of task \(Z^{(t)}\) alone. With this approximation, the objective function no longer depends on all of the previous training data through the inner summation.
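
Concretely, since \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) minimizes the single-task loss, the first-order term of the Taylor expansion vanishes, and up to an additive constant the loss reduces to a quadratic form in \({\varvec{\theta }}\) (a reconstruction from the description above, with \(\mathbf{D}^{(t)}\) the Hessian introduced below):

$$\begin{aligned} \frac{1}{n_t}\sum ^{n_t}_{i=1}\mathcal{L}\left( f\left( \mathbf{x}^{(t)}_i; {\varvec{\theta }}\right) , y^{(t)}_i\right) \approx \mathrm{const} + \frac{1}{2}\left( {\varvec{\theta }}-{\varvec{\theta }}^{(t)}_\mathrm{STL}\right) ^\top \mathbf{D}^{(t)}\left( {\varvec{\theta }}-{\varvec{\theta }}^{(t)}_\mathrm{STL}\right) ,\end{aligned}$$

so that substituting \({\varvec{\theta }} = \mathbf{L}\mathbf{s}^{(t)}\) leaves \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) and \(\mathbf{D}^{(t)}\) as the only per-task quantities that need to be stored.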

The second approximation modifies the formulation to remove the minimization over \(\mathbf{s}^{(t)}\): previous tasks benefit from new tasks through the updated \(\mathbf{L}\) instead of through updated \(\mathbf{s}^{(t)}\)'s. Updating \(\mathbf{s}^{(t)}\) only when the task \(Z^{(t)}\) is observed has little practical effect on the quality of the model fit as the number of tasks grows large.

An overview of ELLA is shown in Algorithm 3, where T is the number of observed tasks. First, the parameter \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) and the Hessian \(\mathbf{D}^{(t)}\) of the loss function \(\mathcal{L}\) are computed from the examples \(\mathbf{X}^{(t)}\) and their labels \(\mathbf{y}^{(t)}\). Before \(\mathbf{s}^{(t)}\) is computed, zero columns of \(\mathbf{L}\) are reinitialized with normal random numbers. In line 4, \(\mathbf{s}^{(t)}\) is computed by lasso model fitting with least angle regression (LARS) [7]. The matrix \(\mathbf{A}_t\) and the vector \(\mathbf{b}_t\) are then used to update \(\mathbf{L}\). The algorithm thus proceeds incrementally with respect to tasks.
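
As an illustration of these steps, here is a Python sketch of a single-task update in the spirit of Algorithm 3; it is not the reference implementation of [15]. Constant factors are folded into \(\mu \) and \(\lambda \), and \(\mathbf{D}^{(t)}\) is assumed positive definite so that a Cholesky factor turns line 4 into an ordinary lasso.

import numpy as np
from sklearn.linear_model import LassoLars

def ella_update(L, A, b, T, theta_stl, D, mu, lam):
    """One task update in the spirit of Algorithm 3 (a sketch, not the
    reference implementation of [15]); L is d x k, A is (k d) x (k d)."""
    d, k = L.shape
    # Reinitialize zero columns of L with normal random numbers.
    for j in range(k):
        if not L[:, j].any():
            L[:, j] = np.random.randn(d)
    # Line 4: s^(t) = argmin_s ||theta_STL - L s||^2_D + mu ||s||_1,
    # rewritten as an ordinary lasso via the Cholesky factor of D.
    C = np.linalg.cholesky(D)                 # D = C C^T
    lasso = LassoLars(alpha=mu / (2.0 * d), fit_intercept=False)
    lasso.fit(C.T @ L, C.T @ theta_stl)
    s = lasso.coef_
    # Accumulate the sufficient statistics A_t and b_t used to update L
    # (column-major vec convention).
    A = A + np.kron(np.outer(s, s), D)
    b = b + np.kron(s, D @ theta_stl)
    T = T + 1
    # Closed-form update of L from the accumulated statistics.
    vec_L = np.linalg.solve(A / T + lam * np.eye(k * d), b / T)
    L = vec_L.reshape((d, k), order="F")
    return L, A, b, T, s

Because only \(\mathbf{A}_t\), \(\mathbf{b}_t\), and T are carried between tasks, the cost of an update is independent of the amount of past training data, which is exactly the point of the two approximations.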

The selection of \(\mathbf{s}^{(t)}\) depends on \(\mathbf{L}\). Before the first k tasks have been learned, some columns of \(\mathbf{L}\) are not yet meaningful because they still hold their initial values. The option "initializeWithFirstKTasks" determines how \(\mathbf{s}^{(t)}\) and \(\mathbf{L}\) are chosen for the first k tasks. When the option is enabled, \(\mathbf{s}^{(t)} \ (t \le k)\) is the unit vector in the direction of the t-th dimension, and the t-th column of \(\mathbf{L}\) is initialized from \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) in line 2 of Algorithm 3. Otherwise, all elements of \(\mathbf{L}\) are initialized with normal random numbers and \(\mathbf{s}^{(t)} \ (t \le k)\) is selected in line 4 of Algorithm 3.
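
A minimal sketch of the enabled option (variable names are ours; j = t − 1 is the zero-based task index):

import numpy as np

def init_with_task(L, j, theta_stl):
    """Seed column j of L from theta_STL and pick s^(t) as the unit vector."""
    k = L.shape[1]
    s = np.zeros(k)
    s[j] = 1.0            # unit vector in the direction of the t-th dimension
    L[:, j] = theta_stl   # line 2 of Algorithm 3: seed the t-th column
    return s, L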

B BIRCH

BIRCH [18] is a distance-based incremental clustering method. BIRCH consists of two main phases and two optional phases; the latter two are out of the scope of this paper. The main phases are the construction of the clustering feature (CF) tree and the generation of the clusters. The CF tree gathers similar examples into its leaves, each of which is called a subcluster, and clusters are then generated by clustering the subclusters.

The CF tree is an index structure similar to the B+ tree [5] whose nodes consist of CFs. For a set of examples \(\mathcal{E}=\{\mathbf{e}_1, \mathbf{e}_2, ..., \mathbf{e}_N\}\), the CF of \(\mathcal{E}\) is the triplet \((N, \mathbf{a}, b)=(N, \sum ^N_{i=1}\mathbf{e}_i, \sum ^N_{i=1}\Vert \mathbf{e}_i\Vert ^2)\). From CFs, important statistics [18] about clusters, such as the centroid \(\overline{\mathbf{x}}\) and the diameter \(D(\mathcal{E})\) of a set of examples \(\mathcal{E}\) and the average inter-cluster distance \(D2(\mathcal{E}^{(k)},\mathcal{E}^{(m)})\) of two sets of examples \(\mathcal{E}^{(k)}\) and \(\mathcal{E}^{(m)}\), can be computed exactly as follows.

$$\begin{aligned} \overline{\mathbf{x}}= & {} \frac{\sum ^N_{i=1}\mathbf{e}_i}{N} = \frac{\mathbf{a}}{N}\end{aligned}$$
(B.1)
$$\begin{aligned} D(\mathcal{E})= & {} \sqrt{\frac{\sum ^N_{i=1}\sum ^N_{j=1}\Vert \mathbf{e}_i-\mathbf{e}_j\Vert ^2}{N(N-1)}}\nonumber \\= & {} \sqrt{\frac{2Nb-2\Vert \mathbf{a}\Vert ^2}{N(N-1)}}\end{aligned}$$
(B.2)
$$\begin{aligned} D2(\mathcal{E}^{(k)},\mathcal{E}^{(m)})= & {} \sqrt{\frac{\sum ^{N^{(k)}}_{i=1}\sum ^{N^{(m)}}_{j=1}\left\Vert \mathbf{e}^{(k)}_i-\mathbf{e}^{(m)}_j\right\Vert ^2}{N^{(k)}N^{(m)}}}\nonumber \\= & {} \sqrt{\frac{N^{(m)}b^{(k)}+N^{(k)}b^{(m)}-2\mathbf{a}^{(k)}\cdot \mathbf{a}^{(m)}}{N^{(k)}N^{(m)}}} \end{aligned}$$
(B.3)

Note that CFs satisfy an additivity theorem, i.e., \((\mathrm{CF\ of\ } \mathcal{E}^{(k)})+(\mathrm{CF\ of\ } \mathcal{E}^{(m)})=(\mathrm{CF\ of\ } (\mathcal{E}^{(k)}\sqcup \mathcal{E}^{(m)}))\), where \(\mathcal{E}^{(k)}\sqcup \mathcal{E}^{(m)}\) is the merged set of the examples in \(\mathcal{E}^{(k)}\) and \(\mathcal{E}^{(m)}\). This theorem makes it possible to merge two sets of examples using only their CFs, without access to the original examples.
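
The triplet representation, the statistics (B.1)–(B.3), and the additivity theorem translate directly into code. Below is a minimal Python sketch; the class and method names are ours.

import numpy as np

class CF:
    """Clustering feature (N, a, b) of a set of examples [18]."""
    def __init__(self, example=None):
        if example is None:
            self.N, self.a, self.b = 0, 0.0, 0.0
        else:
            e = np.asarray(example, dtype=float)
            self.N, self.a, self.b = 1, e.copy(), float(e @ e)

    def merge(self, other):
        """Additivity theorem: CFs add component-wise."""
        out = CF()
        out.N = self.N + other.N
        out.a = self.a + other.a
        out.b = self.b + other.b
        return out

    def centroid(self):                      # Eq. (B.1)
        return self.a / self.N

    def diameter(self):                      # Eq. (B.2)
        if self.N < 2:
            return 0.0
        return np.sqrt((2 * self.N * self.b - 2 * self.a @ self.a)
                       / (self.N * (self.N - 1)))

def d2(cf1, cf2):                            # Eq. (B.3)
    num = cf2.N * cf1.b + cf1.N * cf2.b - 2 * cf1.a @ cf2.a
    return np.sqrt(num / (cf1.N * cf2.N))

For instance, CF(e1).merge(CF(e2)) yields the CF of the pair without retaining e1 or e2 themselves.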

To construct a CF tree, two parameters are used: the branching factor \(\beta \) of an internal node and the absorption threshold \(\tau \) on the diameter of a leaf. For an input example, BIRCH searches for the closest leaf of the CF tree and calculates the diameter of the CF formed by the leaf and the example. When the diameter is less than \(\tau \), the example is absorbed into the leaf; otherwise, it is inserted into the CF tree as a new leaf, in a similar way as in a B+ tree [5]. Each leaf of the CF tree thus gathers similar examples and is called a subcluster.

In the cluster generation phase, subclusters are clustered using a distance measure and a threshold \(\phi \): two subclusters are merged when the distance between them is less than \(\phi \). Since the number of subclusters is much smaller than the number of examples, BIRCH is efficient for large datasets and real-time applications. A flat sketch of the two main phases follows.
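
As noted above, the following is a deliberately flattened Python sketch of the two main phases, reusing the CF class and d2() from the previous sketch. A real CF tree would organize the leaf CFs hierarchically with branching factor \(\beta \), making the closest-leaf search logarithmic rather than linear.

import numpy as np

def birch_sketch(examples, tau, phi):
    """Simplified BIRCH: absorb-or-split insertion, then subcluster merging.
    Requires the CF class and d2() defined above; no CF tree is built."""
    leaves = []                               # subclusters as a flat list of CFs
    for e in examples:
        cf_e = CF(e)
        # find the closest leaf by centroid distance (linear scan here)
        closest = min(leaves,
                      key=lambda c: np.linalg.norm(c.centroid() - cf_e.centroid()),
                      default=None)
        if closest is not None and closest.merge(cf_e).diameter() < tau:
            leaves[leaves.index(closest)] = closest.merge(cf_e)   # absorb
        else:
            leaves.append(cf_e)                                   # new leaf
    # cluster generation phase: greedily merge subclusters with D2 < phi
    clusters = []
    for leaf in leaves:
        target = next((c for c in clusters if d2(c, leaf) < phi), None)
        if target is not None:
            clusters[clusters.index(target)] = target.merge(leaf)
        else:
            clusters.append(leaf)
    return clusters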


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Deguchi, Y., Suzuki, E. (2015). Hidden Fatigue Detection for a Desk Worker Using Clustering of Successive Tasks. In: De Ruyter, B., Kameas, A., Chatzimisios, P., Mavrommati, I. (eds) Ambient Intelligence. AmI 2015. Lecture Notes in Computer Science, vol 9425. Springer, Cham. https://doi.org/10.1007/978-3-319-26005-1_18

  • DOI: https://doi.org/10.1007/978-3-319-26005-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26004-4

  • Online ISBN: 978-3-319-26005-1
