Hidden Fatigue Detection for a Desk Worker Using Clustering of Successive Tasks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9425)

Abstract

To detect the fatigue of a desk worker, this paper focuses on fatigue hidden behind smiling and neutral faces and employs a periodic short-time monitoring setting. In contrast to continual monitoring, this setting assumes that each short-time monitoring session (called a task in this paper) is conducted only during a break. The setting raises two problems: the small amount of data in each task and the ever-increasing number of tasks. To detect fatigue, the authors propose a method that combines multi-task learning, clustering, and anomaly detection. For the first problem, the authors employ multi-task learning, which efficiently builds a classifier specific to each task by exploiting information shared among tasks. Since clustering gathers similar tasks into a cluster, it mitigates the second problem. Experiments show that the proposed method exhibits high performance in long-term monitoring.

Part of this research was supported by JSPS KAKENHI grants 25280085 and 15K12100.


Notes

  1. http://turtlebot.com/.

  2. The robot costs 649 euro and the docking station for the robot costs 45 euro (http://www.robotnikstore.com/robotnik/5121532/turtlebot-2.html, Aug. 24th, 2015).

  3. In this phase, the robot stays at a docking station and recharges its battery.

  4. In reality, we might need to modify our system so that the robot transitions to the monitoring phase when ordered to do so by the desk worker.

  5. http://www.microsoft.com/en-us/kinectforwindows/.

  6. We use Kinect for Windows SDK v2.0 1409 (http://www.microsoft.com/en-us/kinectforwindows/develop/downloads-docs.aspx) to build the animation units.

  7. We use the notebook PC Panasonic CF-SX3BDCBP (Core i7 4650U 2.29 GHz, RAM 16.0 GB).

References

  1. Brach, J.S., VanSwearingen, J.: Measuring fatigue related to facial muscle function. Arch. Phys. Med. Rehabil. 76(10), 905–908 (1995)

  2. Bradley, A.P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn. 30(7), 1145–1159 (1997)

  3. Braver, T.S., Cohen, J.D., Nystrom, L.E., Jonides, J., Smith, E.E., Noll, D.C.: A parametric study of prefrontal cortex involvement in human working memory. Neuroimage 5(1), 49–62 (1997)

  4. Chen, J., Tang, L., Liu, J., Ye, J.: A convex formulation for learning shared structures from multiple tasks. In: Proceedings of the 26th Annual International Conference on Machine Learning (ICML 2009), pp. 137–144 (2009)

  5. Comer, D.: The ubiquitous B-tree. ACM Comput. Surv. 11(2), 121–137 (1979)

  6. Deguchi, Y., Suzuki, E.: Skeleton clustering by autonomous mobile robots for subtle fall risk discovery. In: Andreasen, T., Christiansen, H., Cubero, J.-C., Raś, Z.W. (eds.) ISMIS 2014. LNCS, vol. 8502, pp. 500–505. Springer, Heidelberg (2014)

  7. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32(2), 407–499 (2004)

  8. Hua, C., Zhang, Y.: Driver fatigue detection based on active facial features locating. J. Simul. 2(6), 335 (2014)

  9. Jacob, L., Bach, F., Vert, J.-P.: Clustered multi-task learning: a convex formulation. Adv. Neural Inf. Process. Syst. 21, 745–752 (2009)

  10. Ji, Q., Zhu, Z., Lan, P.: Real-time nonintrusive monitoring and prediction of driver fatigue. IEEE Trans. Veh. Technol. 53(4), 1052–1068 (2004)

  11. Kapp, M.N., Sabourin, R., Maupin, P.: A dynamic model selection strategy for support vector machine classifiers. Appl. Soft Comput. 12(8), 2550–2565 (2012)

  12. Karnick, M.T., Muhlbaier, M.D., Polikar, R.: Incremental learning in non-stationary environments with concept drift using a multiple classifier based approach. In: 19th International Conference on Pattern Recognition (ICPR 2008), pp. 1–4 (2008)

  13. Kondo, R., Deguchi, Y., Suzuki, E.: Developing a face monitoring robot for a desk worker. In: Aarts, E., et al. (eds.) AmI 2014. LNCS, vol. 8850, pp. 226–241. Springer, Heidelberg (2014)

  14. Kumar, A., Daumé III, H.: Learning task grouping and overlap in multi-task learning. In: Proceedings of the 29th International Conference on Machine Learning (ICML 2012), pp. 1383–1390 (2012)

  15. Ruvolo, P., Eaton, E.: ELLA: an efficient lifelong learning algorithm. In: Proceedings of the 30th International Conference on Machine Learning (ICML 2013), pp. 507–515 (2013)

  16. Takayama, D., Deguchi, Y., Takano, S., Scuturici, V.-M., Petit, J.-M., Suzuki, E.: Multi-view onboard clustering of skeleton data for fall risk discovery. In: Aarts, E., et al. (eds.) AmI 2014. LNCS, vol. 8850, pp. 258–273. Springer, Heidelberg (2014)

  17. Tanaka, M., Mizuno, K., Yamaguti, K., Kuratsune, H., Fujii, A., Baba, H., Matsuda, K., Nishimae, A., Takesaka, T., Watanabe, Y.: Autonomic nervous alterations associated with daily level of fatigue. Behav. Brain Funct. 7, 46 (2011)

  18. Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: a new data clustering algorithm and its applications. Data Min. Knowl. Discov. 1(2), 141–182 (1997)

  19. Zhou, J., Chen, J., Ye, J.: Clustered multi-task learning via alternating structure optimization. Adv. Neural Inf. Process. Syst. 24, 702–710 (2011)


Author information

Correspondence to Yutaka Deguchi.

Appendix

A ELLA

ELLA [15] is a multi-task learning method that takes a parametric approach to the lifelong learning setting. Learning tasks \(Z^{(1)}, Z^{(2)}, ..., Z^{(T_\mathrm{max})}\) are observed sequentially, where the total number of tasks \(T_\mathrm{max}\) is not known a priori. Each task \(Z^{(t)}\) is represented as \((\hat{f}^{(t)}, \mathbf{X}^{(t)}, \mathbf{y}^{(t)})\), where \(\mathbf{X}^{(t)}\) is a set of examples, \(\mathbf{y}^{(t)}\) is the corresponding set of labels, and \(\hat{f}^{(t)}\) is a hidden mapping from \(\mathbf{X}^{(t)}\) to \(\mathbf{y}^{(t)}\). The goal is to construct classifiers \(f^{(1)}, f^{(2)}, ..., f^{(T_\mathrm{max})}\) that approximate the \(\hat{f}^{(t)}\)'s. The prediction function \(f^{(t)}(\mathbf{x}) = f(\mathbf{x};{\varvec{\theta }}^{(t)})\) is specific to the task \(Z^{(t)}\).

The model of ELLA is based on the GO-MTL model [14]. The parameter vector \({\varvec{\theta }}^{(t)}\) is represented as the product of a matrix \(\mathbf{L}\), whose k columns are latent model components, and a task-specific weight vector \(\mathbf{s}^{(t)}\). Minimizing the predictive loss over all tasks then amounts to minimizing the following objective function:

$$\begin{aligned} e_T(\mathbf{L})=\frac{1}{T}\sum ^{T}_{t=1}\min _{\mathbf{s}^{(t)}}\left\{ \frac{1}{n_t}\sum ^{n_t}_{i=1}\mathcal{L}\left( f\left( \mathbf{x}^{(t)}_i; \mathbf{L}\mathbf{s}^{(t)}\right) , y^{(t)}_i\right) +\mu \left\| \mathbf{s}^{(t)}\right\| _1\right\} +\lambda \Vert \mathbf{L}\Vert ^2_\mathrm{F},\end{aligned}$$
(A.1)

where \(n_t\) is the number of examples in the task \(Z^{(t)}\) and \(\mathcal{L}\) is a loss function.
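
To make the objective concrete, here is a minimal Python sketch that evaluates \(e_T(\mathbf{L})\) for linear models with squared loss. The function names are ours, and using scikit-learn's LassoLars for the inner minimization over \(\mathbf{s}^{(t)}\) is an illustrative choice, not part of [15].

import numpy as np
from sklearn.linear_model import LassoLars

def ella_objective(L, tasks, mu, lam):
    """Evaluate e_T(L) of Eq. (A.1) for linear models with squared loss.

    `tasks` is a list of (X, y) pairs; names are illustrative, not from [15].
    """
    T = len(tasks)
    total = 0.0
    for X, y in tasks:
        # Inner minimization over s^(t): with f(x; L s) = x . (L s), the
        # problem is a lasso on the transformed features X L.
        # LassoLars minimizes (1/(2 n)) ||y - Z w||^2 + alpha ||w||_1,
        # so alpha = mu / 2 matches a mean-squared-error loss.
        lasso = LassoLars(alpha=mu / 2.0, fit_intercept=False)
        lasso.fit(X @ L, y)
        s = lasso.coef_
        residual = y - X @ (L @ s)
        total += np.mean(residual ** 2) + mu * np.abs(s).sum()
    return total / T + lam * np.linalg.norm(L, "fro") ** 2

# toy usage: d = 5 features, k = 2 latent components, 3 tasks
rng = np.random.default_rng(0)
L0 = rng.normal(size=(5, 2))
tasks = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
print(ella_objective(L0, tasks, mu=0.1, lam=0.01))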

[Algorithm 3: ELLA (pseudocode figure)]

ELLA makes two approximations to optimize the objective function \(e_T(\mathbf{L})\) efficiently and to proceed incrementally with respect to tasks. The first approximation replaces \(\frac{1}{n_t}\sum ^{n_t}_{i=1}\mathcal{L}(f(\mathbf{x}^{(t)}_i; \mathbf{L}\mathbf{s}^{(t)}), y^{(t)}_i)\) with its second-order Taylor expansion around \({\varvec{\theta }} = {\varvec{\theta }}^{(t)}_\mathrm{STL}\), where \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) is an optimal classifier learned on the training data of task \(Z^{(t)}\) alone. With this approximation, the objective function no longer depends on all of the previous training data through the inner summation.
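
Concretely, since \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) minimizes the single-task loss, the first-order term of the Taylor expansion vanishes, and up to an additive constant the loss reduces to a quadratic form in \({\varvec{\theta }}\) (a reconstruction from the description above, with \(\mathbf{D}^{(t)}\) the Hessian introduced below):

$$\begin{aligned} \frac{1}{n_t}\sum ^{n_t}_{i=1}\mathcal{L}\left( f\left( \mathbf{x}^{(t)}_i; {\varvec{\theta }}\right) , y^{(t)}_i\right) \approx \mathrm{const} + \frac{1}{2}\left( {\varvec{\theta }}-{\varvec{\theta }}^{(t)}_\mathrm{STL}\right) ^\top \mathbf{D}^{(t)}\left( {\varvec{\theta }}-{\varvec{\theta }}^{(t)}_\mathrm{STL}\right) ,\end{aligned}$$

so that substituting \({\varvec{\theta }} = \mathbf{L}\mathbf{s}^{(t)}\) leaves \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) and \(\mathbf{D}^{(t)}\) as the only per-task quantities that need to be stored.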

The second approximation modifies the formulation to remove the minimization over \(\mathbf{s}^{(t)}\): previous tasks benefit from new tasks through the updated \(\mathbf{L}\) instead of through updated \(\mathbf{s}^{(t)}\)'s. Updating \(\mathbf{s}^{(t)}\) only when the task \(Z^{(t)}\) is observed has little practical effect on the quality of the model fit as the number of tasks grows large.

An overview of ELLA is shown in Algorithm 3, where T is the number of observed tasks. First, the parameter \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) and the Hessian \(\mathbf{D}^{(t)}\) of the loss function \(\mathcal{L}\) are computed from the examples \(\mathbf{X}^{(t)}\) and their labels \(\mathbf{y}^{(t)}\). Before \(\mathbf{s}^{(t)}\) is computed, zero columns of \(\mathbf{L}\) are reinitialized with normal random numbers. In line 4, \(\mathbf{s}^{(t)}\) is computed by lasso model fitting with least angle regression (LARS) [7]. The matrix \(\mathbf{A}_t\) and the vector \(\mathbf{b}_t\) are then used to update \(\mathbf{L}\). The algorithm thus proceeds incrementally with respect to tasks.
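
As an illustration of these steps, here is a Python sketch of a single-task update in the spirit of Algorithm 3; it is not the reference implementation of [15]. Constant factors are folded into \(\mu \) and \(\lambda \), and \(\mathbf{D}^{(t)}\) is assumed positive definite so that a Cholesky factor turns line 4 into an ordinary lasso.

import numpy as np
from sklearn.linear_model import LassoLars

def ella_update(L, A, b, T, theta_stl, D, mu, lam):
    """One task update in the spirit of Algorithm 3 (a sketch, not the
    reference implementation of [15]); L is d x k, A is (k d) x (k d)."""
    d, k = L.shape
    # Reinitialize zero columns of L with normal random numbers.
    for j in range(k):
        if not L[:, j].any():
            L[:, j] = np.random.randn(d)
    # Line 4: s^(t) = argmin_s ||theta_STL - L s||^2_D + mu ||s||_1,
    # rewritten as an ordinary lasso via the Cholesky factor of D.
    C = np.linalg.cholesky(D)                 # D = C C^T
    lasso = LassoLars(alpha=mu / (2.0 * d), fit_intercept=False)
    lasso.fit(C.T @ L, C.T @ theta_stl)
    s = lasso.coef_
    # Accumulate the sufficient statistics A_t and b_t used to update L
    # (column-major vec convention).
    A = A + np.kron(np.outer(s, s), D)
    b = b + np.kron(s, D @ theta_stl)
    T = T + 1
    # Closed-form update of L from the accumulated statistics.
    vec_L = np.linalg.solve(A / T + lam * np.eye(k * d), b / T)
    L = vec_L.reshape((d, k), order="F")
    return L, A, b, T, s

Because only \(\mathbf{A}_t\), \(\mathbf{b}_t\), and T are carried between tasks, the cost of an update is independent of the amount of past training data, which is exactly the point of the two approximations.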

The selection of \(\mathbf{s}^{(t)}\) depends on \(\mathbf{L}\). Before the first k tasks have been learned, some columns of \(\mathbf{L}\) are not yet meaningful because they still hold their initial values. The option "initializeWithFirstKTasks" determines how \(\mathbf{s}^{(t)}\) and \(\mathbf{L}\) are chosen for the first k tasks. When the option is enabled, \(\mathbf{s}^{(t)} \ (t \le k)\) is the unit vector in the direction of the t-th dimension, and the t-th column of \(\mathbf{L}\) is initialized from \({\varvec{\theta }}^{(t)}_\mathrm{STL}\) in line 2 of Algorithm 3. Otherwise, all elements of \(\mathbf{L}\) are initialized with normal random numbers and \(\mathbf{s}^{(t)} \ (t \le k)\) is selected in line 4 of Algorithm 3.
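
A minimal sketch of the enabled option (variable names are ours; j = t − 1 is the zero-based task index):

import numpy as np

def init_with_task(L, j, theta_stl):
    """Seed column j of L from theta_STL and pick s^(t) as the unit vector."""
    k = L.shape[1]
    s = np.zeros(k)
    s[j] = 1.0            # unit vector in the direction of the t-th dimension
    L[:, j] = theta_stl   # line 2 of Algorithm 3: seed the t-th column
    return s, L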

B BIRCH

BIRCH [18] is a distance-based incremental clustering method. BIRCH consists of two main phases and two optional phases; the latter two are out of the scope of this paper. The main phases are the construction of the clustering feature (CF) tree and the generation of the clusters. The CF tree gathers similar examples into its leaves, each of which is called a subcluster, and clusters are then generated by clustering the subclusters.

The CF tree is an index structure similar to the B+ tree [5] whose nodes consist of CFs. For a set of examples \(\mathcal{E}=\{\mathbf{e}_1, \mathbf{e}_2, ..., \mathbf{e}_N\}\), the CF of \(\mathcal{E}\) is the triplet \((N, \mathbf{a}, b)=(N, \sum ^N_{i=1}\mathbf{e}_i, \sum ^N_{i=1}\Vert \mathbf{e}_i\Vert ^2)\). From CFs, important statistics [18] about clusters, such as the centroid \(\overline{\mathbf{x}}\) and the diameter \(D(\mathcal{E})\) of a set of examples \(\mathcal{E}\) and the average inter-cluster distance \(D2(\mathcal{E}^{(k)},\mathcal{E}^{(m)})\) of two sets of examples \(\mathcal{E}^{(k)}\) and \(\mathcal{E}^{(m)}\), can be computed exactly as follows.

$$\begin{aligned} \overline{\mathbf{x}}= & {} \frac{\sum ^N_{i=1}\mathbf{e}_i}{N} = \frac{\mathbf{a}}{N}\end{aligned}$$
(B.1)
$$\begin{aligned} D(\mathcal{E})= & {} \sqrt{\frac{\sum ^N_{i=1}\sum ^N_{j=1}\Vert \mathbf{e}_i-\mathbf{e}_j\Vert ^2}{N(N-1)}}\nonumber \\= & {} \sqrt{\frac{2Nb-2\Vert \mathbf{a}\Vert ^2}{N(N-1)}}\end{aligned}$$
(B.2)
$$\begin{aligned} D2(\mathcal{E}^{(k)},\mathcal{E}^{(m)})= & {} \sqrt{\frac{\sum ^{N^{(k)}}_{i=1}\sum ^{N^{(m)}}_{j=1}\left\Vert \mathbf{e}^{(k)}_i-\mathbf{e}^{(m)}_j\right\Vert ^2}{N^{(k)}N^{(m)}}}\nonumber \\= & {} \sqrt{\frac{N^{(m)}b^{(k)}+N^{(k)}b^{(m)}-2\mathbf{a}^{(k)}\cdot \mathbf{a}^{(m)}}{N^{(k)}N^{(m)}}} \end{aligned}$$
(B.3)

Note that CFs satisfy an additivity theorem, i.e., \((\mathrm{CF\ of\ } \mathcal{E}^{(k)})+(\mathrm{CF\ of\ } \mathcal{E}^{(m)})=(\mathrm{CF\ of\ } (\mathcal{E}^{(k)}\sqcup \mathcal{E}^{(m)}))\), where \(\mathcal{E}^{(k)}\sqcup \mathcal{E}^{(m)}\) is the merged set of the examples in \(\mathcal{E}^{(k)}\) and \(\mathcal{E}^{(m)}\). This theorem makes it possible to merge two sets of examples using only their CFs, without access to the original examples.
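
The triplet representation, the statistics (B.1)–(B.3), and the additivity theorem translate directly into code. Below is a minimal Python sketch; the class and method names are ours.

import numpy as np

class CF:
    """Clustering feature (N, a, b) of a set of examples [18]."""
    def __init__(self, example=None):
        if example is None:
            self.N, self.a, self.b = 0, 0.0, 0.0
        else:
            e = np.asarray(example, dtype=float)
            self.N, self.a, self.b = 1, e.copy(), float(e @ e)

    def merge(self, other):
        """Additivity theorem: CFs add component-wise."""
        out = CF()
        out.N = self.N + other.N
        out.a = self.a + other.a
        out.b = self.b + other.b
        return out

    def centroid(self):                      # Eq. (B.1)
        return self.a / self.N

    def diameter(self):                      # Eq. (B.2)
        if self.N < 2:
            return 0.0
        return np.sqrt((2 * self.N * self.b - 2 * self.a @ self.a)
                       / (self.N * (self.N - 1)))

def d2(cf1, cf2):                            # Eq. (B.3)
    num = cf2.N * cf1.b + cf1.N * cf2.b - 2 * cf1.a @ cf2.a
    return np.sqrt(num / (cf1.N * cf2.N))

For instance, CF(e1).merge(CF(e2)) yields the CF of the pair without retaining e1 or e2 themselves.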

To construct a CF tree, two parameters are used: the branching factor \(\beta \) of an internal node and the absorption threshold \(\tau \) on the diameter of a leaf. For an input example, BIRCH searches for the closest leaf of the CF tree and calculates the diameter of the CF formed by the leaf and the example. When the diameter is less than \(\tau \), the example is absorbed into the leaf; otherwise, it is inserted into the CF tree as a new leaf, in a similar way as in a B+ tree [5]. Each leaf of the CF tree thus gathers similar examples and is called a subcluster.

In the cluster generation phase, subclusters are clustered using a distance measure and a threshold \(\phi \): two subclusters are merged when the distance between them is less than \(\phi \). Since the number of subclusters is much smaller than the number of examples, BIRCH is efficient for large datasets and real-time applications. A flat sketch of the two main phases follows.
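
As noted above, the following is a deliberately flattened Python sketch of the two main phases, reusing the CF class and d2() from the previous sketch. A real CF tree would organize the leaf CFs hierarchically with branching factor \(\beta \), making the closest-leaf search logarithmic rather than linear.

import numpy as np

def birch_sketch(examples, tau, phi):
    """Simplified BIRCH: absorb-or-split insertion, then subcluster merging.
    Requires the CF class and d2() defined above; no CF tree is built."""
    leaves = []                               # subclusters as a flat list of CFs
    for e in examples:
        cf_e = CF(e)
        # find the closest leaf by centroid distance (linear scan here)
        closest = min(leaves,
                      key=lambda c: np.linalg.norm(c.centroid() - cf_e.centroid()),
                      default=None)
        if closest is not None and closest.merge(cf_e).diameter() < tau:
            leaves[leaves.index(closest)] = closest.merge(cf_e)   # absorb
        else:
            leaves.append(cf_e)                                   # new leaf
    # cluster generation phase: greedily merge subclusters with D2 < phi
    clusters = []
    for leaf in leaves:
        target = next((c for c in clusters if d2(c, leaf) < phi), None)
        if target is not None:
            clusters[clusters.index(target)] = target.merge(leaf)
        else:
            clusters.append(leaf)
    return clusters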


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Deguchi, Y., Suzuki, E. (2015). Hidden Fatigue Detection for a Desk Worker Using Clustering of Successive Tasks. In: De Ruyter, B., Kameas, A., Chatzimisios, P., Mavrommati, I. (eds) Ambient Intelligence. AmI 2015. Lecture Notes in Computer Science, vol 9425. Springer, Cham. https://doi.org/10.1007/978-3-319-26005-1_18

  • DOI: https://doi.org/10.1007/978-3-319-26005-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26004-4

  • Online ISBN: 978-3-319-26005-1
