Abstract
Dialogue control for health-oriented smartwatch apps is a multi-dimensional task. In our application scenario, the intended purpose of the smartwatch app is the prevention and detection of health hazards jeopardizing the smartwatch wearer (e.g. exsiccosis caused by insufficient drinking); the designated target group of the app are elderly people. The first dimension, the potential simultaneity of health hazards, and the second dimension, the ethical considerations of how to keep the wearer in control of the app at all times, have been presented before. In this paper we focus on the third dimension, the mandatory acceptance conditions of the app. The intended assistance functionality of the app can only be realized if the interventions of the app occur solely in daily life situations in which the wearer will accept such interventions. We present a machine learning approach by which the app will learn from the wearer over time when such interventions are appropriate and accepted - and when the app will be expected to remain silent. Of course, this decision also has to take into account the urgency of the intervention with respect to the severity of the threatening health hazard.
Keywords
- Acceptance patterns for smartwatches
- Machine learning
- Assistance for the elderly
- Health hazard handling
- Ambient assisted living
1 Introduction
For supporting a safe, healthy and self-determined life of elderly people in their familiar home setting, the use of assistive, non-stigmatizing information technology is widely accepted. Smartwatches are suitable devices because (i) they are typically worn from dawn to dusk, inside the home and on the road, (ii) high-end models (e.g. Apple Watch™ 3 and later, Samsung Galaxy Watch Active2™) include LTE communication modules for autonomous alerting, and (iii) they can be programmed via apps.
Our smartwatch assistance app [1,2,3] monitors about a dozen health hazards simultaneously, typically in the background, whenever the smartwatch is worn. The monitoring is based on the activity patterns of the smartwatch wearer, observed vital parameters (pulse) and the wearer's location. Whenever the smartwatch app concludes that a health hazard is present, it initiates a dialogue with the user in order to prevent an emergency call when no factual health hazard has occurred (false alert), or to motivate the wearer to start short-term counter measures (in the described example: to drink something), or to call in external human help by placing a phone call via the mobile radio integrated in the smartwatch (e.g. a call to a home emergency call center if no liquids are reachable for the user). We denote this dialogue as the health hazard handling dialogue, a constituent of a more comprehensive health hazard handling process [1]. For implementing a structured, complete and standardized execution of such health hazard handling dialogues, the concept of a critical dialogue section (CDS) has been proposed in [4].
The first problem dimension to be handled by the dialogue control is the potential simultaneity of health hazards. Typically, only one health hazard can be handled by a user dialogue at a time. This challenge is solved, first, via a prioritization of health hazards: simultaneous health hazards will be discussed with the user in order of decreasing severity. Second, via the CDS concept, an ongoing, model-based dialogue for health hazard handling will never be disrupted by another handling dialogue for a higher-prioritized hazard detected in the meantime [4]. This stringent approach is feasible because an individual handling dialogue takes only a very few minutes.
The second problem dimension of dialogue control is governed by ethical considerations. The smartwatch wearer shall always be left in control of whether he or she wants to enter a dialogue with the smartwatch app and which of his or her personal information, especially vital data and/or movement profiles, will be disclosed to third parties. Whenever the smartwatch assistance app - within the course of a health hazard handling dialogue - calls in external human help, such help will only be effective if the current personal information of the smartwatch user is made available to an external human specialist. It will therefore be transferred automatically by the app just before the external call is established. This means that within the course of the user dialogue, an explicit consent from the user is mandatory for disclosing such personal information and calling in external human help [5]. The user consent is collected by executing a prealert on the smartwatch. The process flow for doing so has been modelled by a declarative description of the complete health hazard handling process via UML state machines [1, 2, 4]. The only exception to this indispensable user permit is a situation in which the smartwatch app assumes a potentially life-threatening health hazard together with an unconscious user.
The third problem dimension of dialogue control is determined by the acceptance conditions of such apps by their users. Our experience from the implementation of smartwatch assistance apps based on the principles described above has shown that dialogue offers from the app will be accepted by the smartwatch wearer in some situations but will be regularly rejected in others. These aspects must be considered by a careful, user-centered design. Otherwise the app will lose acceptance and will not be used, due to perceived “false alerts”. We propose to solve this challenge by including a new machine learning component, which will observe and learn the situational parameters under which proactive dialogues from the smartwatch will be accepted by its wearer.
2 Requirements Specification
A key problem is how the system can learn when an app-initiated health hazard handling dialogue will not be situationally appropriate from the user’s perspective (e.g. when driving a car at high speed or while talking to other persons) - especially in situations in which the smartwatch wearer fundamentally has no interest in disclosing his or her specific reasons to the app. For acquiring this information, we have decided to use machine learning, specifically reinforcement learning. This learning will result in an individual adaptation of the app behavior regarding when to proactively enter dialogues with the smartwatch wearer.
In implementing this new approach, first of all a new unobtrusive (“shut up”) gesture had to be added to the smartwatch’s repertoire of recognized gesturesFootnote 1: slapping the opposite hand on the wrist where the smartwatch is worn, in order to stop a situationally unwanted, app-initiated dialogue. Alternatively, a spoken command could be implemented (“not now”). Clearly, the execution of this shut-up gesture or command indicates a negative reward to the learning algorithm. The immediate interruption of an initiated CDSFootnote 2 is the maximum negative reward; the complete execution of such a section (with cooperative participation of the smartwatch wearer) is the maximum positive reward to the algorithm.
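The reward signal described above can be sketched as follows. The numeric reward scale and the function name are illustrative assumptions for this sketch; the paper only fixes the two extremes (interruption and complete execution).

```python
# Hypothetical reward scale for the reinforcement signal described above;
# the concrete numeric values are an assumption, not taken from the paper.
MAX_NEGATIVE_REWARD = -1.0  # "shut up" gesture interrupts the CDS immediately
MAX_POSITIVE_REWARD = +1.0  # CDS executed completely with a cooperative wearer

def dialogue_reward(interrupted: bool, completed: bool) -> float:
    """Map the outcome of a critical dialogue section (CDS) to a reward."""
    if interrupted:
        return MAX_NEGATIVE_REWARD
    if completed:
        return MAX_POSITIVE_REWARD
    return 0.0  # e.g. dialogue ended without explicit gesture or completion
```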
Based on this dialogue evaluation, the proposed learning algorithm initially has to learn from examples when a health hazard handling dialogue has been allowed by the wearer in the past, and when not. To do so, potential situational parameters within the smartwatch’s sensorial horizon must be collected which were present at the moment of allowance, interruption or denial of the dialogue. These situational parameters include the current geographic location of the smartwatch wearer, his/her movement speed, the time and day of the week, …, see Sect. 3 below for the details. The sum of these parameter values, the situational setting, is supposed to determine the acceptance of the dialogue in a specific situation.
However, the events of allowing, interrupting or denying a health hazard handling dialogue will typically occur only infrequently in the everyday life of the smartwatch user. A second essential requirement for the learning algorithm therefore is to generalize the experienced event and to transfer and apply it to comparable situations in which the same behavior of the algorithm will probably be expected from the user’s point of view. This generalization will be achieved by mapping the n parameter values of a situational setting onto a data point within an n-dimensional data space, the »hypercube« (Fig. 1). This hypercube will be populated not only with data points for each experienced health hazard dialogue event. Additionally, whenever an EDL/ADL has been recognized by the assistance app, the situational setting of the EDL/ADL will be written as a data point to the hypercube. Based on experience from our field test with the smartwatch app, the hypercube will in this way be populated with about a dozen new data points for recognized ADLs per day and smartwatch wearer. Now, when a data point for an experienced health hazard dialogue event is added to the hypercube, we consider the distribution of data points within the hypercube. If the data point for the experienced event is contained within a cluster - based on the relative proximity of data points within the hypercube - the experienced behavior (decline or acceptance of the dialogue) will be extended to all elements within this cluster, if - and only if - this can be done in a non-contradictory way. The last restriction is of relevance because otherwise the success of the learning process cannot be assured by technical means.
The constructed, increasingly populated hypercube constitutes a very personal profile of what the wearer is typically doing, when and where. This is extremely sensitive personal information and will deliberately be stored exclusively in the smartwatch assistance app. The profile must not be calculated outside the smartwatch app and cannot be exported and/or transferred to other devices, in order to prevent any potential misuse (»privacy by design«, [6], Principle 3: “Privacy Embedded into Design”). This is a very basic requirement and boundary condition for the solution described below.
3 System Design
3.1 Suitability and Scope of Machine Learning
Machine learning (ML) is the most successful AI approach currently in use. Machine vision is one example where convolutional neural networks (CNNs) and deep learning [7], as instances of ML, are extremely successful. Three types of ML approaches are used nowadays: supervised learning (SL), where the data set is trained against labels; unsupervised learning (UL), which identifies patterns in the data set without using labels; and reinforcement learning (RL), where the learning algorithm receives feedback on its actions and learns from that feedback. A major limitation of these approaches is that they depend on relatively big training data sets in order to provide correct results. RL may be somewhat different, as the reinforcement can be done by algorithms (e.g. using game rules, [8]). Nevertheless, in our application domain, health hazard handling dialogues - the examples to learn from - occur relatively rarely.
Therefore, our approach is based on learning from examples ([9], chapter 18) and a maximal utilization of those examples for similar situational settings.
In order to determine the potential scope of machine learning, we have to consider the relation between the factual presence of a health hazard in reality and the identification resp. classification of the same situation as hazardous to the user’s health by the assistance app. The four possible combinations of values, and the standardized denominations for such binary value combinations, are depicted in Table 1.
In principle, for TPs and TNs the app behavior is fine; nothing needs to be improved. This statement is valid as long as the user always allows TP health hazard handling dialogues. But what if the user also trains the app to keep silent in TP situations, because he wants to keep his peace of mind, ignoring the threatening or already manifest health hazard? Following the second problem dimension described above, the postulated dominance-of-control principle for the user would result in a reticence of the assistance app against its better knowledge. This may be questionable with respect to health implications but is mandatory under ethical principles favoring the primacy of the user’s self-determination.
FPs are covered by the proposed learning algorithm, which targets learning a user-accepted communication behavior. For a rational user, who would always correctly decline FP health hazard handling dialogues, the learning algorithm would result in a flawless communication behavior of the assistance app.
For FNs, the situation is complicated for two reasons. First of all, the proposed simple learning algorithm is incapable of learning or improving the necessary healthcare knowledge, even for hazards already known to resp. managed by the assistance app. For improving this implemented behavior of the assistance app, either new SL training samples would be required to improve the artificial neural network executing the EDL/ADL recognition process within the app [2, 3], or the declarative knowledge representation embodying the healthcare handling process itselfFootnote 3, based on such recognized EDLs/ADLs, would have to be improved [1, 4]. Especially a full automatization of the structural knowledge acquisition for the latter case is currently out of scope [10, 11]. Furthermore, the assistance app is trained to conclude a fixed number of health hazards based on sensorial values, the EDLs/ADLs recognized from them, and their sequencing and combination in time. Thus, the app will not detect health hazards beyond the sensorial horizon of its sensors and/or the applicable conclusion principles. But these hazards would also be subsumed as FNs. As an example, the app is not trained for detecting injuries from car accidents, which obviously is a very relevant category of health hazards.
3.2 Modelling Situational Settings
The following parameters for characterizing a situational setting of (i) a recognized EDL/ADL, or (ii) the execution of a health hazard handling dialogue - via execution of the corresponding CDS - will be considered by the smartwatch assistance app:
- The geographic location of the smartwatch wearer: (lat, long) acquired from the GPS sensor outdoors. When the wearer is at home, a room-based indoor localization can be achieved by utilizing the Wi-Fi signature of the specific room.
- The specific time of the day (digitized in a 15 min grid).
- The specific day of the week, additionally categorized either as (1) a regular workday (Monday to Friday), (2) a Saturday or private holiday, or (3) a Sunday or public holiday.
- The speed by which the wearer is moving, digitized into four discrete speed intervals: steady, walking, running, driving.
The corresponding parameter values will be aggregated into a 4-dimensional data point. If this data point does not already exist in the hypercube, it will be added when (i) an EDL/ADL has been recognized or (ii) a CDS has been executed. Let x denote this considered data point. Data point x will be associated with a set of values, denoted valsx. Initially, these values v ∈ valsx describe the events which caused the creation of the data point x. Later on, the values will be amended by the experiences learned for the situational setting represented by x. The potential elements v in the value set valsx of data point x can be:
1. an atomic denominator ea for a recognized ADL/EDL.
2. a triple (c, r, ea) describing the category c of an executed CDS (Table 2), the execution result r (Table 3) and, if applicable, the perceived occasion of the CDS execution, given by the recognition of an EDL/ADL with denominator ea in close temporal proximity to the CDS execution. ea may be empty (special denominator nil) if no such EDL/ADL has been recognized in close temporal proximity to the execution of the CDS.
Table 2. Categories of CDS indicating the severity with respect to the involved health hazards
Table 3. Possible result values for a CDS execution
For example, a recognized EDL »tumble« typically causes the immediate execution of a CDS, so it is useful to associate the CDS execution directly with the occasion of this recognized EDL. A similarly close temporal proximity typically exists between the ADL »runaway« and the CDS execution for handling the foreseeable health hazard resulting from the runaway situation. On the other hand, an “insufficient drinking” health hazard handling dialogue takes place significantly later than the last recognized »drinking« ADL and independently of other EDLs/ADLs. Therefore, it does not make sense to associate the data point for the corresponding CDS execution with any EDL/ADL. Only if an EDL/ADL were incidentally recognized in close temporal proximity to the “insufficient drinking” dialogue execution would it be added as the incidental occasion of the CDS execution. In the latter case, this really makes sense, because the corresponding EDL/ADL could in fact have influenced the specific user reaction to the “insufficient drinking” dialogue execution.
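As a concrete illustration, a data point and its value set can be represented as follows. All identifiers, the coordinate values, and the category/result labels ("C1", "C2", "r1", "r3") are assumptions of this sketch, since Tables 2 and 3 are not reproduced in the text.

```python
from dataclasses import dataclass

# Sketch of a 4-dimensional situational-setting data point, following the
# parameters listed above. Field names and encodings are illustrative.
@dataclass(frozen=True)
class DataPoint:
    location: tuple   # (lat, long) outdoors, or a room label indoors
    time_slot: int    # time of day digitized into 15-minute slots (0..95)
    day_class: int    # 1 = workday, 2 = Saturday/private holiday, 3 = Sunday/public holiday
    speed_class: int  # 0 = steady, 1 = walking, 2 = running, 3 = driving

# The hypercube maps each data point x to its value set vals_x. A value is
# either an atomic ADL/EDL denominator or a triple (c, r, ea).
hypercube: dict = {}

x = DataPoint(location=(49.45, 11.07), time_slot=34, day_class=1, speed_class=0)
hypercube[x] = {
    "drinking",              # 1. atomic denominator of a recognized ADL
    ("C2", "r1", "tumble"),  # 2. CDS of category C2, result r1, occasion »tumble«
    ("C1", "r3", None),      # ea = nil: no EDL/ADL in close temporal proximity
}
```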
The designated purpose of associating values with data points within the hypercube is to derive a recommendation for the (dialogue) action behavior of the assistance app. It is therefore essential to construct unambiguous recommended actions. Therefore, if the value set valsx of data point x already contains a value vo = (c, r, ea) and a new value vn = (c, s, ea) shall be added to the value set with r = r1 ∧ s = r3 or vice versa, i.e. a contradictory recommendation for action, the new value vn will replace the existing value vo in the value set valsx. New experience replaces old experience. By incorporating EDLs/ADLs in the value triple whenever possible, we also reduce the reach of contradictory recommendations for (dialogue) action in the future app behavior.
The inclusion of only the category of the executed CDS - instead of the specific health hazard handled by this CDS - in a value element of a data point x helps to extend the validity of the specific example represented by this data point x. The learned example will be assumed to be applicable to all other data points y in the same cluster as x which execute a health hazard dialogue of the same category and on the occasion of the same EDL/ADL as the example. If a CDS handles more than one health hazard at a time, the most severe health hazard category with respect to Table 2 determines the categorization of the CDS.
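The "new experience replaces old experience" rule can be sketched as follows, assuming the triple representation from above and the contradictory result pair r1/r3 (the meaning of the result labels follows Table 3, which is not reproduced in the text):

```python
# Sketch of the replacement rule: a new triple (c, s, ea) replaces an existing
# (c, r, ea) in vals_x when r and s are the contradictory results r1/r3.
CONTRADICTORY = {("r1", "r3"), ("r3", "r1")}

def add_value(vals_x: set, new: tuple) -> set:
    """Add a CDS value triple to vals_x; new experience replaces old experience."""
    c_new, s, ea_new = new
    for old in list(vals_x):
        if not isinstance(old, tuple):
            continue  # atomic ADL/EDL denominators are never replaced
        c_old, r, ea_old = old
        if c_old == c_new and ea_old == ea_new and (r, s) in CONTRADICTORY:
            vals_x.discard(old)  # drop the contradicted recommendation
    vals_x.add(new)
    return vals_x

vals = {("C2", "r1", "tumble")}
add_value(vals, ("C2", "r3", "tumble"))  # replaces the contradictory r1 triple
```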
3.3 Extending the Reach of Learned Examples
Shortly after a new data point x resulting from a CDS execution - with a specific value v = (cv, rv, eav) ∈ valsx due to the execution of this CDS - has been added to the hypercube, the assignment of x to the already computed clusters of other data points will be done. The agglomerative clustering of data points within the hypercube (cf. [11], chapter 6.8) will preferably be done at night, when the smartwatch is typically not worn and is being recharged.
We use an easily computable Manhattan metric [12] for data points, defining the distance between two data points as the sum of the absolute differences of the data points within each dimension of the parameter space: for the location we use a logarithmic Euclidean distance; for the time, the difference between the (relative) times of the day; for the day of the week, the difference of suitable ordinal numbers; and likewise for the speed intervals by which the smartwatch wearer is moving.
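The distance computation can be sketched as follows. The planar treatment of (lat, long), the unweighted summation of the dimensions, and the identifier names are assumptions of this sketch:

```python
import math
from collections import namedtuple

# Illustrative 4-dimensional data point (names are assumptions).
Point = namedtuple("Point", "location time_slot day_class speed_class")

def distance(x: Point, y: Point) -> float:
    """Manhattan-style distance: sum of per-dimension differences."""
    d_loc = math.log1p(math.dist(x.location, y.location))  # logarithmic Euclidean
    d_time = abs(x.time_slot - y.time_slot)                # 15-minute slots
    d_day = abs(x.day_class - y.day_class)                 # ordinal day categories
    d_speed = abs(x.speed_class - y.speed_class)           # ordinal speed intervals
    return d_loc + d_time + d_day + d_speed
```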
As soon as data point x has been assigned to a cluster, the learned dialogue behavior for the situational setting represented by x and encoded in the value v of x can be transferred and extended to all elements of the cluster. We thereby assume that the dialogue behavior of x will also be appropriate for “similar” situational settings. Such similar situational settings are given by all elements belonging to the same cluster as x. Let y denote such a data point within the same cluster as x and let w = (cw, rw, eaw) denote an arbitrary triple value element within the value set valsy of y.
Then the value transfer and extension process from x to y is specified by the following rules:
- If valsy does not contain any value elements in the form of triples, value v of x is added to valsy. [Data point y did not yet contain any dialogue control behavior; it is added hereby.]
- If, for all values w ∈ valsy, category cv is different from category cw, or categories cv and cw are the same but eav and eaw are different, value v can again be added to valsy. [In this case, the dialogue control behavior of y will be amended by the dialogue control behavior of v.]
- If, for a value w ∈ valsy, cv = cw and eav = eaw, but rv ≠ rw, we have contradictory execution results for the same category of CDS executed on the same occasion. We need a graceful, non-contradictory local adaptation of x to its neighborhood in the cluster. Therefore, let y now denote such a cluster element in defined maximal proximity to x with respect to the 4-dimensional parameter space, with a value w as specified above.
  - If rw = r2, then rw := rv. In this way, the definite execution result rv = r1 ∨ r3 will replace the - so far - ambiguous execution result r2 in the neighborhood of the data point x, because the value combinations r1, r2 and r2, r3 are regarded as non-contradictory.
  - If rv = r1 and rw = r3, or vice versa (contradictory execution results), then rw := r2. Thus, we lessen the contradiction in the neighborhood of the new data point x.
If the new data point x cannot be added to a cluster, x remains an isolated, non-clustered data point in the hypercube, and due to its isolation an extension of the reach of the learned example seems inappropriate.
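The transfer rules above can be sketched as follows, assuming the triple representation and the result labels r1/r2/r3 from before (Table 3 is not reproduced in the text). As a simplification, this sketch applies the r1/r3 softening to any conflicting cluster member rather than only to members in defined maximal proximity to x:

```python
# v = (c_v, r_v, ea_v) is the new experience at data point x;
# vals_y is the value set of a cluster member y.
def transfer(v: tuple, vals_y: set) -> set:
    c_v, r_v, ea_v = v
    triples = [w for w in vals_y if isinstance(w, tuple)]
    # Rules 1 and 2: no triple with the same category and occasion -> add v.
    conflict = [w for w in triples if w[0] == c_v and w[2] == ea_v]
    if not conflict:
        vals_y.add(v)
        return vals_y
    # Rule 3: same category and occasion, possibly different result.
    for w in conflict:
        if w[1] == r_v:
            continue                      # identical experience, nothing to do
        vals_y.discard(w)
        if w[1] == "r2":
            vals_y.add((c_v, r_v, ea_v))  # definite result replaces ambiguous r2
        else:                             # r1 vs. r3: soften to the ambiguous r2
            vals_y.add((c_v, "r2", ea_v))
    return vals_y
```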
3.4 Applying Learned Experience to New Health Hazard Handling Dialogues
Whenever a new health hazard handling dialogue shall be started and a corresponding CDS has been selected for execution, the acquired values of the data points within the hypercube will be used as a recommendation for action. First of all, the situational setting of the CDS to be executed will be determined as a data point x in the hypercube. If x does not contain any triple value in its value set valsx, there is no learned experience for controlling the execution of the CDS, and the execution of the CDS can start.
Otherwise, we have to check within the value set valsx of x whether there is applicable learned experience for the execution of the CDS. First of all, we need to determine the category of the CDS with respect to Table 2; let c denote this category. Then, if the value set of x contains a value triple v = (c, r, eav), and the CDS would be executed on the occasion of an EDL/ADL ea which has been recognized in close temporal proximity to the scheduled CDS execution, and ea = eav, this triple v contains the learned experience for the execution of the CDS.
Now c and r will be used for a lookup in Table 4 on how to proceed with the execution of the CDS. A “retry execution” command means that for the selected CDS the same procedure as described in this Sect. 3.4 will be repeated at the designated point of time in the future. However, there is no guarantee that the selected CDS will actually be executed at that point of time. The CDS might compete at that time with other concluded health hazards which have a higher priority on the blackboard described in [4]. In such a case, one of those higher-prioritized CDS will be selected for execution by the blackboard scheduler algorithm described in [4].
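The lookup step can be sketched as follows. Since Table 4 is not reproduced in the text, the policy mapping below - (category, result) to action - is a purely illustrative assumption, as are the category labels:

```python
# Hypothetical excerpt of a Table-4-style policy; the real table is not
# reproduced here, so these entries are assumptions for illustration only.
POLICY = {
    ("C_low", "r3"): "suppress",         # low severity, previously declined
    ("C_low", "r2"): "retry execution",  # ambiguous experience: try again later
    ("C_high", "r3"): "execute",         # high severity overrides past declines
}

def decide(vals_x: set, category: str, occasion) -> str:
    """Consult the learned experience in vals_x before executing a CDS."""
    for v in vals_x:
        if isinstance(v, tuple) and v[0] == category and v[2] == occasion:
            return POLICY.get((category, v[1]), "execute")
    return "execute"  # no applicable learned experience: start the CDS
```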
4 Discussion
Up to now, it is an open question what the actual decisive factor for a successful and complete interaction flow of a health hazard handling dialogue via CDS execution is. Our hypothesis, implemented in the presented approach, is that this factor is the occasion on which the CDS is executed. Alternatively, the decisive factor might also be the real cause of the health hazard handled by the dialogue. These alternatives need to be further explored and verified for optimizing the future app behavior.
The effectiveness of the proposed learning algorithm presupposes a “rational” smartwatch wearer who deliberately and consistently accepts and rejects health hazard handling dialogues for TP and/or FP situations. If this consistency is not given - or if the sensorial horizon of the smartwatch is incomplete with respect to the actual acceptance pattern of the smartwatch wearer - this will result in identical or nearby data points in the parameter hypercube with contradictory values. No extension of the reach of learned experience to similar situational settings within a cluster will take place. As a consequence, the algorithm will never improve its conversational behavior and its acceptance from the smartwatch wearer’s perspective. For example, if the acceptance of health hazard handling dialogues depended on the presence of the smartwatch wearer’s companion - because the smartwatch wearer does not want to be exposed as being dependent on technical aids in the presence of other persons - the learning would not work at all. The smartwatch app would never be capable of detecting the presence of other persons with its current sensors, and thus it would not be possible to include this decisive parameter in its situational settings.
Unfortunately, an automatic improvement of the app’s behavior for FN situations seems not realistic for the foreseeable future without significant scientific breakthroughs.
Another point which needs to be handled by future work is whether the user wants to reactivate suppressed alerts. With the current design, especially for low risks, once an alert is suppressed it will be suppressed forever, and therefore no deviant behavior could ever be learned in the future. A relaxation approach, by which the learned experience (suppression) will be “forgotten” in the course of time, or a specific maintenance tool for the hypercube, could be effective remedies.
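One possible realization of the "forgetting" remedy suggested above is a simple expiry of learned suppressions after a fixed horizon; the 30-day horizon below is an arbitrary illustrative choice, not taken from the paper:

```python
# A learned suppression expires after a fixed horizon, so the corresponding
# alert will eventually be offered to the user again and can be re-learned.
FORGET_AFTER_SECONDS = 30 * 24 * 3600  # illustrative 30-day horizon

def suppression_active(learned_at: float, now: float) -> bool:
    """True while the learned suppression is still in force."""
    return (now - learned_at) < FORGET_AFTER_SECONDS
```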
5 Conclusions
Our experience demonstrates that the acceptance of health-oriented smartwatch apps by the anticipated target group of elderly persons can be discernibly improved if the app’s behavior respects the favored individual usage patterns. Such behavioral patterns can be automatically acquired by reinforcement machine learning during the (initial) usage of the app, with economic effort and in the presence of a rational, consistently acting user.
Notes
1. These gestures include: drinking, eating, hand washing, run_away, sleeping/snoozing, steering (a vehicle/bicycle), teeth brushing, tumbling, and, of course, the »unclassified« gesture [2].
2. It should be noted that the critical dialogue section concept proposed in [4] is asymmetric in nature: by definition, a critical dialogue section will always be executed completely by the smartwatch app as soon as it has started. But the smartwatch wearer (user) is free to interrupt the execution of the section by applying the “shut up” gesture or command at any time.
3. Currently described via an extended notion of UML finite state machines.
References
Lutze, R., Waldhör, K.: A smartwatch software architecture for health hazard handling for elderly people. In: 3rd IEEE International Conference on HealthCare Informatics (ICHI), Dallas, USA, 21–23 October, pp. 356–361 (2015)
Lutze, R., Waldhör, K.: Personal health assistance for elderly people via smartwatch based motion analysis. In: IEEE International Conference on Healthcare Informatics (ICHI), Park City, UT, USA, 23–26 August, pp. 124–133 (2017)
Lutze, R., Waldhör, K.: Utilizing smartwatches for supporting the wellbeing of elderly people. In: 2nd International Conference on Informatics and Assistive Technologies for Health-Care, Medical Support and Wellbeing (HealthInfo), Athens, Greece, 10–12 October, pp. 1–9 (2017)
Lutze, R., Waldhör, K.: Model based dialogue control for smartwatches. In: Kurosu, M. (ed.) HCI 2017. LNCS, vol. 10272, pp. 225–239. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-58077-7_18
Lutze, R.: Practicality of smartwatch apps for supporting elderly people – a comprehensive survey. In: 24th ICE/IEEE International Technology Management Conference (ITMC), Stuttgart, Germany, 17–20 June, pp. 427–433 (2018)
Cavoukian, A.: Privacy by design - the 7 foundational principles – implementation and mapping of fair information practices. http://dataprotection.industries/wp-content/uploads/2017/10/privacy-by-design.pdf. Accessed 28 Jan 2020
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Cambridge (2017)
Sutton, R.S., Barto, A.: Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning. The MIT Press, Cambridge (2018)
Russell, S., Norvig, P.: Artificial Intelligence – A Modern Approach, 3rd edn. Pearson Education Limited, Harlow, Essex (2016)
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
Witten, I.H., Frank, E., Hall, M.A.: Data Mining – Practical Machine Learning Tools and Techniques, 3rd edn. Morgan Kaufmann Publishers/Elsevier, Burlington (2011)
Taxicab geometry. https://en.wikipedia.org/wiki/Taxicab_geometry. Redirected from “Manhattan metric”. Accessed 28 Jan 2020
© 2020 Springer Nature Switzerland AG
Lutze, R., Waldhör, K. (2020). Improving Dialogue Design and Control for Smartwatches by Reinforcement Learning Based Behavioral Acceptance Patterns. In: Kurosu, M. (eds) Human-Computer Interaction. Human Values and Quality of Life. HCII 2020. Lecture Notes in Computer Science(), vol 12183. Springer, Cham. https://doi.org/10.1007/978-3-030-49065-2_6