Abstract
Dialogue control for health-oriented smartwatch apps is a multi-dimensional task. In our application scenario, the intended purpose of the smartwatch app is the prevention and detection of health hazards jeopardizing the smartwatch wearer (e.g. exsiccosis caused by insufficient drinking); the designated target group of the app are elderly people. The first dimension, the potential simultaneity of health hazards, and the second dimension, the ethical considerations of how to keep the wearer in control of the app at all times, have been presented before. In this paper we focus on the third dimension, the mandatory acceptance conditions of the app. The intended assistance functionality of the app can only be realized if the interventions of the app occur solely in daily life situations in which the wearer will accept such interventions. We present a machine learning approach by which the app will learn from the wearer over time when such interventions are appropriate and accepted - and when the app will be expected to remain silent. Of course, this decision also has to take into account the urgency of the intervention with respect to the severity of the threatening health hazard.
Keywords
- Acceptance patterns for smartwatches
- Machine learning
- Assistance for the elderly
- Health hazard handling
- Ambient assisted living
1 Introduction
For supporting a safe, healthy and self-determined life of elderly people in their familiar home setting, the use of assistive, non-stigmatizing information technology is widely accepted. Smartwatches are suitable devices because (i) they are typically worn from dawn to dusk, inside the home and on the road, (ii) high-end models (e.g. Apple Watch™ 3 and later, Samsung Galaxy Watch Active2™) include LTE communication modules for autonomous alerting, and (iii) they can be programmed via apps.
Our smartwatch assistance app [1,2,3] monitors about a dozen health hazards simultaneously, typically in the background, whenever the smartwatch is worn. The monitoring is based on the activity patterns of the smartwatch wearer, observed vital parameters (pulse) and the wearer's location. Whenever the smartwatch app concludes that a health hazard is present, it initiates a dialogue with the user in order to prevent an emergency call when no factual health hazard has occurred (false alert), or to motivate the wearer to start short-term counter measures (in the described example: to drink something), or to call in external human help by placing a phone call via the mobile radio integrated in the smartwatch (e.g. a call to a home emergency call center if no liquids are reachable for the user). We denote this dialogue as the health hazard handling dialogue, a constituent of a more comprehensive health hazard handling process [1]. For implementing a structured, complete and standardized execution of such health hazard handling dialogues, the concept of a critical dialogue section (CDS) has been proposed in [4].
The first problem dimension to be handled by the dialogue control is the potential simultaneity of health hazards. Typically, only one health hazard can be handled by a user dialogue at a time. This challenge is solved, first, via a prioritization of health hazards: simultaneous health hazards will be discussed with the user in order of decreasing severity. Second, via the CDS concept, an ongoing, model-based dialogue for health hazard handling will never be disrupted by another handling dialogue for a higher-prioritized hazard detected in the meantime [4]. This stringent approach is feasible because an individual handling dialogue takes only a very few minutes.
The second problem dimension of dialogue control is governed by ethical considerations. The smartwatch wearer shall always be left in control of whether he or she wants to enter a dialogue with the smartwatch app and which of his or her personal information, especially vital data and/or movement profiles, will be disclosed to third parties. Whenever the smartwatch assistance app - within the course of a health hazard handling dialogue - calls in external human help, such help will only be effective if the current personal information of the smartwatch user is made available to an external human specialist. It will therefore be transferred automatically by the app just before the external call is established. This means that within the course of the user dialogue, an explicit consent from the user is mandatory for disclosing such personal information and calling in external human help [5]. The user consent is collected by executing a prealert on the smartwatch. The process flow for doing so has been modelled by a declarative description of the complete health hazard handling process via UML state machines [1, 2, 4]. The only exception to this indispensable user permit is a situation in which the smartwatch app assumes a potentially life-threatening health hazard together with an unconscious user.
The third problem dimension of dialogue control is determined by the acceptance conditions of such apps by their users. Our experience from the implementation of smartwatch assistance apps based on the principles described above has shown that dialogue offers from the app will be accepted by the smartwatch wearer in some situations but will be regularly rejected in others. These aspects must be considered by a careful, user-centered design. Otherwise the app will lose acceptance and will not be used, due to perceived “false alerts”. We propose to solve this challenge by including a new machine learning component, which will observe and learn the situational parameters under which proactive dialogues from the smartwatch will be accepted by its wearer.
2 Requirements Specification
A key problem is how the system can learn when an app-initiated health hazard handling dialogue will not be situationally appropriate from the user’s perspective (e.g. when driving a car at high speed or while talking to other persons) - especially in situations in which the smartwatch wearer fundamentally has no interest in disclosing his or her specific reasons to the app. For acquiring this information, we have decided to use machine learning, specifically reinforcement learning. This learning will result in an individual adaptation of the app behavior regarding when to proactively enter dialogues with the smartwatch wearer.
In implementing this new approach, first of all a new unobtrusive (“shut up”) gesture had to be added to the smartwatch’s repertoire of recognized gesturesFootnote 1: slapping the opposite hand on the wrist where the smartwatch is worn, in order to stop a situationally unwanted, app-initiated dialogue. Alternatively, a spoken command could be implemented (“not now”). Clearly, the execution of this shut-up gesture or command indicates a negative reward to the learning algorithm. The immediate interruption of an initiated CDSFootnote 2 is the maximum negative reward; the complete execution of such a section (with cooperative participation of the smartwatch wearer) is the maximum positive reward to the algorithm.
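The reward signal described above can be sketched as follows. The numeric reward scale and the function name are illustrative assumptions for this sketch; the paper only fixes the two extremes (interruption and complete execution).

```python
# Hypothetical reward scale for the reinforcement signal described above;
# the concrete numeric values are an assumption, not taken from the paper.
MAX_NEGATIVE_REWARD = -1.0  # "shut up" gesture interrupts the CDS immediately
MAX_POSITIVE_REWARD = +1.0  # CDS executed completely with a cooperative wearer

def dialogue_reward(interrupted: bool, completed: bool) -> float:
    """Map the outcome of a critical dialogue section (CDS) to a reward."""
    if interrupted:
        return MAX_NEGATIVE_REWARD
    if completed:
        return MAX_POSITIVE_REWARD
    return 0.0  # e.g. dialogue ended without explicit gesture or completion
```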
Based on this dialogue evaluation, the proposed learning algorithm initially has to learn from examples when a health hazard handling dialogue has been allowed by the wearer in the past, and when not. To do so, potential situational parameters within the smartwatch’s sensorial horizon must be collected which were present at the moment of allowance, interruption or denial of the dialogue. These situational parameters include the current geographic location of the smartwatch wearer, his/her movement speed, the time and day of the week, …, see Sect. 3 below for the details. The sum of these parameter values, the situational setting, is supposed to determine the acceptance of the dialogue in a specific situation.
However, the events of allowing, interrupting or denying a health hazard handling dialogue will typically occur only infrequently in the everyday life of the smartwatch user. A second essential requirement for the learning algorithm therefore is to generalize the experienced event and to transfer and apply it to comparable situations in which the same behavior of the algorithm will probably be expected from the user’s point of view. This generalization will be achieved by mapping the n parameter values of a situational setting onto a data point within an n-dimensional data space, the »hypercube« (Fig. 1). This hypercube will be populated not only with data points for each experienced health hazard dialogue event. Additionally, whenever an EDL/ADL has been recognized by the assistance app, the situational setting of the EDL/ADL will be written as a data point to the hypercube. Based on experience from our field test with the smartwatch app, the hypercube will in this way be populated with about a dozen new data points for recognized ADLs per day and smartwatch wearer. Now, when a data point for an experienced health hazard dialogue event is added to the hypercube, we consider the distribution of data points within the hypercube. If the data point for the experienced event is contained within a cluster - based on the relative proximity of data points within the hypercube - the experienced behavior (decline or acceptance of the dialogue) will be extended to all elements within this cluster, if - and only if - this can be done in a non-contradictory way. The last restriction is of relevance because otherwise the success of the learning process cannot be assured by technical means.
The constructed, increasingly populated hypercube constitutes a very personal profile of what the wearer is typically doing, when and where. This is extremely sensitive personal information and will deliberately be stored exclusively in the smartwatch assistance app. The profile must not be calculated outside the smartwatch app and cannot be exported and/or transferred to other devices, in order to prevent any potential misuse (»privacy by design«, [6], Principle 3: “Privacy Embedded into Design”). This is a very basic requirement and boundary condition for the solution described below.
3 System Design
3.1 Suitability and Scope of Machine Learning
Machine learning (ML) is the most successful AI approach currently in use. Machine vision is one example where convolutional neural networks (CNNs) and deep learning [7], as instances of ML, are extremely successful. Three types of ML approaches are used nowadays: supervised learning (SL), where the data set is trained against labels; unsupervised learning (UL), which identifies patterns in the data set without using labels; and reinforcement learning (RL), where the learning algorithm receives feedback on its actions and learns from that feedback. A major limitation of these approaches is that they depend on relatively big training data sets in order to provide correct results. RL may be somewhat different, as the reinforcement can be done by algorithms (e.g. using game rules, [8]). Nevertheless, in our application domain, health hazard handling dialogues - the examples to learn from - occur relatively rarely.
Therefore, our approach is based on learning from examples ([9], chapter 18) and a maximal utilization of those examples for similar situational settings.
In order to determine the potential scope of machine learning, we have to consider the relation between the factual presence of a health hazard in reality and the identification resp. classification of the same situation as hazardous to the user’s health by the assistance app. The four possible combinations of values, and the standardized denominations for such binary value combinations, are depicted in Table 1.
In principle, for TPs and TNs the app behavior is fine; nothing needs to be improved. This statement is valid as long as the user always allows TP health hazard handling dialogues. But what if the user also trains the app to keep silent in TP situations, because he wants to keep his peace of mind, ignoring the threatening or already manifest health hazard? Following the second problem dimension described above, the postulated dominance-of-control principle for the user would result in a reticence of the assistance app against its better knowledge. This may be questionable with respect to health implications but is mandatory under ethical principles favoring the primacy of the user’s self-determination.
FPs are covered by the proposed learning algorithm, which targets learning a user-accepted communication behavior. For a rational user, who would always correctly decline FP health hazard handling dialogues, the learning algorithm would result in a flawless communication behavior of the assistance app.
For FNs, the situation is complicated for two reasons. First of all, the proposed simple learning algorithm is incapable of learning or improving the necessary healthcare knowledge, even for hazards already known to resp. managed by the assistance app. For improving this implemented behavior of the assistance app, either new SL training samples would be required to improve the artificial neural network executing the EDL/ADL recognition process within the app [2, 3], or the declarative knowledge representation embodying the healthcare handling process itselfFootnote 3, based on such recognized EDLs/ADLs, would have to be improved [1, 4]. Especially a full automatization of the structural knowledge acquisition for the latter case is currently out of scope [10, 11]. Furthermore, the assistance app is trained to conclude a fixed number of health hazards based on sensorial values, the EDLs/ADLs recognized from them, and their sequencing and combination in time. Thus, the app will not detect health hazards beyond the sensorial horizon of its sensors and/or the applicable conclusion principles. But these hazards would also be subsumed as FNs. As an example, the app is not trained for detecting injuries from car accidents, which obviously is a very relevant category of health hazards.
3.2 Modelling Situational Settings
The following parameters for characterizing a situational setting of (i) a recognized EDL/ADL, or (ii) the execution of a health hazard handling dialogue - via execution of the corresponding CDS - will be considered by the smartwatch assistance app:
- The geographic location of the smartwatch wearer: (lat, long) acquired from the GPS sensor outdoors. When the wearer is at home, a room-based indoor localization can be achieved by utilizing the Wi-Fi signature of the specific room.
- The specific time of the day (digitized in a 15 min grid).
- The specific day of the week, additionally categorized either as (1) a regular workday (Monday to Friday), (2) a Saturday or private holiday, or (3) a Sunday or public holiday.
- The speed by which the wearer is moving, digitized into four discrete speed intervals: steady, walking, running, driving.
The corresponding parameter values will be aggregated into a 4-dimensional data point. If this data point does not already exist in the hypercube, it will be added when (i) an EDL/ADL has been recognized or (ii) a CDS has been executed. Let x denote this considered data point. Data point x will be associated with a set of values, denoted valsx. Initially, these values v ∈ valsx describe the events which caused the creation of the data point x. Later on, the values will be amended by the experiences learned for the situational setting represented by x. The potential elements v in the value set valsx of data point x can be:
1. an atomic denominator ea for a recognized ADL/EDL.
2. a triple (c, r, ea) describing the category c of an executed CDS (Table 2), the execution result r (Table 3) and, if applicable, the perceived occasion of the CDS execution, given by the recognition of an EDL/ADL with denominator ea in close temporal proximity to the CDS execution. ea may be empty (special denominator nil) if no such EDL/ADL has been recognized in close temporal proximity to the execution of the CDS.
Table 2. Categories of CDS indicating the severity with respect to the involved health hazards
Table 3. Possible result values for a CDS execution
For example, a recognized EDL »tumble« typically causes the immediate execution of a CDS, so it is useful to associate the CDS execution directly with the occasion of this recognized EDL. A similarly close temporal proximity typically exists between the ADL »runaway« and the CDS execution for handling the foreseeable health hazard resulting from the runaway situation. On the other hand, an “insufficient drinking” health hazard handling dialogue takes place significantly later than the last recognized »drinking« ADL and independently of other EDLs/ADLs. Therefore, it does not make sense to associate the data point for the corresponding CDS execution with any EDL/ADL. Only if an EDL/ADL were incidentally recognized in close temporal proximity to the “insufficient drinking” dialogue execution would it be added as the incidental occasion of the CDS execution. In the latter case, this really makes sense, because the corresponding EDL/ADL could in fact have influenced the specific user reaction to the “insufficient drinking” dialogue execution.
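As a concrete illustration, a data point and its value set can be represented as follows. All identifiers, the coordinate values, and the category/result labels ("C1", "C2", "r1", "r3") are assumptions of this sketch, since Tables 2 and 3 are not reproduced in the text.

```python
from dataclasses import dataclass

# Sketch of a 4-dimensional situational-setting data point, following the
# parameters listed above. Field names and encodings are illustrative.
@dataclass(frozen=True)
class DataPoint:
    location: tuple   # (lat, long) outdoors, or a room label indoors
    time_slot: int    # time of day digitized into 15-minute slots (0..95)
    day_class: int    # 1 = workday, 2 = Saturday/private holiday, 3 = Sunday/public holiday
    speed_class: int  # 0 = steady, 1 = walking, 2 = running, 3 = driving

# The hypercube maps each data point x to its value set vals_x. A value is
# either an atomic ADL/EDL denominator or a triple (c, r, ea).
hypercube: dict = {}

x = DataPoint(location=(49.45, 11.07), time_slot=34, day_class=1, speed_class=0)
hypercube[x] = {
    "drinking",              # 1. atomic denominator of a recognized ADL
    ("C2", "r1", "tumble"),  # 2. CDS of category C2, result r1, occasion »tumble«
    ("C1", "r3", None),      # ea = nil: no EDL/ADL in close temporal proximity
}
```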
The designated purpose of associating values with data points within the hypercube is to derive a recommendation for the (dialogue) action behavior of the assistance app. It is therefore essential to construct unambiguous recommended actions. Therefore, if the value set valsx of data point x already contains a value vo = (c, r, ea) and a new value vn = (c, s, ea) shall be added to the value set with r = r1 ∧ s = r3 or vice versa, i.e. a contradictory recommendation for action, the new value vn will replace the existing value vo in the value set valsx. New experience replaces old experience. By incorporating EDLs/ADLs in the value triple whenever possible, we also reduce the reach of contradictory recommendations for (dialogue) action in the future app behavior.
The inclusion of only the category of the executed CDS - instead of the specific health hazard handled by this CDS - in a value element of a data point x helps to extend the validity of the specific example represented by this data point x. The learned example will be assumed to be applicable to all other data points y in the same cluster as x which execute a health hazard dialogue of the same category and on the occasion of the same EDL/ADL as the example. If a CDS handles more than one health hazard at a time, the most severe health hazard category with respect to Table 2 determines the categorization of the CDS.
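The "new experience replaces old experience" rule can be sketched as follows, assuming the triple representation from above and the contradictory result pair r1/r3 (the meaning of the result labels follows Table 3, which is not reproduced in the text):

```python
# Sketch of the replacement rule: a new triple (c, s, ea) replaces an existing
# (c, r, ea) in vals_x when r and s are the contradictory results r1/r3.
CONTRADICTORY = {("r1", "r3"), ("r3", "r1")}

def add_value(vals_x: set, new: tuple) -> set:
    """Add a CDS value triple to vals_x; new experience replaces old experience."""
    c_new, s, ea_new = new
    for old in list(vals_x):
        if not isinstance(old, tuple):
            continue  # atomic ADL/EDL denominators are never replaced
        c_old, r, ea_old = old
        if c_old == c_new and ea_old == ea_new and (r, s) in CONTRADICTORY:
            vals_x.discard(old)  # drop the contradicted recommendation
    vals_x.add(new)
    return vals_x

vals = {("C2", "r1", "tumble")}
add_value(vals, ("C2", "r3", "tumble"))  # replaces the contradictory r1 triple
```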
3.3 Extending the Reach of Learned Examples
Shortly after a new data point x resulting from a CDS execution - with a specific value v = (cv, rv, eav) ∈ valsx due to the execution of this CDS - has been added to the hypercube, the assignment of x to the already computed clusters of other data points will be done. The agglomerative clustering of data points within the hypercube (cf. [11], chapter 6.8) will preferably be done at night, when the smartwatch is typically not worn and is being recharged.
We use an easily computable Manhattan metric [12] for data points, defining the distance between two data points as the sum of the absolute differences of the data points within each dimension of the parameter space: for the location we use a logarithmic Euclidean distance; for the time, the difference between the (relative) times of the day; for the day of the week, the difference of suitable ordinal numbers; and likewise for the speed intervals by which the smartwatch wearer is moving.
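The distance computation can be sketched as follows. The planar treatment of (lat, long), the unweighted summation of the dimensions, and the identifier names are assumptions of this sketch:

```python
import math
from collections import namedtuple

# Illustrative 4-dimensional data point (names are assumptions).
Point = namedtuple("Point", "location time_slot day_class speed_class")

def distance(x: Point, y: Point) -> float:
    """Manhattan-style distance: sum of per-dimension differences."""
    d_loc = math.log1p(math.dist(x.location, y.location))  # logarithmic Euclidean
    d_time = abs(x.time_slot - y.time_slot)                # 15-minute slots
    d_day = abs(x.day_class - y.day_class)                 # ordinal day categories
    d_speed = abs(x.speed_class - y.speed_class)           # ordinal speed intervals
    return d_loc + d_time + d_day + d_speed
```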
As soon as data point x has been assigned to a cluster, the learned dialogue behavior for the situational setting represented by x and encoded in the value v of x can be transferred and extended to all elements of the cluster. We thereby assume that the dialogue behavior of x will also be appropriate for “similar” situational settings. Such similar situational settings are given by all elements belonging to the same cluster as x. Let y denote such a data point within the same cluster as x and let w = (cw, rw, eaw) denote an arbitrary triple value element within the value set valsy of y.
Then the value transfer and extension process from x to y is specified by the following rules:
- If valsy does not contain any value elements in the form of triples, value v of x is added to valsy. [Data point y did not yet contain any dialogue control behavior; it is added hereby.]
- If, for all values w ∈ valsy, category cv is different from category cw, or categories cv and cw are the same but eav and eaw are different, value v can again be added to valsy. [In this case, the dialogue control behavior of y will be amended by the dialogue control behavior of v.]
- If, for a value w ∈ valsy, cv = cw and eav = eaw, but rv ≠ rw, we have contradictory execution results for the same category of CDS executed on the same occasion. We need a graceful, non-contradictory local adaptation of x to its neighborhood in the cluster. Therefore, let y now denote such a cluster element in defined maximal proximity to x with respect to the 4-dimensional parameter space, with a value w as specified above.
  - If rw = r2, then rw := rv. In this way, the definite execution result rv = r1 ∨ r3 will replace the - so far - ambiguous execution result r2 in the neighborhood of the data point x, because the value combinations r1, r2 and r2, r3 are regarded as non-contradictory.
  - If rv = r1 and rw = r3, or vice versa (contradictory execution results), then rw := r2. Thus, we lessen the contradiction in the neighborhood of the new data point x.
If the new data point x cannot be added to a cluster, x remains an isolated, non-clustered data point in the hypercube, and due to its isolation an extension of the reach of the learned example seems inappropriate.
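The transfer rules above can be sketched as follows, assuming the triple representation and the result labels r1/r2/r3 from before (Table 3 is not reproduced in the text). As a simplification, this sketch applies the r1/r3 softening to any conflicting cluster member rather than only to members in defined maximal proximity to x:

```python
# v = (c_v, r_v, ea_v) is the new experience at data point x;
# vals_y is the value set of a cluster member y.
def transfer(v: tuple, vals_y: set) -> set:
    c_v, r_v, ea_v = v
    triples = [w for w in vals_y if isinstance(w, tuple)]
    # Rules 1 and 2: no triple with the same category and occasion -> add v.
    conflict = [w for w in triples if w[0] == c_v and w[2] == ea_v]
    if not conflict:
        vals_y.add(v)
        return vals_y
    # Rule 3: same category and occasion, possibly different result.
    for w in conflict:
        if w[1] == r_v:
            continue                      # identical experience, nothing to do
        vals_y.discard(w)
        if w[1] == "r2":
            vals_y.add((c_v, r_v, ea_v))  # definite result replaces ambiguous r2
        else:                             # r1 vs. r3: soften to the ambiguous r2
            vals_y.add((c_v, "r2", ea_v))
    return vals_y
```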
3.4 Applying Learned Experience to New Health Hazard Handling Dialogues
Whenever a new health hazard handling dialogue shall be started and a corresponding CDS has been selected for execution, the acquired values of the data points within the hypercube will be used as a recommendation for action. First of all, the situational setting of the CDS to be executed will be determined as a data point x in the hypercube. If x does not contain any triple value in its value set valsx, there is no learned experience for controlling the execution of the CDS, and the execution of the CDS can start.
Otherwise, we have to check within the value set valsx of x whether there is applicable learned experience for the execution of the CDS. First of all, we need to determine the category of the CDS with respect to Table 2; let c denote this category. Then, if the value set of x contains a value triple v = (c, r, eav), and the CDS would be executed on the occasion of an EDL/ADL ea which has been recognized in close temporal proximity to the scheduled CDS execution, and ea = eav, this triple v contains the learned experience for the execution of the CDS.
Now c and r will be used for a lookup in Table 4 on how to proceed with the execution of the CDS. A “retry execution” command means that for the selected CDS the same procedure as described in this Sect. 3.4 will be repeated at the designated point of time in the future. However, there is no guarantee that the selected CDS will actually be executed at that point of time. The CDS might compete at that time with other concluded health hazards which have a higher priority on the blackboard described in [4]. In such a case, one of those higher-prioritized CDS will be selected for execution by the blackboard scheduler algorithm described in [4].
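The lookup step can be sketched as follows. Since Table 4 is not reproduced in the text, the policy mapping below - (category, result) to action - is a purely illustrative assumption, as are the category labels:

```python
# Hypothetical excerpt of a Table-4-style policy; the real table is not
# reproduced here, so these entries are assumptions for illustration only.
POLICY = {
    ("C_low", "r3"): "suppress",         # low severity, previously declined
    ("C_low", "r2"): "retry execution",  # ambiguous experience: try again later
    ("C_high", "r3"): "execute",         # high severity overrides past declines
}

def decide(vals_x: set, category: str, occasion) -> str:
    """Consult the learned experience in vals_x before executing a CDS."""
    for v in vals_x:
        if isinstance(v, tuple) and v[0] == category and v[2] == occasion:
            return POLICY.get((category, v[1]), "execute")
    return "execute"  # no applicable learned experience: start the CDS
```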
4 Discussion
Up to now, it is an open question what the actual decisive factor for a successful and complete interaction flow of a health hazard handling dialogue via CDS execution is. Our hypothesis, implemented in the presented approach, is that this factor is the occasion on which the CDS is executed. Alternatively, the decisive factor might also be the real cause of the health hazard handled by the dialogue. These alternatives need to be further explored and verified for optimizing the future app behavior.
The effectiveness of the proposed learning algorithm presupposes a “rational” smartwatch wearer who deliberately and consistently accepts and rejects health hazard handling dialogues for TP and/or FP situations. If this consistency is not given - or if the sensorial horizon of the smartwatch is incomplete with respect to the actual acceptance pattern of the smartwatch wearer - this will result in identical or nearby data points in the parameter hypercube with contradictory values. No extension of the reach of learned experience to similar situational settings within a cluster will take place. As a consequence, the algorithm will never improve its conversational behavior and its acceptance from the smartwatch wearer’s perspective. For example, if the acceptance of health hazard handling dialogues depended on the presence of the smartwatch wearer’s companion - because the smartwatch wearer does not want to be exposed as being dependent on technical aids in the presence of other persons - the learning would not work at all. The smartwatch app would never be capable of detecting the presence of other persons with its current sensors, and thus it would not be possible to include this decisive parameter in its situational settings.
Unfortunately, an automatic improvement of the app’s behavior for FN situations seems not realistic for the foreseeable future without significant scientific breakthroughs.
Another point which needs to be handled by future work is whether the user wants to reactivate suppressed alerts. With the current design, especially for low risks, once an alert is suppressed it will be suppressed forever, and therefore no deviant behavior could ever be learned in the future. A relaxation approach, by which the learned experience (suppression) will be “forgotten” in the course of time, or a specific maintenance tool for the hypercube, could be effective remedies.
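One possible realization of the "forgetting" remedy suggested above is a simple expiry of learned suppressions after a fixed horizon; the 30-day horizon below is an arbitrary illustrative choice, not taken from the paper:

```python
# A learned suppression expires after a fixed horizon, so the corresponding
# alert will eventually be offered to the user again and can be re-learned.
FORGET_AFTER_SECONDS = 30 * 24 * 3600  # illustrative 30-day horizon

def suppression_active(learned_at: float, now: float) -> bool:
    """True while the learned suppression is still in force."""
    return (now - learned_at) < FORGET_AFTER_SECONDS
```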
5 Conclusions
Our experience demonstrates that the acceptance of health-oriented smartwatch apps by the anticipated target group of elderly persons can be discernibly improved if the app’s behavior respects the favored individual usage patterns. Such behavioral patterns can be automatically acquired by reinforcement machine learning during the (initial) usage of the app, with economic effort and in the presence of a rational, consistently acting user.
Notes
1. These gestures include: drinking, eating, hand washing, run_away, sleeping/snoozing, steering (a vehicle/bicycle), teeth brushing, tumbling, and, of course, the »unclassified« gesture [2].
2. It should be noted that the critical dialogue section concept proposed in [4] is asymmetric in nature: by definition, a critical dialogue section will always be executed completely by the smartwatch app as soon as it has started. But the smartwatch wearer (user) is free to interrupt the execution of the section by applying the “shut up” gesture or command at any time.
3. Currently described via an extended notion of UML finite state machines.
References
Lutze, R., Waldhör, K.: A smartwatch software architecture for health hazard handling for elderly people. In: 3rd IEEE International Conference on HealthCare Informatics (ICHI), Dallas, USA, 21–23 October, pp. 356–361 (2015)
Lutze, R., Waldhör, K.: Personal health assistance for elderly people via smartwatch based motion analysis. In: IEEE International Conference on Healthcare Informatics (ICHI), Park City, UT, USA, 23–26 August, pp. 124–133 (2017)
Lutze, R., Waldhör, K.: Utilizing smartwatches for supporting the wellbeing of elderly people. In: 2nd International Conference on Informatics and Assistive Technologies for Health-Care, Medical Support and Wellbeing (HealthInfo), Athens, Greece, 10–12 October, pp. 1–9 (2017)
Lutze, R., Waldhör, K.: Model based dialogue control for smartwatches. In: Kurosu, M. (ed.) HCI 2017. LNCS, vol. 10272, pp. 225–239. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-58077-7_18
Lutze, R.: Practicality of smartwatch apps for supporting elderly people – a comprehensive survey. In: 24th ICE/IEEE International Technology Management Conference (ITMC), Stuttgart, Germany, 17–20 June, pp. 427–433 (2018)
Cavoukian, A.: Privacy by design - the 7 foundational principles – implementation and mapping of fair information practices. http://dataprotection.industries/wp-content/uploads/2017/10/privacy-by-design.pdf. Accessed 28 Jan 2020
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press, Cambridge (2017)
Sutton, R.S., Barto, A.: Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning. The MIT Press, Cambridge (2018)
Russell, S., Norvig, P.: Artificial Intelligence – A Modern Approach, 3rd edn. Pearson Education Limited, Harlow, Essex (2016)
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
Witten, I.H., Frank, E., Hall, M.A.: Data Mining – Practical Machine Learning Tools and Techniques, 3rd edn. Morgan Kaufmann Publishers/Elsevier, Burlington (2011)
Taxicab geometry. https://en.wikipedia.org/wiki/Taxicab_geometry. Redirected from “Manhattan metric”. Accessed 28 Jan 2020
© 2020 Springer Nature Switzerland AG
Lutze, R., Waldhör, K. (2020). Improving Dialogue Design and Control for Smartwatches by Reinforcement Learning Based Behavioral Acceptance Patterns. In: Kurosu, M. (eds) Human-Computer Interaction. Human Values and Quality of Life. HCII 2020. Lecture Notes in Computer Science(), vol 12183. Springer, Cham. https://doi.org/10.1007/978-3-030-49065-2_6