Abstract
Statistical approaches that can compare the probability laws of qualitative trajectories are of real interest in many fields of science, such as economics, sociology, quality control and epidemiology. This work is motivated by an application in sensory analysis in which subjects indicate the succession of perceived sensations over time using a list of attributes. Lecuelle et al. (Food Qual Prefer 67:59–66, 2018) introduced semi-Markov processes (SMPs) to model such data. These processes account for the dynamics through the transitions from one attribute to another as well as through the duration law of each attribute. One of the major challenges of sensory analysis is to determine whether two tasted products are perceived differently. For that purpose, the present paper introduces a statistical testing procedure based on the likelihood ratio between two semi-Markov processes, assuming a parametric form for the sojourn time distributions. Three approaches are evaluated to compute the p-value: the first is based on the asymptotic law of the likelihood ratio, the second on the parametric bootstrap and the third on permutations. These approaches are compared on Monte Carlo simulated data, both in terms of empirical levels under the null hypothesis and of statistical power under alternatives. We also develop partial tests to compare two processes on either their initial probabilities and transition matrices or their sojourn time distributions. Simulations show that permutation approaches perform better in nearly all situations, especially for small and moderate sample sizes. Finally, the proposed tests are illustrated on real datasets consisting of sensations perceived over time during the tasting of different chocolates and cheeses.
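As a rough illustration of the permutation approach mentioned above, the following Python sketch computes a likelihood ratio statistic between two samples of trajectories and approximates its p-value by permuting trajectories between the samples. It is not the authors' implementation. It assumes exponential sojourn times (one rate per attribute), treats the last sojourn of each trajectory as fully observed, and uses hypothetical function names (`max_loglik`, `lr_stat`, `permutation_pvalue`) and toy data.

```python
import math
import random
from collections import Counter


def max_loglik(trajs):
    """Maximised log-likelihood of a toy semi-Markov model.

    Each trajectory is a list of (state, duration) pairs with positive
    durations. Assumptions: exponential sojourn times with one rate per
    state, and no censoring of the last sojourn.
    """
    init, trans = Counter(), Counter()
    n_visits, total_time = Counter(), Counter()
    for traj in trajs:
        init[traj[0][0]] += 1
        for state, dur in traj:
            n_visits[state] += 1
            total_time[state] += dur
        for (a, _), (b, _) in zip(traj, traj[1:]):
            trans[(a, b)] += 1
    out = Counter()  # number of transitions leaving each state
    for (a, _), n in trans.items():
        out[a] += n
    # Initial probabilities: MLE is the observed frequency of first states.
    ll = sum(n * math.log(n / len(trajs)) for n in init.values())
    # Transition matrix: MLE is n_ab / n_a for each observed transition.
    ll += sum(n * math.log(n / out[a]) for (a, _), n in trans.items())
    # Exponential sojourn times: the MLE rate is n / S, which gives a
    # maximised log-likelihood of n * (log(n / S) - 1) per state.
    ll += sum(n * (math.log(n / total_time[s]) - 1) for s, n in n_visits.items())
    return ll


def lr_stat(x, y):
    """Likelihood ratio statistic: separate fits versus one pooled fit."""
    return 2.0 * (max_loglik(x) + max_loglik(y) - max_loglik(list(x) + list(y)))


def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Permutation p-value obtained by reshuffling whole trajectories."""
    rng = random.Random(seed)
    observed = lr_stat(x, y)
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if lr_stat(pooled[:len(x)], pooled[len(x):]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    # Toy data: two "products", two tasters each (made-up values).
    product_a = [[("sweet", 2.0), ("bitter", 1.0), ("sweet", 3.0)],
                 [("bitter", 1.5), ("sweet", 2.5)]]
    product_b = [[("sweet", 0.5), ("bitter", 4.0)],
                 [("bitter", 5.0), ("sweet", 0.4), ("bitter", 3.0)]]
    print(permutation_pvalue(product_a, product_b))
```

Permuting whole trajectories, rather than individual sojourns, keeps the within-subject dynamics intact under the null hypothesis that both products follow the same process. The `(1 + exceed) / (1 + n_perm)` form avoids reporting a p-value of exactly zero.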
References
Anderson TW, Goodman A (1957) Statistical inference about Markov Chains. Ann Math Stat 28:89–110
Arlot S, Blanchard G, Roquain E (2010) Some nonasymptotic results on resampling in high dimension, I: Confidence regions and II: Multiple tests. Ann Stat 38(1):51–82
Barbu VS, Limnios N (2008) Semi-Markov chains and hidden semi-Markov models toward applications: their use in reliability and DNA analysis. Springer Science + Business Media, New York
Barbu VS, Bérard C, Cellier D, Sautreuil M, Vergne N (2018) SMM: an R package for estimation and simulation of discrete-time semi-Markov models. R J
Barbu V, Karagrigoriou A, Makrides A (2017) Semi-Markov modelling for multi-state systems. Methodol Comput Appl Probab 19(4):1011–1028
Billingsley P (1961) Statistical inference for Markov processes. University of Chicago Press, Chicago
Cardot H, Lecuelle G, Visalli M, Schlich P (2019) Estimating finite mixtures of semi-Markov chains: an application to the segmentation of temporal sensory data. J R Stat Soc C 68:1281–1303
Davison AC, Hinkley DV (1997) Bootstrap methods and their application. Cambridge University Press, Cambridge
Eddelbuettel D, François R (2011) Rcpp: seamless R and C++ integration. J Stat Softw 40(8):1–18
Franczak BC, Browne RP, McNicholas PD, Castura JC, Findlay CJ (2015) A Markov model for temporal dominance of sensations data. In: 11th Pangborn symposium
Lecuelle G, Visalli M, Cardot H, Schlich P (2018) Modeling temporal dominance of sensations with semi-Markov chains. Food Qual Prefer 67:59–66
Lehmann EL, Romano JP (2005) Testing statistical hypotheses, 3rd edn. Springer Texts in Statistics, Springer, New York
Lévy P (1954) Processus semi-Markoviens. In: Erven P, Noordhoff NV (eds) Proceedings of the international congress of mathematicians, Amsterdam, vol III, pp 416–426. Groningen; North-Holland Publishing Co., Amsterdam
Limnios N, Oprişan G (2001) Semi-Markov processes and reliability. Birkhäuser, Boston
Pineau N, Schlich P, Cordelle S, Mathonnière C, Issanchou S, Imbert A (2009) Temporal dominance of sensations: construction of the TDS curves and comparison with time-intensity. Food Qual Prefer 20:450–455
R Core Team (2018) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria
Romano JP, Wolf M (2005) Exact and approximate step-down methods for multiple hypothesis testing. J Am Stat Assoc 100(469):94–108
Smith WL (1955) Regenerative stochastic processes. Proc R Soc Ser A 232:6–31
Thomas A, Chambault M, Dreyfuss L, Gilbert CC, Hegyi A, Henneberg S, Knippertz A, Kostyra E, Kreme S, Silva AP, Schlich P (2017) Measuring temporal liking simultaneously to temporal dominance of sensations in several intakes. An application to gouda cheeses in 6 europeans countries. Food Res Int 99:426–434
Trevezas S, Limnios N (2011) Exact MLE and asymptotic properties for nonparametric semi-Markov models. J Nonparam Stat 23:719–739
Van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, Cambridge
Wilks SS (1938) The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann Math Stat 9(1):60–62
Acknowledgements
Calculations were performed using HPC resources from DNUM CCUB (Centre de Calcul de l’Université de Bourgogne). Cindy Frascolla’s doctoral thesis is financially supported by the Bourgogne-Franche-Comté Regional Council. We thank the referees and the associate editor for their many constructive comments and suggestions.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Frascolla, C., Lecuelle, G., Schlich, P. et al. Two sample tests for Semi-Markov processes with parametric sojourn time distributions: an application in sensory analysis. Comput Stat 37, 2553–2580 (2022). https://doi.org/10.1007/s00180-022-01210-x
DOI: https://doi.org/10.1007/s00180-022-01210-x