Nonparametric discovery and analysis of learning patterns and autism subgroups from therapeutic data

Vellanki, Pratibha; Duong, Thi; Gupta, Sunil; Venkatesh, Svetha; Phung, Dinh

doi:10.1007/s10115-016-0971-7

Regular Paper
Published: 25 July 2016

Volume 51, pages 127–157, (2017)

Knowledge and Information Systems

Pratibha Vellanki ORCID: orcid.org/0000-0001-8910-8533¹,
Thi Duong¹,
Sunil Gupta¹,
Svetha Venkatesh¹ &
…
Dinh Phung¹

Abstract

The spectrum nature and heterogeneity within autism spectrum disorders (ASD) pose a challenge for treatment. Personalising the syllabus for children with ASD can improve the efficacy of learning by adjusting the number of opportunities and deciding the course of the syllabus. We investigate a data-driven approach in an attempt to disentangle this heterogeneity for personalisation of the syllabus. Technology and a structured syllabus make it possible to collect data while a child with ASD masters the skills. The performance data collected are, however, growing and contain missing elements, depending on the pace and the course each child takes while navigating through the syllabus. Bayesian nonparametric methods are known for automatically discovering the number of latent components and their parameters when the model involves higher complexity. We propose a nonparametric Bayesian matrix factorisation model that discovers learning patterns and the way participants associate with them. Our model is built upon the linear Poisson gamma model (LPGM) with an Indian buffet process prior and is extended to incorporate data with missing elements. In this paper, we present, for the first time, learning patterns deduced automatically by data mining and machine learning methods from intervention data recorded for over 500 children with ASD. We compare the results with non-negative matrix factorisation and K-means, which, being parametric, not only require us to specify the number of learning patterns in advance but also lack a principled approach to dealing with missing data. The F1 score observed over varying degrees of a similarity measure (the Jaccard index) suggests that LPGM yields the best outcome. By observing these patterns together with additional knowledge of the syllabus, it may be possible to monitor progress and dynamically modify the syllabus for improved learning.

Author information

Authors and Affiliations

Center for Pattern Recognition and Data Analytics (PRaDA), Deakin University, Geelong, Australia
Pratibha Vellanki, Thi Duong, Sunil Gupta, Svetha Venkatesh & Dinh Phung

Corresponding author

Correspondence to Pratibha Vellanki.

Appendix

In this section, we present the derivations of the posteriors of $w_{vk}$ and $f_{kn}$ for the scenario where the dataset has missing data. X is our data matrix, whose elements $x_{vn}$ are the data points and correspond to the number of LUs accumulated by a child n in a task v. Our objective is to derive the posteriors of the parameters using only the data points $x_{vn}$ that are not missing. Inference of the posterior of $w_{vk}$, for a given value of v, depends on the values $x_{vn}$ for all values of n. Similarly, inference of $f_{kn}$, for a given value of n, depends on the values $x_{vn}$ for all values of v. Hence, we consider the data points in two sets, $J_{v}$ and $I_{n}$, one for each inference, such that $J_{v}$ contains all the non-missing values from the array $x_{v,1:N}$ and $I_{n}$ contains all of those from $x_{1:V,n}$.
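The two sets described above can be built directly from a mask of observed entries. The following is a minimal sketch, assuming that missing entries are encoded as NaN and that $J_{v}$ and $I_{n}$ are stored as arrays of (0-based) indices rather than values; the function name `observed_index_sets` and the toy matrix are hypothetical and not taken from the paper.

```python
import numpy as np

def observed_index_sets(X):
    """Return (J, I) for a V x N matrix X with NaN marking missing entries.

    J[v] holds the child indices n for which x_vn is observed.
    I[n] holds the task indices v for which x_vn is observed.
    """
    observed = ~np.isnan(X)
    J = [np.flatnonzero(observed[v, :]) for v in range(X.shape[0])]
    I = [np.flatnonzero(observed[:, n]) for n in range(X.shape[1])]
    return J, I

# Toy example: 2 tasks (rows) by 3 children (columns), two entries missing.
X = np.array([[3.0, np.nan, 1.0],
              [np.nan, 2.0, 0.0]])
J, I = observed_index_sets(X)
```

Storing indices rather than values keeps the products over $n\in J_{v}$ and $v\in I_{n}$ in the derivations below easy to express as array slices.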

Let us represent the sum $\sum _{i=1}^{K}f_{in}z_{in}w_{vi}$ as $\eta _{i}$, the term including parameters for all values $i=1:K$, $\sum _{i=1,i\ne k}^{K}f_{in}z_{in}w_{vi}$ as $\eta _{-i}$, the term including parameters for all values of i except for k, and $f_{kn}z_{kn}w_{vk}$ as $\eta _{k}$, the term for the condition when i takes the value k.

The posterior of $w_{vk}$ is
$$\begin{aligned} p(w_{vk}\mid Z,F,X)\propto & {} p(X\mid Z,F,W)p(w_{vk}\mid \alpha _{0},\beta _{0})\\= & {} \left( \prod _{n\in J_{v}}p(x_{vn}\mid \eta _{i})\right) Gamma (\alpha _{0},\beta _{0})\\= & {} \left( \prod _{n\in J_{v}}\frac{\left( \eta _{i}\right) ^{x_{vn}}e^{-\left( \eta _{i}\right) }}{x_{vn}!}\right) \times \frac{\beta _{0}^{\alpha _{0}}}{\Gamma (\alpha _{0})}w_{vk}^{\alpha _{0}-1}e^{-\beta _{0}w_{vk}}\\\propto & {} w_{vk}^{\alpha _{0}-1}e^{-\beta _{0}w_{vk}}\times \prod _{n\in J_{v}}\left( \left( \eta _{i}\right) ^{x_{vn}}e^{-\left( \eta _{i}\right) }\right) \\= & {} w_{vk}^{\alpha _{0}-1}e^{-\beta _{0}w_{vk}}\prod _{n\in J_{v}}\left( \left( \eta _{-i}+\eta _{k}\right) ^{x_{vn}}e^{-\left( \eta _{-i}+\eta _{k}\right) }\right) \end{aligned}$$
The above expression is not in a standard form, because $w_{vk}$ enters through the sum $\eta _{-i}+\eta _{k}$, so we introduce an auxiliary variable. For a single $n\in J_{v}$, let the probability $p(w_{vk})$ be proportional to the unnormalised function $p^{*}(w_{vk})$, which is given by the power term below and can be expanded by the binomial theorem as follows:
$$\begin{aligned} p^{*}(w_{vk})= & {} \left( \eta _{-i}+\eta _{k}\right) ^{x_{vn}}\\= & {} \sum _{j=0}^{x_{vn}}{x_{vn}\atopwithdelims ()j} \left( \eta _{k}\right) ^{j}\left( \eta _{-i}\right) ^{x_{vn}-j} \end{aligned}$$
Hence, we have
$$\begin{aligned} p(w_{vk})\propto & {} \left( \eta _{-i}+\eta _{k}\right) ^{x_{vn}} \end{aligned}$$
Now let $r_{vn}$ be an auxiliary variable. We aim to define a probability $p(w_{vk},r_{vn})$ proportional to $p^{*}(w_{vk},r_{vn})$ such that $\sum _{r_{vn}}p^{*}(w_{vk},r_{vn})=p^{*}(w_{vk})$. So let $p^{*}(w_{vk},r_{vn})={x_{vn}\atopwithdelims ()r_{vn}} \left( \eta _{k}\right) ^{r_{vn}}\left( \eta _{-i}\right) ^{x_{vn}-r_{vn}}$, where $r_{vn}\in \{0,1,2,\ldots ,x_{vn}\}$. Hence, we have
$$\begin{aligned} \sum _{r_{vn}=0}^{x_{vn}}p^{*}(w_{vk},r_{vn})= & {} \sum _{r_{vn}=0}^{x_{vn}}{x_{vn}\atopwithdelims ()r_{vn}} \left( \eta _{k}\right) ^{r_{vn}}\left( \eta _{-i}\right) ^{x_{vn}-r_{vn}}\\= & {} p^{*}(w_{vk}) \end{aligned}$$
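The marginalisation identity above is the binomial theorem, and it can be checked numerically. The sketch below does so for one data point; the values of `eta_k`, `eta_rest` (standing in for $\eta _{-i}$) and `x` are hypothetical, and the function names are illustrative only.

```python
from math import comb, isclose

def p_star_joint(eta_k, eta_rest, x, r):
    """Unnormalised joint term: C(x, r) * eta_k^r * eta_rest^(x - r)."""
    return comb(x, r) * eta_k**r * eta_rest**(x - r)

def p_star_marginal(eta_k, eta_rest, x):
    """Unnormalised marginal term: (eta_rest + eta_k)^x."""
    return (eta_rest + eta_k)**x

# Hypothetical values for a single observed entry x_vn.
eta_k, eta_rest, x = 0.7, 1.8, 5

# Summing the joint over r = 0..x should recover the marginal.
total = sum(p_star_joint(eta_k, eta_rest, x, r) for r in range(x + 1))
marginal = p_star_marginal(eta_k, eta_rest, x)
identity_holds = isclose(total, marginal)
```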
Additionally, we have
$$\begin{aligned} p(w_{vk}\mid r_{vn})= & {} \frac{p(w_{vk},r_{vn})}{p(r_{vn})}\\\propto & {} p(w_{vk},r_{vn})\\\propto & {} p^{*}(w_{vk},r_{vn})\\= & {} {x_{vn}\atopwithdelims ()r_{vn}} \left( \eta _{k}\right) ^{r_{vn}}\left( \eta _{-i}\right) ^{x_{vn}-r_{vn}}\\ p(r_{vn}\mid w_{vk})= & {} \frac{p(w_{vk},r_{vn})}{p(w_{vk})}\\\propto & {} \frac{p^{*}(w_{vk},r_{vn})}{p^{*}(w_{vk})}\\= & {} \frac{{x_{vn}\atopwithdelims ()r_{vn}} \left( \eta _{k}\right) ^{r_{vn}}\left( \eta _{-i}\right) ^{x_{vn}-r_{vn}}}{\left( \eta _{-i}+\eta _{k}\right) ^{x_{vn}}}\\= & {} {x_{vn}\atopwithdelims ()r_{vn}} \left( \frac{\eta _{k}}{\eta _{-i}+\eta _{k}}\right) ^{r_{vn}}\left( \frac{\eta _{-i}}{\eta _{-i}+\eta _{k}}\right) ^{x_{vn}-r_{vn}} \end{aligned}$$
Hence, the conditional distribution of $r_{vn}$ given $w_{vk}$ has the form of a binomial distribution. After substituting back the values of $\eta _{-i}$ and $\eta _{k}$, we sample the auxiliary variable, written $R_{vn}$ below, from this distribution and use it to approximate the binomial expansion as follows:
$$\begin{aligned} R_{vn}\sim & {} Binomial \left( x_{vn},\frac{z_{kn}f_{kn}w_{vk}}{\sum _{i\ne k}z_{in}f_{in}w_{vi}+z_{kn}f_{kn}w_{vk}}\right) ,~~\forall n\in J_{v} \end{aligned}$$

$$\begin{aligned} \left( \sum _{i\ne k}z_{in}f_{in}w_{vi}+z_{kn}f_{kn}w_{vk}\right) ^{x_{vn}}\propto & {} (z_{kn}f_{kn}w_{vk})^{R_{vn}} \end{aligned}$$
Hence, we have
$$\begin{aligned} p(w_{vk}\mid Z,F,X)\propto & {} w_{vk}^{\alpha _{0}-1}e^{-\beta _{0}w_{vk}}\prod _{n\in J_{v}}\left( (f_{kn}z_{kn}w_{vk})^{R_{vn}}e^{-\left( f_{kn}z_{kn}w_{vk}\right) }\right) \\\propto & {} w_{vk}^{\alpha _{0}+\sum _{n\in J_{v}}R_{vn}-1}e^{-(\beta _{0}+\sum _{n\in J_{v}}f_{kn}z_{kn})w_{vk}} \end{aligned}$$
The above expression is in gamma distribution form $w_{vk}\sim Gamma (\alpha _{0}',\beta _{0}')$, where
$$\begin{aligned} \alpha _{0}'= & {} \alpha _{0}+\sum _{n\in J_{v}}R_{vn}\\ \beta _{0}'= & {} \beta _{0}+\sum _{n\in J_{v}}f_{kn}z_{kn} \end{aligned}$$
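The update for $w_{vk}$ derived above can be sketched as a single Gibbs step: draw $R_{vn}$ for every observed $n\in J_{v}$, then draw $w_{vk}$ from the resulting gamma distribution. This is a minimal illustration and not the authors' implementation; it assumes NaN marks missing entries, that W is $V\times K$ and F, Z are $K\times N$, and the function name, toy matrices and hyperparameter values are all hypothetical. NumPy parameterises the gamma distribution by shape and scale, so the rate $\beta _{0}'$ is passed as `1 / rate`.

```python
import numpy as np

def gibbs_update_w_vk(X, W, F, Z, v, k, alpha0, beta0, rng):
    """Draw a new w_vk given Z, F and the observed entries of row v of X."""
    J_v = np.flatnonzero(~np.isnan(X[v, :]))      # children with x_vn observed
    x = X[v, J_v].astype(int)                     # observed counts x_vn
    fz_k = F[k, J_v] * Z[k, J_v]                  # f_kn * z_kn for n in J_v
    eta_k = fz_k * W[v, k]                        # contribution of factor k
    eta_all = W[v, :] @ (F[:, J_v] * Z[:, J_v])   # sum over all factors i
    # Binomial success probability eta_k / eta_all; set to 0 where eta_all is 0.
    p = np.divide(eta_k, eta_all, out=np.zeros_like(eta_k), where=eta_all > 0)
    p = np.clip(p, 0.0, 1.0)                      # guard against rounding error
    R = rng.binomial(x, p)                        # auxiliary counts R_vn
    shape = alpha0 + R.sum()                      # alpha_0'
    rate = beta0 + fz_k.sum()                     # beta_0'
    return rng.gamma(shape, 1.0 / rate)

rng = np.random.default_rng(0)
X = np.array([[3.0, np.nan, 1.0],
              [np.nan, 2.0, 0.0]])
W = np.array([[0.5, 1.0],
              [2.0, 0.3]])
F = np.array([[1.0, 0.4, 2.0],
              [0.6, 1.5, 0.2]])
Z = np.array([[1, 0, 1],
              [1, 1, 0]])
new_w = gibbs_update_w_vk(X, W, F, Z, v=0, k=0, alpha0=1.0, beta0=1.0, rng=rng)
```

Only the columns in $J_{v}$ enter either sum, so a missing $x_{vn}$ contributes to neither the shape nor the rate. If a row has no observed entries, the draw falls back to the prior $Gamma (\alpha _{0},\beta _{0})$.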
The posterior of $f_{kn}$ is similarly calculated as:
$$\begin{aligned} p(f_{kn}\mid Z,W,X)\propto & {} p(X\mid Z,F,W)p(f_{kn}\mid \alpha _{1},\beta _{1})\\= & {} \left( \prod _{v\in I_{n}}p(x_{vn}\mid \sum _{i=1}^{K}f_{in}z_{in}w_{vi})\right) Gamma (\alpha _{1},\beta _{1}) \end{aligned}$$
Hence, we have
$$\begin{aligned} p(f_{kn}\mid Z,W,X)\propto & {} f_{kn}^{\alpha _{1}-1}e^{-\beta _{1}f_{kn}}\prod _{v\in I_{n}}\left( (f_{kn}z_{kn}w_{vk})^{T_{vn}}e^{-\left( f_{kn}z_{kn}w_{vk}\right) }\right) \\\propto & {} f_{kn}^{\alpha _{1}+\sum _{v\in I_{n}}T_{vn}-1}e^{-(\beta _{1}+\sum _{v\in I_{n}}z_{kn}w_{vk})f_{kn}} \end{aligned}$$
The above expression is in gamma distribution form $f_{kn}\sim Gamma (\alpha _{1}',\beta _{1}')$, where
$$\begin{aligned} \alpha _{1}'= & {} \alpha _{1}+\sum _{v\in I_{n}}T_{vn}\\ \beta _{1}'= & {} \beta _{1}+\sum _{v\in I_{n}}z_{kn}w_{vk} \end{aligned}$$
and the auxiliary variable $T_{vn}$ is sampled similarly to $R_{vn}$ from
$$\begin{aligned} T_{vn}\sim & {} Binomial \left( x_{vn},\frac{f_{kn}z_{kn}w_{vk}}{\sum _{i\ne k}f_{in}z_{in}w_{vi}+f_{kn}z_{kn}w_{vk}}\right) ,~~\forall v\in I_{n} \end{aligned}$$
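
As a purely illustrative check of the update for $f_{kn}$, consider hypothetical values that are not taken from the paper: $\alpha _{1}=1$, $\beta _{1}=1$, three tasks of which the second is missing for child n, so that $I_{n}=\{1,3\}$, with $z_{kn}=1$, $w_{1k}=0.5$, $w_{3k}=2$, and sampled auxiliary counts $T_{1n}=2$ and $T_{3n}=4$. Then
$$\begin{aligned} \alpha _{1}'= & {} 1+(2+4)=7\\ \beta _{1}'= & {} 1+1\times (0.5+2)=3.5 \end{aligned}$$
so $f_{kn}$ would be drawn from $Gamma (7,3.5)$, whose mean under the shape and rate parameterisation used here is $\alpha _{1}'/\beta _{1}'=2$. The missing entry for $v=2$ contributes to neither sum.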

About this article

Cite this article

Vellanki, P., Duong, T., Gupta, S. et al. Nonparametric discovery and analysis of learning patterns and autism subgroups from therapeutic data. Knowl Inf Syst 51, 127–157 (2017). https://doi.org/10.1007/s10115-016-0971-7

Received: 17 April 2015
Revised: 11 April 2016
Accepted: 07 July 2016
Published: 25 July 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s10115-016-0971-7
