Abstract
Two new non-parametric tests based on continuous one-dimensional random projections are proposed. The first addresses central symmetry and the second addresses independence. Both tests are implemented for finite-dimensional and infinite-dimensional (functional) data sets, and both are distribution-free and universally consistent. Additionally, different techniques are proposed to improve the power of the tests. A simulation study comparing the new tests with existing ones gives promising results. An application to real data in Banach spaces is also presented.






Notes
https://archive.ics.uci.edu/ml/datasets/EEG+Database. A detailed description of the data set is available in Zhang et al. (1995). The database contains measurements from 64 electrodes placed on the subjects’ scalps and sampled at 256 Hz (3.9-msec epoch) for 1 sec. In this study, 9 triples of nodes were used; the corresponding 27 of the 64 electrodes are shown in Fig. 4. Each subject was exposed to either a single stimulus (S1) or two stimuli (S1 and S2). The stimuli were pictures of objects chosen from a picture set. Each observation consists of 256 measurements covering one second and is labeled with the subject id, the group and the sample. The EEG readings are expressed in microvolts. To remove noise, a spectral decomposition of the signals was performed using the fast Fourier transform. The association between the different triples is studied as shown in Fig. 4. To simplify the procedure, the remaining signals were not processed.
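The note does not say how the Fourier decomposition was used to remove noise. As a minimal sketch, assuming a simple low-pass filter that zeroes all components above a cutoff, one trial of 256 samples at 256 Hz could be processed as follows. The function name `fft_lowpass`, the 30 Hz cutoff and the synthetic signal are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def fft_lowpass(signal, fs=256.0, cutoff_hz=30.0):
    """Zero every Fourier component above cutoff_hz and transform back.

    cutoff_hz is an assumed value chosen for illustration only.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)

# Synthetic one-second trial: 256 samples at 256 Hz (one sample every 1/256 s).
fs = 256.0
t = np.arange(256) / fs
clean = np.sin(2 * np.pi * 10 * t)                   # 10 Hz component to keep
noisy = clean + 0.5 * np.sin(2 * np.pi * 100 * t)    # 100 Hz component to remove
denoised = fft_lowpass(noisy, fs=fs, cutoff_hz=30.0)
```

With a one-second window the frequency bins fall on whole hertz, so both synthetic components sit exactly on a bin and the filter separates them cleanly; real EEG traces would not be this tidy.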
References
Aki S (1993) On nonparametric tests for symmetry in \(R^m\). Ann Inst Stat Math 45:787–800
Albert P, Ratnasinghe D, Tangrea J, Wacholder S (2001) Limitations of the case-only design for identifying gene-environment interactions. Am J Epidemiol 154:687–693
Azzalini A, Valle AD (1996) The multivariate skew-normal distribution. Biometrika 83:715–726
Blough DK (1989) Multivariate symmetry via projection pursuit. Ann Inst Stat Math 41:461–475
Brandwein A, Strawderman W (1991) Generalizations of James–Stein estimators under spherical symmetry. Ann Stat 19:1639–1650
Cramér H, Wold H (1936) Some theorems on distribution functions. J Lond Math Soc 11:290–294
Cuesta-Albertos JA, Fraiman R, Ransford T (2006) Random projections and goodness of fit tests in infinite-dimensional spaces. Bull Braz Math Soc 37:477–501
Cuesta-Albertos JA, Fraiman R, Ransford T (2007) A sharp form of the Cramér–Wold theorem. J Theor Prob 20:201–209
Cuevas A, Fraiman R (2009) On depth measures and dual statistics. A methodology for dealing with general data. J Multivariate Anal 100:753–766
Dauwels J, Vialatte F, Cichocki A (2010) Diagnosis of Alzheimer’s disease from EEG signals: where are we standing? Curr Alzheimer Res 7:487–505
Dyckerhoff R, Ley C, Paindaveine D (2015) Depth-based runs test for bivariate central symmetry. Ann Inst Stat Math 67:917–941
Einmahl J, Gan Z (2016) Testing for central symmetry. J Stat Plann Inference 169:27–33
Fermanian JD (2005) Goodness-of-fit tests for copulas. J Multivariate Anal 95:119–152
Fermanian JD, Radulovic D, Wegkamp M (2004) Weak convergence of empirical copula processes. Bernoulli 10:847–860
Friedman JH, Stuetzle W (1981) Projection pursuit regression. J Am Stat Assoc 76:817–823
Friedman JH, Tukey JW (1974) A projection pursuit algorithm for exploratory data analysis. IEEE Trans Comput 23:881–890
Genest C, Quessy JF, Rémillard B (2007) Asymptotic local efficiency of Cramér–von Mises tests for multivariate independence. Ann Stat 35:166–191
Genest C, Rémillard B (2008) Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Ann Inst Henri Poincaré (B) Probab Stat 44:1096–1127
Hallin M, Paindaveine D (2002) Optimal tests for multivariate location based on interdirections and pseudo-Mahalanobis ranks. Ann Stat 30:1103–1133
Heathcote C, Rachev S, Cheng B (1995) Testing multivariate symmetry. J Multivariate Anal 54(1):91–112
Jones M, Sibson R (1987) What is projection pursuit? J R Stat Soc Ser A 150:1–36
Ley C (2010) Univariate and multivariate symmetry: statistical inference and distributional aspects. Ph.D. thesis, Université libre de Bruxelles
Marden J (1999) Multivariate rank test. In: Multivariate analysis, design of experiments, and survey sampling, chap 14. CRC Press, pp 401–432
Mason DM, Schuenemeyer JH (1983) A modified Kolmogorov–Smirnov test sensitive to tail alternatives. Ann Stat 11:933–946
Neuhaus G, Zhu L (1998) Permutation tests for reflected symmetry. J Multivariate Anal 67(2):129–153
Padgett WJ, Taylor RL (1973) Laws of large numbers for normed linear spaces and certain Fréchet spaces. Springer, Berlin
Puri ML, Sen PK (1971) Nonparametric methods in multivariate analysis. Wiley, New York, p 440
Sen PK, Chatterjee SK (1973) On Kolmogorov-Smirnov type test for symmetry. Ann Inst Stat Math 25:288–300
Sen PK, Puri ML (1967) On the theory of rank order tests for location in the multivariate one sample problem. Ann Math Stat 38:1216–1228
Serfling R (2006) Multivariate symmetry and asymmetry. In: Encyclopedia of statistical sciences, 2nd edn, vol 8. Wiley, pp 5338–5345
Shohat JA, Tamarkin JD (1943) The problem of moments. Mathematical Surveys and Monographs
Székely GJ, Rizzo ML (2013) The distance correlation t-test of independence in high dimension. J Multivariate Anal 117:193–213
Székely GJ, Rizzo ML, Bakirov N (2007) Measuring and testing dependence by correlation of distances. Ann Stat 35:2769–2794
Takács L (1967) Combinatorial methods in the theory of stochastic processes. Wiley, Hoboken
Wilks SS (1935) On the independence of k sets of normally distributed statistical variables. Econometrica 3:309–326
Zhang X, Begleiter H, Porjesz B, Wang W, Litke A (1995) Event related potentials during object recognition tasks. Brain Res Bull 38(6):531–538
Acknowledgements
The authors gratefully acknowledge the constructive comments of the referees, which helped to improve the quality of the paper significantly. This work was partially supported by grant ID2014-48, CSIC, Udelar.
Appendix.
Proof of Theorem 3
Let M and Q be the probability measures induced by the random elements \(\mathbf{X }\) and \(-\mathbf{X }\). Then
From Theorem 2, because \(\mathcal {E}(M,Q)\) has a positive \(\mu \)-measure, it can be concluded that \(Q=M\), namely, \(\mathbf{X }\) and \(-\mathbf{X }\) have the same distribution, so \(\mathbf{X }\) is a centrally symmetric random element.
Proof of Theorem 4
Given two probability measures M and Q on \(\textit{E} \times \textit{E}\) over the determining class of the half-spaces \(A=\{ (x,y) \in E \times E : h(x,y) \le t, h \in (\textit{E} \times \textit{E})^* \}\),
The maximum is used as the norm on \(\textit{E}\times \textit{E}\), namely
therefore,
So,
Thus, the moments are finite, and Carleman’s condition for the measure M holds.
Define the set \(\mathcal {E}(M,Q) = \{ h \in (E \times E)^* : Q_{h}=M_{h} \}\). Let \(h \in \mathcal {E}(\mathbf{X },\mathbf{Y })\). Now, setting \(f(\mathbf{X })= h(\mathbf{X },0)\) and \(g(\mathbf{Y })=h(0,\mathbf{Y })\), the linearity of h and the identity \((\mathbf{X },\mathbf{Y })=(\mathbf{X },0)+(0,\mathbf{Y })\) give \(h(\mathbf{X },\mathbf{Y })=f(\mathbf{X })+g(\mathbf{Y })\). Then,
Then, \(h \in \mathcal {E}(M,Q)\), implying that \(\mathcal {E}(M,Q)\) has a positive \(\mu \)-measure. By Theorem 2, it follows that \(M=Q\), and thus \(\mathbf{X }\) and \(\mathbf{Y }\) are independent.\(\square \)
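The decomposition \(h(\mathbf{X },\mathbf{Y })=f(\mathbf{X })+g(\mathbf{Y })\) suggests a practical recipe: project \(\mathbf{X }\) and \(\mathbf{Y }\) separately onto random directions and test the resulting real-valued pairs for independence. The sketch below is one possible instance for finite-dimensional data, not the authors' implementation. It assumes Gaussian random directions and swaps in a permutation test built on a Cramér–von Mises-type distance between the joint empirical distribution function and the product of the marginals. The names `bkr_statistic` and `projected_independence_pvalue` are hypothetical.

```python
import numpy as np

def bkr_statistic(u, v):
    """Sum of squared differences between the joint ECDF and the product
    of the marginal ECDFs, evaluated at the sample points."""
    le_u = u[:, None] >= u[None, :]      # le_u[i, j] is True when u_j <= u_i
    le_v = v[:, None] >= v[None, :]
    joint = np.mean(le_u & le_v, axis=1)
    product = le_u.mean(axis=1) * le_v.mean(axis=1)
    return float(np.sum((joint - product) ** 2))

def projected_independence_pvalue(X, Y, n_perm=200, seed=None):
    """Permutation p-value for independence of f(X) and g(Y), where f and g
    are random linear projections (Gaussian directions are an assumption)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    f = rng.standard_normal(X.shape[1])
    g = rng.standard_normal(Y.shape[1])
    u, v = X @ f, Y @ g
    observed = bkr_statistic(u, v)
    # Permuting v breaks any dependence while keeping both marginals fixed.
    perm = np.array([bkr_statistic(u, rng.permutation(v)) for _ in range(n_perm)])
    return float((1 + np.sum(perm >= observed)) / (n_perm + 1))

# Hypothetical usage: Y depends on X through its first two coordinates.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
Y = X[:, :2] + 0.5 * rng.standard_normal((100, 2))
p_value = projected_independence_pvalue(X, Y, n_perm=200, seed=1)
```

A single pair of directions can be unlucky, for example when the two projections happen to be nearly uncorrelated, which is one reason to combine several projections when power matters.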
Proof of Theorem 5
Under \(H_0\), \(\mathbf{X }_1\) is symmetric, so Theorem 1 implies that, for any \(h \in \textit{E}^*\), \(h(\mathbf{X }_1)\in \mathbb {R} \) is also symmetric and therefore fulfills the condition given in (11) for the null hypothesis in the one-dimensional case. Let \(h(\mathbf{Y }_1) \ge h(\mathbf{Y }_2) \ge \ldots \ge h(\mathbf{Y }_n)\) be the order statistics (sorted from largest to smallest) of the absolute values \(\left| h(\mathbf{X }_1) \right| , \left| h(\mathbf{X }_2) \right| , \ldots ,\left| h(\mathbf{X }_n) \right| \). Let \(t^{h}_{n,i}= F_{n}(-h(\mathbf{Y }_{i}))\). Then, \(0 \le t^{h}_{n,1} \le t^{h}_{n,2} \le \ldots \le t^{h}_{n,n} \le F(0)= 1/2\). Because F is continuous, ties do not occur a.s. and therefore,
Via the canonical transformation, one may define \(V^{h}_n(t)= n^{1/2}[G^{h}_n(t)-t] \) with \(0<t<1\), where \(G^{h}_n(t)= \frac{1}{n}\sum _{i=1}^{n} \mathbf 1 _{[-\infty , t )} \left( F^{h}(X_i) \right) \), and define
Then, \(\tilde{V}^{h}_n(t)\) is a stochastic process defined on (0, 1/2) having n jumps of 1 or \(-1\) at the points \(t^{h}_{n,1},t^{h}_{n,2}, \ldots , t^{h}_{n,n}\). Let \(p_{i,j}= P \left( h(\mathbf{Y }_{n-i+1})= \vert h(\mathbf{X }_j) \vert \right) \). Then
Let sg(.) stand for the sign function, and \(\vert \cdot \vert \) for the absolute value function. It is well known that the vectors
are independent under the null assumption.
Then, the jumps of \(n ^{1/2} \tilde{V}^{h}_n(t)\) at \(t^{h}_{n,1},t^{h}_{n,2}, \ldots , t^{h}_{n,n}\) are independent. Therefore, under \(H_0\), the statistics \(D^h_-(n)\) and \(D^h_+(n)\) follow the distribution of the maximum of a symmetric random walk of n steps started at the origin, and \(n D^{h}(n)\) follows the distribution of the maximum of the absolute values of the random walk (see Takács 1967).\(\square \)
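The argument above can be illustrated numerically. The following sketch assumes finite-dimensional data and reads the proof as saying that, once the observations are ordered by decreasing \(\vert h(\mathbf{X }_i) \vert \), the statistic reduces to the maximum absolute partial sum of their signs. The null distribution is approximated by Monte Carlo simulation of the random walk rather than by the closed-form expressions in Takács (1967). The function names and the Gaussian random direction are assumptions made for illustration.

```python
import numpy as np

def projected_symmetry_statistic(X, h):
    """Maximum absolute partial sum of the signs of h(X_i), with the
    observations ordered by decreasing |h(X_i)|."""
    proj = np.asarray(X, dtype=float) @ np.asarray(h, dtype=float)
    order = np.argsort(-np.abs(proj))        # largest |h(X_i)| first
    walk = np.cumsum(np.sign(proj[order]))   # partial sums of +1 / -1 steps
    return int(np.max(np.abs(walk)))

def random_walk_pvalue(stat, n, n_sim=5000, seed=None):
    """Monte Carlo estimate of P(max_k |S_k| >= stat) for a symmetric
    random walk S_1, ..., S_n with independent +1 / -1 steps."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, 1], size=(n_sim, n))
    max_abs = np.abs(np.cumsum(steps, axis=1)).max(axis=1)
    return float(np.mean(max_abs >= stat))

# Hypothetical usage on simulated data that is symmetric about the origin.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
h = rng.standard_normal(5)                   # random direction (assumed Gaussian)
stat = projected_symmetry_statistic(X, h)
p_value = random_walk_pvalue(stat, n=len(X), seed=1)
```

Because the null distribution depends only on n, the simulated random-walk maxima can be computed once and reused for any data set of the same size, which reflects the distribution-free property.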
Proof of Theorem 6
Write P for the distribution of X and Q for the distribution of \(-X\). The set \(\mathcal {E}(P,Q)\) has H-measure zero in \(\mathbb {R}^d\): if the H-measure were positive, then by Theorem 3, X and \(-X\) would have the same distribution, which contradicts the hypothesis.
For each \(h \in \mathcal {E}^{c}(P,Q)\), by Theorem 1, \(h(\mathbf{X })\) is not symmetric, so if we define
it holds that
There exists, under \(H_1\), \(t_h \in \mathbb {R}\) such that
namely,
By the Glivenko–Cantelli theorem, \(\sup _{x } \vert F^{h}_n(x)- F^{h}(x) \vert \mathop {\longrightarrow }\limits ^{a.s}0\), which entails
almost surely when \(n \rightarrow +\infty \).
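The Glivenko–Cantelli step can be checked numerically. The sketch below assumes, purely for illustration, that the projected variable is Uniform(0, 1), so that \(F^{h}(x)=x\) and the supremum distance has a simple closed form at the order statistics. The function name is hypothetical.

```python
import numpy as np

def sup_ecdf_distance_uniform(sample):
    """sup_x |F_n(x) - F(x)| when F is the Uniform(0, 1) distribution function."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    upper = np.arange(1, n + 1) / n - x   # F_n jumps to i/n at the i-th order statistic
    lower = x - np.arange(0, n) / n       # F_n equals (i-1)/n just before it
    return float(max(upper.max(), lower.max()))

rng = np.random.default_rng(0)
distances = {n: sup_ecdf_distance_uniform(rng.uniform(size=n))
             for n in (50, 5000, 500000)}
```

The distances shrink as n grows, which is what lets the empirical quantities in the proof be replaced by their population counterparts in the limit.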
Cite this article
Fraiman, R., Moreno, L. & Vallejo, S. Some hypothesis tests based on random projection. Comput Stat 32, 1165–1189 (2017). https://doi.org/10.1007/s00180-017-0732-4
DOI: https://doi.org/10.1007/s00180-017-0732-4