General matching quantiles M-estimation

https://doi.org/10.1016/j.csda.2020.106941

Abstract

Matching quantiles estimation (MQE) is a useful technique that allows one to find a linear combination of a set of random variables that matches the distribution of a target random variable. Since it is based on ordinary least-squares (OLS), it may be sensitive to outlier observations of the target random variable. A general matching quantiles M-estimation (MQME) method is thus proposed, which is resistant to outlier observations of the target random variable. Given that in most applications, the number of variables p may be large, a ‘sparse’ representation is highly desirable. The MQME is combined with the adaptive Lasso penalty so it can select informative variables. An iterative algorithm based on M-estimation is developed to compute MQME. The proposed matching quantiles M-estimate is consistent, just like the MQE. Extensive simulations are provided, in which efficient finite-sample performance of the new method is demonstrated. In addition, an illustrative real case study is presented.

Introduction

The matching quantiles estimation (MQE) method was first proposed by Sgouropoulos et al. (2015) as a way to address the problem of estimating representative portfolios for backtesting counterparty credit risks. The goal is to construct a representative portfolio such that its distribution matches that of the total counterparty portfolio. Instead of matching the two distributions directly, MQE aims to minimize the mean-squared difference between the quantiles of the two distributions across all levels.
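As a rough illustration of this criterion, the sample analogue compares the order statistics of the target with those of the fitted linear combination. The sketch below uses Python with NumPy; the function name `mqe_objective` and the use of the order statistics as the grid of quantile levels are assumptions made for illustration, not code from the paper.

```python
import numpy as np

def mqe_objective(beta, X, y):
    """Mean-squared difference between the sample quantiles of y and of X @ beta."""
    fitted = np.sort(X @ beta)   # order statistics of the linear combination
    target = np.sort(y)          # order statistics of the target
    return float(np.mean((target - fitted) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta_true = np.array([0.5, -1.0, 2.0])
# y has the same distribution as X @ beta_true but is not paired with the rows of X
y = rng.permutation(X @ beta_true)

loss_true = mqe_objective(beta_true, X, y)    # quantiles match exactly
loss_zero = mqe_objective(np.zeros(3), X, y)  # the zero portfolio matches poorly
```

Because only sorted values enter the objective, the observations of the target and of the candidate variables do not need to be paired, which is what distinguishes this criterion from an ordinary regression loss.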

The potential usefulness of matching quantiles methods extends beyond estimating representative portfolios. Sgouropoulos et al. (2015) pointed out that the MQE could also be applied to other (non-finance-related) contexts, such as in atmospheric sciences where measurements are not necessarily taken simultaneously. The idea of matching quantiles has been explored in other applications. Dominicy and Veredas (2013) introduced a method of simulated quantiles and used it to analyze twenty-two financial indexes. Li et al. (2010) developed an equidistant quantile-matching (EQM) method for bias correction of monthly precipitation and temperature field data published by the Intergovernmental Panel on Climate Change and reported that the method was more efficient than traditional direct distribution function mapping. More recent work by Srivastav et al. (2014) presents a methodology based on EQM for updating the intensity–duration–frequency (IDF) curves under climate change.

Although MQE achieves a high goodness of match (see Eq. (4.3) for the definition of a measure of goodness), it is sensitive to outliers because it optimizes an ordinary least-squares objective; the presence of outliers therefore degrades its performance. The classical robust statistics literature offers procedures that can be used to reduce this sensitivity, and M-estimation based procedures can play important and complementary roles in forming a more robust MQE. M-estimation, a maximum likelihood type estimation, was originally proposed in Huber (1964); it replaces the squared-error loss with a general discrepancy function ρ. Since a proper choice of this function can result in robustness against outliers, M-estimation has received considerable attention in the literature and has been applied to many fields of study. Some recent examples include the following: (1) Lambert-Lacroix and Zwald (2011) proposed an M-estimation method that combines Huber’s criterion with the Lasso penalty and is resistant to heavy-tailed errors or outliers in observations of the response variable; (2) Zhang et al. (2016) applied an adaptive Huber’s M-estimation to the cubature Kalman filter to handle abnormal measurement noise, which improved estimation accuracy, outlier-robustness, and reliability in simulation studies; (3) Ollila et al. (2016) introduced two penalized M-estimation methods for the problem of joint estimation of group covariance matrices.
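To make the role of the discrepancy function concrete, the following Python sketch compares three common choices: the squared (L2) loss, the absolute (L1) loss, and Huber's function in its usual textbook form. The scaling of each function and the default c = 1.345 are conventional choices assumed here for illustration; the paper selects c by cross-validation and may scale the functions differently.

```python
import numpy as np

def rho_l2(x):
    """Squared-error discrepancy; grows quadratically, so large residuals dominate."""
    return 0.5 * np.asarray(x, dtype=float) ** 2

def rho_l1(x):
    """Absolute-error discrepancy; grows linearly everywhere."""
    return np.abs(np.asarray(x, dtype=float))

def rho_huber(x, c=1.345):
    """Huber's function: quadratic on [-c, c], linear outside (assumed textbook form)."""
    x = np.asarray(x, dtype=float)
    a = np.abs(x)
    return np.where(a <= c, 0.5 * x**2, c * a - 0.5 * c**2)

residuals = np.array([0.1, 1.0, 10.0])   # the last value plays the role of an outlier
l2_vals = rho_l2(residuals)
l1_vals = rho_l1(residuals)
huber_vals = rho_huber(residuals, c=1.345)
```

For the residual of 10, the L2 loss is 50 while the Huber loss is about 12.5, which is the sense in which a suitable choice of ρ limits the influence of outlying observations.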

In this paper, we propose a general enhancement of MQE by replacing the OLS estimation with M-estimation. We show that in addition to being resistant to outliers, the proposed matching quantiles M-estimate, like MQE, is consistent. The proposed MQME can handle situations in which the sample size n and the number of candidate variables p are both large but the number of relevant variables is small, as is common in many modern problems. In such settings a ‘sparse’ matching quantiles estimate is highly desirable. Therefore, a sparse MQME is also developed by combining MQME with the adaptive Lasso penalty. As with the original MQME, we expect the ‘sparse’ variant to also be robust to outlier observations.
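One plausible way to write such a penalized criterion is sketched below in LaTeX. It assumes the standard adaptive Lasso form, in which each coefficient receives a weight built from an initial estimate; the 1/n scaling, the weights w_k, the exponent γ > 0, and the initial estimate β̃ are assumptions introduced for illustration and are not taken from the paper. Here Y_(j) is the j-th order statistic of the target observations and (β^T X)_(j) is the j-th order statistic of the fitted values.

```latex
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \frac{1}{n} \sum_{j=1}^{n}
      \rho\bigl( Y_{(j)} - (\beta^{T} X)_{(j)} \bigr)
    + \lambda \sum_{k=1}^{p} w_k \, |\beta_k| ,
\qquad
w_k = \frac{1}{|\tilde{\beta}_k|^{\gamma}} .
```

Under this form, coefficients with a small initial estimate receive a large weight and are pushed more strongly toward zero, which is how the penalty would select informative variables.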

The rest of this paper is organized as follows. In Section 2, we introduce the MQME method. We discuss its theoretical properties in Section 3. Numerical experiments of varying designs are explored in Section 4, followed by a real case study of the stock market index in Hong Kong during the period of 2013–2016 in Section 5. Finally, we draw our conclusions in Section 6. Proofs of all theoretical results are given in the Appendix.

The following notations will be used in subsequent sections:

  • $\mathbb{R}^p$ denotes the real $p$-dimensional space.

  • $\mathbf{X} = (X_1, X_2, \ldots, X_p)^T$ is a (column) vector of $p$ random variables. $\{\mathbf{X}_1, \ldots, \mathbf{X}_n\}$ is the set of the $n$ observations of $\mathbf{X}$. (Note that the boldface is used only if $p > 1$.)

  • $Y$ is the target random variable. $\{Y_1, Y_2, \ldots, Y_n\}$ is the set of the $n$ observations of $Y$.

  • $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^T$ is a $p$-dimensional regression coefficient vector.

  • For some generic random variable $\xi$, $\mathcal{L}(\xi)$, $F_\xi(\cdot)$ and $f_\xi(\cdot)$ respectively denote its distribution, distribution function and probability density function. In addition, $Q_\xi(\alpha)$ denotes its $\alpha$th quantile, i.e., $P\{\xi \le Q_\xi(\alpha)\} = \alpha$, for $\alpha \in [0,1]$.

  • For some generic collection of $n$ samples $\{\xi_1, \xi_2, \ldots, \xi_n\}$ of a random variable $\xi$, let the corresponding order statistics be $\xi_{(1)} \le \xi_{(2)} \le \cdots \le \xi_{(n)}$. Let $F_{n,\xi}(x) = n^{-1} \sum_{i=1}^{n} I\{\xi_i \le x\}$ be the empirical distribution function, and denote its $\alpha$th quantile by $Q_{n,\xi}(\alpha)$, for $\alpha \in [0,1]$.

  • Denote the convergence in probability by ‘$\xrightarrow{p}$’ and the convergence almost surely by ‘$\xrightarrow{\text{a.s.}}$’.
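The empirical distribution function and its quantile can be computed directly from the order statistics. The Python sketch below is one way to do so; it takes the empirical quantile to be the smallest order statistic at which the empirical distribution function reaches α, which is a common convention assumed here because the exact definition is not spelled out above. The function names are illustrative.

```python
import numpy as np

def ecdf(sample, x):
    """Empirical distribution function: the fraction of observations <= x."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, x, side="right") / sample.size

def empirical_quantile(sample, alpha):
    """Smallest order statistic whose empirical distribution value is >= alpha."""
    sample = np.sort(np.asarray(sample, dtype=float))
    n = sample.size
    k = max(int(np.ceil(alpha * n)), 1)   # 1-based rank; alpha = 0 maps to the minimum
    return sample[k - 1]

xi = [3.0, 1.0, 4.0, 1.5, 9.0]
f_at_3 = ecdf(xi, 3.0)                  # three of the five observations are <= 3
q_half = empirical_quantile(xi, 0.5)    # third order statistic of the sorted sample
```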


The methods

In this section, we provide details of the proposed MQME method. In Section 2.1, we formally introduce MQME. In Section 2.2, we propose the sparse MQME. In Section 2.3, we present an iterative algorithm for computing both MQME and sparse MQME. In Section 2.4, we discuss the selection of tuning parameters.
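The details of the iterative algorithm are not reproduced in this excerpt. The Python sketch below shows one plausible scheme under stated assumptions: it alternates between re-pairing the sorted target values with the rows of X ranked by their current fitted values and taking one reweighted least-squares step with Huber weights. The function names, the OLS starting value, the default c = 1.345, and the stopping rule are all hypothetical, and the adaptive Lasso step of the sparse variant is omitted.

```python
import numpy as np

def huber_weights(r, c):
    """Reweighting factors for Huber's function: 1 inside [-c, c], c/|r| outside."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def quantile_huber_loss(beta, X, y, c=1.345):
    """Average Huber discrepancy between sorted targets and sorted fitted values."""
    r = np.sort(y) - np.sort(X @ beta)
    a = np.abs(r)
    return float(np.mean(np.where(a <= c, 0.5 * r**2, c * a - 0.5 * c**2)))

def mqme_huber(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Hypothetical sketch: re-pair quantiles, then take one weighted LS step."""
    y_sorted = np.sort(y)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # crude starting value (assumption)
    for _ in range(n_iter):
        order = np.argsort(X @ beta)   # rank rows of X by current fitted value
        X_ord = X[order]               # j-th row now pairs with the j-th smallest y
        w = huber_weights(y_sorted - X_ord @ beta, c)
        sw = np.sqrt(w)
        # Weighted least squares via rescaling rows by the square-root weights
        beta_new = np.linalg.lstsq(sw[:, None] * X_ord, sw * y_sorted, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=300)
y[:5] += 50.0                          # a few gross outliers in the target
beta_hat = mqme_huber(X, y)
```

With Gaussian columns as in this toy example, the distribution of the linear combination depends on the coefficients only through their norm, so the sketch should be judged by how well the quantiles match rather than by whether the generating coefficients are recovered.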

Theoretical properties

In this section, the convergence of the aforementioned iterative algorithm, and the statistical properties of the matching quantiles M-estimate are presented. Before proceeding, we make the following assumptions.

(A1) $\rho(x)$ is a convex function satisfying $\rho(x) \ge \rho(0) = 0$, and is Lipschitz continuous, that is, there exists a constant $M \ge 0$ such that for any $x_1, x_2 \in \mathbb{R}$, $|\rho(x_1) - \rho(x_2)| \le M |x_1 - x_2|$.

(A2) For any 0<τ0<τ1<12, there exists Ωn such that (i) infQξ(α)Ωnfξ(Qξ(α))=n(τ1τ0); (ii) supQξ(α)Ωn|fξ(Qξ

Simulation study

The simulations are performed under different scenarios, with and without outliers. The L2, L1, and Huber discrepancy functions are chosen for comparison. The tuning parameters are chosen by five-fold cross-validation. For convenience, the MQME methods based on the Huber ρc (c>0), L1, and L2 discrepancy functions are abbreviated in this section as HUBER, LAD, and LS, respectively.

A real case study

In this section, a real example is considered to investigate the performance of (sparse) MQME. Our purpose is to assemble a representative portfolio of different securities that matches various characteristics of a benchmark index. The function ρ(·) is chosen to be the L2, L1, or Huber discrepancy function. The tuning parameters c and λ are chosen via five-fold cross-validation.

We apply the proposed method to the stock market index in Hong Kong during the period of 2013–2016. These data are

Conclusions

In this paper, we extend the MQE to the more general MQME, which is resistant to outlier observations. Integrating MQME with the adaptive Lasso penalty encourages sparsity in the estimate. Since MQME does not admit an explicit solution, we propose an iterative algorithm to compute it. The consistency of the matching quantiles M-estimate is investigated under assumptions that are weaker than those made for MQE in Sgouropoulos et al. (2015). We demonstrate the effectiveness of MQME through

Acknowledgments

This work is supported by Natural Sciences and Engineering Research Council of Canada (RGPIN-2017-05720). The first author also gratefully acknowledges the financial support from the China Scholarship Council (Grant No. 201506180073). The authors would like to thank the associate editor and the anonymous reviewer for the critical comments and constructive suggestions which have led to the improvement of this article.
