Abstract
We test whether a more principled treatment of delay estimation in lensed photon streams, compared with the standard kernel estimation method, can yield more accurate (less biased) and/or more stable (lower variance) estimates. To that end, we propose a delay estimation method that imposes a single latent inhomogeneous Poisson process underlying the lensed photon streams. The rate function model is formulated as a linear combination of nonlinear basis functions. Such a unifying rate function is then used in delay estimation based on the corresponding Innovation Process. This method is compared with a more straightforward and less principled baseline method based on kernel estimation of the rate function. Somewhat surprisingly, the overall emerging picture is that the theoretically more principled method does not bring much practical benefit in terms of the bias/variance of the delay estimation. This is in contrast to our previous findings on daily flux data.
1 Introduction
Time delays between images of strongly lensed distant variable sources can serve as a valuable tool for cosmography, provided that the time delays between the image fluxes can be accurately measured (e.g. [8, 16]). A number of methods have been developed to accurately estimate time delays. These include the dispersion spectra method [3] and a kernel-based method with variable width (K-V) [4, 5]. Actively studied strongly lensed quasars with time-delay measurements include RXJ1131−1231 [21], B1608+656 [6, 8] and Q0957+561 (e.g. [9]). Available data are usually in the form of daily measurements, which can be used to estimate longer delays (days and months). Current methods in astrophysics are rooted solely in this scenario. However, when confronting the problem of shorter delays (e.g. hours), daily measurements are insufficient and one needs to investigate the individual arrival times of photons.
A Poisson process can be applied as a model for photon streams [15]. To resolve the delay in gravitationally lensed photon streams, one can use standard kernel-based estimation of the inhomogeneous Poisson process rate function on the individual photon streams and then time-shift the rate function estimates so that their overlap is maximized. A more principled alternative is to impose the constraint that the delayed photon streams come from the same source, so that we simply observe different realizations of the same inhomogeneous Poisson process, gravitationally delayed in time. We study whether, compared with the standard kernel-based baseline, such a principled approach can bring benefits in terms of more stable (lower variance) estimation.
Normally, delay estimation would be done over streams of photons from a given energy band and then unified over a multitude of energy bands. The baseline and principled delay estimation methods are compared in a controlled experimental setting using synthetic photon fluxes with a known imposed delay, generated from a variety of inhomogeneous processes assumed to come from a single energy band. To the best of our knowledge, this is the first systematic study that addresses the problem of delay estimation on lensed photon streams. We do not perform experiments on real data, since no large real photon streams from known delayed systems with short time delays are available. Nevertheless, this study serves as a proof of concept and can be readily used once appropriate lensed photon streams become available.
2 Kernel Based Delay Estimation in Lensed Photon Streams
For the sake of simplicity we will deal with the case of two lensed photon streams A and B from the same source. All techniques presented in this paper can be easily generalized to multiple streams. We assume that the observed photon streams can be accounted for by a Poisson process (e.g. [18]). In the non-homogeneous Poisson process (NHPP) the mean rate function \(\lambda (s)\) varies over time s. Given a series of arrival times \(s_1, s_2, ..., s_S\) over an interval [0, T], the rate function is commonly estimated by imposing a (Gaussian) kernel of width r on top of each arrival time \(s_i\),
The rate function estimate (up to scaling) is then [12–14]
We will refer to this method as Kernel Rate Estimation (KRE).
Suppose that we observe two lensed photon streams \(\{s^A_i\}_{i=1}^{S^A}\) and \(\{s^B_i\}_{i=1}^{S^B}\) from the same source. On each stream we produce a kernel based estimate of the rate function \(\hat{\lambda }^A(s)\), \(\hat{\lambda }^B(s)\). Given a suggested time delay \(\varDelta \), the closeness of the rate estimates (under the delay \(\varDelta \)) can be evaluated e.g. through the mean square difference evaluated on a regular grid of time stamps \(\{z_j\}_{j=1}^Z \) in the relevant time interval,
The delay is then estimated through minimization of \(d_2(\hat{\lambda }^A,\hat{\lambda }^B; \varDelta )\) with respect to \(\varDelta \) (e.g. via gradient descent). In the following sections we will introduce two variants of delay estimation based on the innovation process corresponding to the underlying Poisson process.
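A minimal sketch of this KRE pipeline follows. The function names, the unnormalized Gaussian kernel, and the grid search (standing in for the gradient descent mentioned above) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def kre_rate(arrivals, r):
    """Kernel rate estimate (up to scaling): a Gaussian kernel of
    width r placed on top of each arrival time."""
    arrivals = np.asarray(arrivals, dtype=float)
    def lam(s):
        s = np.atleast_1d(np.asarray(s, dtype=float))
        # sum of Gaussian bumps centred at the arrival times
        return np.exp(-0.5 * ((s[:, None] - arrivals[None, :]) / r) ** 2).sum(axis=1)
    return lam

def d2(lam_a, lam_b, delta, grid):
    """Mean square difference between the two rate estimates when
    stream B is shifted back by the candidate delay delta."""
    return np.mean((lam_a(grid) - lam_b(grid + delta)) ** 2)

def kre_delay(stream_a, stream_b, r, deltas, grid):
    """Pick the candidate delay minimizing the d2 objective
    (a grid search standing in for gradient descent)."""
    lam_a, lam_b = kre_rate(stream_a, r), kre_rate(stream_b, r)
    costs = [d2(lam_a, lam_b, d, grid) for d in deltas]
    return deltas[int(np.argmin(costs))]
```

For a pure time shift of the same arrival times, the objective vanishes exactly at the true delay, so the grid search recovers it.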
3 Innovation Process Based Estimation (IPE)
Recall that if event counts can be modeled by a Poisson distribution with mean rate \(\lambda \), then the inter-arrival times are exponentially distributed with mean \(\lambda ^{-1}\). We denote the differences between two consecutive arrival times by \(d^A=\{d^A_i\}_{i=1}^{D^A}\) and \(d^B=\{d^B_i\}_{i=1}^{D^B}\), where \(d^A_i = s^A_{i+1}-s^A_i \) and \(d^B_i = s^B_{i+1}-s^B_i\), respectively. Our goal is to find a probabilistic model that maximizes the probability \(P(d ^A,d ^B |\lambda (s))\),
where \(P(d|\lambda ) = \lambda e^{-\lambda d}\). We impose a kernel based model on the common rate function underlying both streams (expressed for stream A):
with kernels of width \(r_o\), centered at \(c_j\), \(j=1,2,\ldots,J\), and the J free parameters \(w_j\) collected in the vector \(\varvec{w}\). \(K_g(s; {\varvec{c}}, r_o)\) is a vector of kernel evaluations \(K_g(s; c_j,r_o) \) at all centers \({\varvec{c}} = (c_1, c_2, ..., c_J)\). The rate function of stream B is a time-delayed (by \(\varDelta \)) version of the one for stream A:
We thus obtain
leading to the error functional (negative log likelihood),
We minimize E w.r.t. the two parameters \((\varvec{w}, \varDelta )\) via gradient descent. To that end we plug (5) and (6) into (8),
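Since Eqs. (5)–(8) are not reproduced here, the following sketch writes the negative log likelihood directly from \(P(d|\lambda ) = \lambda e^{-\lambda d}\), with the assumption that each inter-arrival time is scored using the rate evaluated at its left endpoint; `gauss_kernels`, `ipe_nll` and the clipping constant are illustrative choices:

```python
import numpy as np

def gauss_kernels(s, centres, r_o):
    """Matrix of kernel evaluations K_g(s; c_j, r_o) for each time in s."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    return np.exp(-0.5 * ((s[:, None] - centres[None, :]) / r_o) ** 2)

def ipe_nll(w, delta, sA, dA, sB, dB, centres, r_o):
    """Negative log likelihood of both streams under the shared rate
    lambda(s) = w . K(s), with stream B's rate delayed by delta.
    Each inter-arrival time d_i is scored with P(d|lam) = lam exp(-lam d),
    the rate being evaluated at the preceding arrival time (an assumption
    standing in for the elided Eqs. (5)-(8))."""
    lamA = gauss_kernels(sA[:-1], centres, r_o) @ w
    lamB = gauss_kernels(sB[:-1] - delta, centres, r_o) @ w
    lamA = np.clip(lamA, 1e-12, None)  # keep the log well defined
    lamB = np.clip(lamB, 1e-12, None)
    return (-(np.log(lamA).sum() + np.log(lamB).sum())
            + (lamA * dA).sum() + (lamB * dB).sum())
```

When stream B is an exact time-shifted copy of stream A, evaluating the likelihood at the true delay reproduces the same rate values on both streams, so the cost at the true delay is lower than at a wrong one.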
leading to,
where \(\varvec{1}\) is a vector of 1’s and
4 Parameters Initialization
Gaussian kernels have two parameters that need to be determined: the kernel centers \(\{c_j\}^{J}_{j=1}\) and the kernel width r. As explained above, in KRE the kernels are centered at each photon’s arrival time, whereas in IPE the centers \(c_j\) are uniformly distributed across the time period [0, T].
The kernel width determines the degree of smoothing of the underlying rate function. For KRE, we select the width by minimizing the mean integrated square error (MISE), as proposed by [19]. For IPE, the kernel width \(r_o\) is optimized using the cross-validation method proposed in [4, 10]. The algorithm partitions the data into 10 blocks of equal length \(\mathcal {L}\). The i-th validation set \(\varvec{\mathcal {V}}_i\), \(i = 1,2,\ldots,\mathcal {L}\), is obtained by collecting the i-th element of each block; the rest of the data forms the training set. We then fit our models on the training set and use the validation set \(\varvec{\mathcal {V}}_i\) to calculate the cost function E over a range of suggested width values \(r_o\in (L_{r_o},U_{r_o})\). This procedure is repeated \(\mathcal {L}\) times, once for each validation set. The chosen \(r_o\) is the one yielding the smallest average cost E across the folds.
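The fold construction above can be sketched as follows. Here `fit_and_cost` is a hypothetical callback standing in for fitting the rate model on the training set and returning the cost E on the validation set:

```python
import numpy as np

def cv_width(arrivals, widths, fit_and_cost, L=10):
    """Blocked cross-validation for the kernel width r_o: split the
    arrival times into consecutive blocks of length L, build the i-th
    validation set from the i-th element of every block, fit on the
    rest and average the cost over the L folds."""
    arrivals = np.asarray(arrivals, dtype=float)
    n_blocks = len(arrivals) // L
    blocks = arrivals[:n_blocks * L].reshape(n_blocks, L)
    avg_cost = []
    for r in widths:
        fold_costs = []
        for i in range(L):
            val = blocks[:, i]                          # i-th element of each block
            train = np.delete(blocks, i, axis=1).ravel()  # everything else
            fold_costs.append(fit_and_cost(np.sort(train), np.sort(val), r))
        avg_cost.append(np.mean(fold_costs))
    return widths[int(np.argmin(avg_cost))]
```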
The IPE weight vector \(\varvec{w}\) is initialized using the rate function estimates readily provided by the KRE model. However, the rate functions obtained by KRE on streams A and B need to be scaled to represent the underlying rate of the non-homogeneous Poisson process (see Note 1). Given the KRE-estimated rate functions on streams A and B, \(\hat{\lambda }^A(s)\) and \(\hat{\lambda }^B(s)\), respectively, the overall KRE rate function is their average
The scaling factor \(\vartheta \) is obtained by imposing the rate function \(\lambda (s)=\vartheta \hat{\lambda }(s)\) and minimizing (9) with respect to \(\vartheta \). Denoting \(\hat{\lambda }(s^A_i)d^A_i\) and \(\hat{\lambda }(s^B_i) d^B_i\) by \(q^A_i\) and \(q^B_i\), respectively, it can be shown that the minimum is obtained at
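Under the exponential inter-arrival likelihood of Sect. 3, this minimization admits a closed form. A sketch, assuming the cost (9) takes the form below (terms constant in \(\vartheta \) are absorbed into const):

```latex
E(\vartheta) = -\sum_{i=1}^{D^A}\log\!\big(\vartheta\,\hat{\lambda}(s^A_i)\big)
               + \vartheta\sum_{i=1}^{D^A} q^A_i
               -\sum_{i=1}^{D^B}\log\!\big(\vartheta\,\hat{\lambda}(s^B_i)\big)
               + \vartheta\sum_{i=1}^{D^B} q^B_i
             = -(D^A+D^B)\log\vartheta
               + \vartheta\Big(\textstyle\sum_i q^A_i + \sum_i q^B_i\Big)
               + \mathrm{const},
```

so setting \(\partial E / \partial \vartheta = 0\) gives

```latex
\vartheta \;=\; \frac{D^A + D^B}{\sum_{i} q^A_i + \sum_{i} q^B_i}.
```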
The IPE weights can then be set to match the rate function \(\lambda (s)\) by imposing a regular grid \((s_1, s_2, ..., s_N)\) on [0, T], evaluating the rate values on the grid, \(\varvec{{x}} = (\hat{\lambda }(s_1),\hat{\lambda }(s_2),\ldots,\hat{\lambda }(s_N) )^\intercal \), and solving
where \(\mathbf {K}\) is an \(N\times N\) matrix
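With the kernel centres placed at the grid points (as the \(N\times N\) shape of \(\mathbf {K}\) suggests), the initialization reduces to a linear solve. A sketch, using a least-squares solve as a robustness choice since \(\mathbf {K}\) can be ill-conditioned for wide kernels:

```python
import numpy as np

def init_weights(grid, rate_values, r_o):
    """Initialise the IPE weights so that w . K(s_n) matches the
    scaled KRE rate on a regular grid: solve K w = x, where
    K[n, j] = K_g(s_n; c_j, r_o) with centres at the grid points."""
    K = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / r_o) ** 2)
    w, *_ = np.linalg.lstsq(K, rate_values, rcond=None)
    return w
```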
5 Experiments
To test and compare the methodologies suggested above, we performed controlled experiments on synthetic data generated from non-homogeneous Poisson processes. From each non-homogeneous Poisson process we generated two series A and B of arrival times; series B was then time-shifted by a known delay \(\varDelta \).
The rate functions defining non-homogeneous Poisson processes were obtained by superimposing G Gaussian functions of fixed width \(r_g\) positioned on a regular grid \(\{ c_g \}_{g=1}^G\) in [0, T],
where \(w_g\in \mathbb {R}\) are the mixing weights generated randomly from a uniform distribution on \( [L_w,U_w]\). The kernel widths were set to a multiple of the kernel separation (the distance between two consecutive kernel centers) \(d_g\), \(r_g = \alpha _g \cdot d_g\). We used \(T=400\), \(G=80\), \(\alpha _g=3\), \(L_w =-1\) and \(U_w=1\). The synthetic rate functions were then rescaled to the interval [0, 2]. Given a rate function \(\lambda (s)\), the arrival times were generated using the thinning technique [2, 7, 11, 17, 20]. An example of a test rate function and the corresponding photon stream is shown in Fig. 1.
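The thinning technique can be sketched as follows (a standard Lewis–Shedler implementation [11], assuming an upper bound `lam_max` on the rate over [0, T]):

```python
import numpy as np

def thinning(lam, lam_max, T, rng):
    """Lewis-Shedler thinning: simulate a homogeneous Poisson process
    of rate lam_max on [0, T] and keep each candidate arrival t with
    probability lam(t) / lam_max."""
    arrivals, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / lam_max)   # next candidate arrival
        if t > T:
            break
        if rng.uniform() < lam(t) / lam_max:  # accept with prob lam(t)/lam_max
            arrivals.append(t)
    return np.array(arrivals)
```

For a constant rate the acceptance probability is one, so the expected number of arrivals on [0, T] is simply \(\lambda T\).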
Using this process, we generated two photon streams from the same rate function: \(\{s_{i}^{A}\}\) and \(\{s_{j}^{B}\}\), \(i=1,2,\cdots ,S^{A}\) and \(j=1,2,\cdots ,S^{B}\). To create a pair of time shifted streams, \(s^{B}\) is shifted in time by a delay \(\varDelta >0\)
To prepare the streams for experiments, we cut the two streams to ensure they have the same start and end point in time.
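The trimming step amounts to intersecting the two observation windows; a sketch (`trim` is an illustrative name):

```python
import numpy as np

def trim(stream_a, stream_b):
    """Cut both streams to the common observation window so that they
    share the same start and end point in time."""
    lo = max(stream_a[0], stream_b[0])    # latest start
    hi = min(stream_a[-1], stream_b[-1])  # earliest end
    keep = lambda s: s[(s >= lo) & (s <= hi)]
    return keep(stream_a), keep(stream_b)
```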
We performed controlled experiments, where 100 test rate functions were generated as described above. For each test rate function we imposed four delay values \(\varDelta \in \{20, 22, 25, 28\}\), resulting in 400 individual experiments. The time delay trial values were taken from the interval [10, 40] in increments of 10. For each model and each imposed delay \(\varDelta \in \{20, 22, 25, 28\}\), we report the mean \(\mu \) and standard deviation \(\sigma \) of the maximum-likelihood delay estimates across the set of 100 test rate functions. We also report the mean absolute error (MAE) of the delay estimates and the 95% credibility interval (CI). Summary results are presented in Table 1.
6 Conclusion
In this paper, we tested whether a more principled treatment of delay estimation in lensed photon streams, compared with the standard kernel estimation method, can yield more accurate (less biased) and/or more stable (lower variance) estimation. In particular, we formulated a baseline method (KRE) based on kernel estimation of the rate function of an inhomogeneous Poisson process. The delay estimate is refined using gradient descent on the error functional with respect to the delay parameter.
A more principled delay estimation relied on imposing a single latent inhomogeneous Poisson process underlying the lensed photon streams. The rate function model was formulated as a linear combination of nonlinear basis functions, thus making the non-linear model linear in the mixing parameters. We tested this idea in the Innovation Process Based Estimation (IPE).
Somewhat surprisingly, the overall emerging picture is that the theoretically more principled methods do not bring much practical benefit in terms of the bias/variance of the delay estimation. This is in contrast to our previous findings on daily flux data [1, 4, 5]. It appears that because the underlying latent rate function is represented only implicitly through the streams of arrival times, the stabilizing effect of the single unified intensity function that proved so useful in the case of daily flux data [1, 4, 5] is weakened. Indeed, in that case, knowing the amount of observational noise and observing noisy flux levels gave much better clues as to what the common source variability could be, thus stabilizing the delay estimation. Nevertheless, we propose that a study of this kind is useful and necessary for future developments of alternative methods for delay estimation in lensed photon streams.
Notes
- 1.
Note that for the delay detection task for which the KRE method is used, no such scaling was needed: the delay estimate is invariant to scaling both estimated rate functions by the same factor.
References
Al Otaibi, S., Tiňo, P., Cuevas-Tello, J.C., Mandel, I., Raychaudhury, S.: Kernel regression estimates of time delays between gravitationally lensed fluxes. MNRAS 459(1), 573–584 (2016)
Bratley, P., Fox, B., Schrage, L.E.: A Guide to Simulation, 2nd edn. Springer, New York (1987)
Courbin, F., Chantry, V., Revaz, Y., Sluse, D., Faure, C., Tewes, M., Eulaers, E., Koleva, M., Asfandiyarov, I., Dye, S., Magain, P., van Winckel, H., Coles, J., Saha, P., Ibrahimov, M., Meylan, G.: COSMOGRAIL: the COSmological MOnitoring of GRAvItational Lenses IX. Time delays, lens dynamics and baryonic fraction in HE 0435–1223. Astron. Astrophys. 536, A53 (2011)
Cuevas-Tello, J.C., Tiňo, P., Raychaudhury, S.: How accurate are the time delay estimates in gravitational lensing? Astron. Astrophys. 454, 695–706 (2006)
Cuevas-Tello, J.C., Tiňo, P., Raychaudhury, S., Yao, X., Harva, M.: Uncovering delayed patterns in noisy and irregularly sampled time series: an astronomy application. Pattern Recogn. 43(3), 1165–1179 (2009)
Fassnacht, C.D., Xanthopoulos, E., Koopmans, L.V.E., Rusin, D.: A determination of H\(_{0}\) with the CLASS gravitational lens B1608+656 III. A significant improvement in the precision of the time delay measurements. Astrophys. J. 581, 823–835 (2002)
Fathi-Vajargah, B., Khoshkar-Foshtomi, H.: Simulating nonhomogeneous Poisson point process based on multi criteria intensity function and comparison with its simple form. J. Math. Comput. Sci. (JMCS) 9(3), 133–138 (2014)
Greene, Z.S., Suyu, S.H., Treu, T., Hilbert, S., Auger, M.W., Collett, T.E., Marshall, P.J., Fassnacht, C.D., Blandford, R.D., Bradač, M., Koopmans, L.V.E.: Improving the precision of time-delay cosmography with observations of galaxies along the line of sight. Astrophys. J. 768(1), 39 (2013)
Hainline, L.J., Morgan, C.W., Beach, J.N., Kochanek, C.S., Harris, H.C., Tilleman, T., Fadely, R., Falco, E.E., Le, T.X.: A new microlensing event in the doubly imaged Quasar Q 0957+561. Astrophys. J. 744(2), 104 (2012)
Hastie, T., Tibshirani, R., Friedman, J., Franklin, J.: The elements of statistical learning: data mining, inference and prediction. Math. Intell. 27(2), 83–85 (2005)
Lewis, P.A., Shedler, G.S.: Simulation of nonhomogeneous Poisson processes by thinning. Nav. Res. Logistics Q. 26(3), 403–413 (1979)
Nawrot, M., Aertsen, A., Rotter, S.: Single-trial estimation of neuronal firing rates: from single-neuron spike trains to population activity. J. Neurosci. Meth. 94(1), 81–92 (1999)
Park, B.U., Marron, J.S.: Comparison of data-driven bandwidth selectors. J. Am. Stat. Assoc. 85(409), 66–72 (1990)
Parzen, E.: On estimation of a probability density function and mode. Ann. Math. Stat. 33(3), 1065–1076 (1962)
Rasch, G.: The Poisson process as a model for a diversity of behavioral phenomena. In: International Congress of Psychology, vol. 2, p. 2 (1963)
Refsdal, S.: On the possibility of determining Hubble’s parameter and the masses of galaxies from the gravitational lens effect. MNRAS 128, 307 (1964)
Ross, S.M.: Introduction to Probability Models. Academic Press, Boston (2014)
Rubinstein, R.Y., Kroese, D.P.: Simulation and the Monte Carlo Method, vol. 707. Wiley, New York (2011)
Shimazaki, H., Shinomoto, S.: Kernel bandwidth optimization in spike rate estimation. J. Comput. Neurosci. 29(1–2), 171–182 (2010)
Sigman, K.: Poisson processes and compound (batch) Poisson processes. Lecture Notes. Columbia University, USA (2007). http://www.columbia.edu/ks20/4703-Sigman/4703-07-Notes-PP-NSPP.pdf
Suyu, S.H., Auger, M.W., Hilbert, S., Marshall, P.J., Tewes, M., Treu, T., Fassnacht, C.D., Koopmans, L.V.E., Sluse, D., Blandford, R.D., Courbin, F., Meylan, G.: Two accurate time-delay distances from strong lensing: implications for cosmology. Astrophys. J. 766(2), 70 (2013)
© 2016 Springer International Publishing AG
Otaibi, S.A., Tiňo, P., Raychaudhury, S. (2016). Probabilistic Modelling for Delay Estimation in Gravitationally Lensed Photon Streams. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2016. IDEAL 2016. Lecture Notes in Computer Science(), vol 9937. Springer, Cham. https://doi.org/10.1007/978-3-319-46257-8_59