LinkNet: capturing temporal dependencies among spatial regions

Distributed and Parallel Databases

Abstract

Many applications require understanding how event occurrences at one geographical region affect or influence event occurrences at another region, e.g. spread of disease and forest fires. Existing works typically impose a grid to partition the spatial space and utilize spatial autocorrelation property to model the spatial dependency among the grid cells. However, they are often highly sensitive to the granularity of the grid size and they do not incorporate the temporal dynamics of the event occurrences among regions. This paper utilizes the notion of a spatial network with temporal dependency to capture the dynamics of event occurrences among regions. This network is modeled as a directed graph where each node is a group of spatially nearby events and each directed edge represents the influence of events from a source node to a destination node. We design an algorithm called LinkNet to generate this network from spatio-temporal event databases. LinkNet utilizes minimum description length based information–theoretic approach to automatically adjust the number of regions and the temporal relationships among regions. Two optimizations are devised to reduce the computational complexity of LinkNet. We also demonstrate how the proposed network can be used for hotspot prediction. Experiment results on both synthetic and real world datasets demonstrate the efficiency of LinkNet and the effectiveness of the network in predicting the next hotspots.

Notes

  1. http://episcangis.hygiene.uni-wuerzburg.de/

  2. Note that the time complexity of \(NodeCost(V,D)\) is \(O(|D|)\) rather than \(O(|V|)\). This is because the region discovery algorithm starts with \(|D|\) regions, and the number of regions is reduced by one in each iteration.

  3. http://episcangis.hygiene.uni-wuerzburg.de/

  4. http://data.octo.dc.gov//

Author information

Corresponding author

Correspondence to Dhaval Patel.

Appendix

This appendix provides the mathematical proofs of the theoretical results discussed in Sect. 4.

1.1 Appendix 1: Proof for Theorem 1

Proof

$$\begin{aligned}&\sum _{p \in R} Dist(p,R)^2\nonumber \\&\quad = \sum _{p \in R} \left( (p \cdot x-R \cdot x)^2 + (p \cdot y-R \cdot y)^2\right) \nonumber \\&\quad = \sum _{p \in R} \left( p \cdot x^2+p \cdot y^2\right) + \sum _{p \in R} \left( R \cdot x^2-2(p \cdot x)(R \cdot x)\right) \nonumber \\&\qquad + \sum _{p \in R} \left( R \cdot y^2-2(p \cdot y)(R \cdot y)\right) \nonumber \\&\quad = \sum _{p \in R} \left( p \cdot x^2+p \cdot y^2\right) + \left( |R|(R \cdot x^2)-2(R \cdot x)\sum _{p \in R}(p \cdot x)\right) \nonumber \\&\qquad +\left( |R|(R \cdot y^2)-2(R \cdot y)\sum _{p \in R}(p \cdot y)\right) \nonumber \\&\quad = \sum _{p \in R} \left( p \cdot x^2+p \cdot y^2\right) + \left( -|R|(R \cdot x^2)\right) + \left( -|R|(R \cdot y^2)\right) \nonumber \\&\quad = \sum _{p \in R} \left( p \cdot x^2 + p \cdot y^2\right) - |R| \left( R \cdot x^2 + R \cdot y^2\right) \nonumber \end{aligned}$$

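The fourth equality uses the fact that \(R\) is represented by the centroid of its points, so \(\sum _{p \in R}(p \cdot x) = |R|(R \cdot x)\) and \(\sum _{p \in R}(p \cdot y) = |R|(R \cdot y)\). This proves the theorem. \(\square \)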
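
The identity in Theorem 1 can be checked numerically. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes, as the proof does, that a region is represented by the centroid of its points.

```python
import random


def centroid(points):
    """Mean x and mean y of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def sse_direct(points):
    """Left-hand side of Theorem 1: sum of squared distances to the centroid."""
    cx, cy = centroid(points)
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)


def sse_closed_form(points):
    """Right-hand side of Theorem 1: sum(p.x^2 + p.y^2) - |R| (R.x^2 + R.y^2)."""
    cx, cy = centroid(points)
    return sum(x * x + y * y for x, y in points) - len(points) * (cx * cx + cy * cy)


rng = random.Random(0)  # fixed seed so the check is repeatable
region = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(50)]
difference = abs(sse_direct(region) - sse_closed_form(region))
```

The practical point of the closed form is that the sum of squares can be maintained from three running totals (the point count, the coordinate sums, and the sum of squared coordinates) without revisiting the points.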

1.2 Appendix 2: Proof for Eq. (1)

Proof

Recall, \(|V'|\) = \(|V|-1\).

$$\begin{aligned}&NodeGain(R_i, R_j) \nonumber \\&\quad = NodeCost(V,D) - NodeCost(V',D) \nonumber \\&\quad = \left( L(V) + L(D|V)\right) - \left( L(V') + L(D|V')\right) \nonumber \\&\quad = \left( \mathrm{log}{|V|}-\mathrm{log}{|V'|} + k|V| - k|V'| \right) \nonumber \\&\qquad + \left( |R_i|\mathrm{log}\frac{|D|}{|R_i|}+|R_j|\mathrm{log}\frac{|D|}{|R_j|}-|R_{ij}|\mathrm{log}\frac{|D|}{|R_{ij}|}\right) \nonumber \\&\qquad +\, \frac{1}{2\sigma ^2\mathrm{ln}2} \left( \sum _{p \in R_i} Dist(p,R_i)^2 + \sum _{p \in R_j} Dist(p,R_j)^2 - \sum _{p \in R_{ij}} Dist(p,R_{ij})^2\right) \nonumber \\ \end{aligned}$$
(5)

The sum of squares of the distances in Eq. (5) can be simplified as follows:

$$\begin{aligned}&\sum _{p \in R_i} Dist(p,R_i)^2 + \sum _{p \in R_j} Dist(p,R_j)^2 - \sum _{p \in R_{ij}} Dist(p,R_{ij})^2 \nonumber \\&\quad = \left( |R_{ij}|(R_{ij} \cdot x^2)-|R_{i}|(R_{i} \cdot x^2)-|R_{j}|(R_{j} \cdot x^2)\right) \nonumber \\&\qquad + \left( |R_{ij}|(R_{ij} \cdot y^2)-|R_{i}|(R_{i} \cdot y^2)-|R_{j}|(R_{j} \cdot y^2)\right) \nonumber \\&\quad =\left( \frac{\left( \sum _{p \in R_i} p \cdot x + \sum _{p \in R_j} p \cdot x\right) ^2}{|R_{ij}|}-|R_{i}|(R_{i} \cdot x^2)-|R_{j}|(R_{j} \cdot x^2)\right) \nonumber \\&\qquad +\left( \frac{\left( \sum _{p \in R_i} p \cdot y + \sum _{p \in R_j} p \cdot y\right) ^2}{|R_{ij}|}-|R_{i}|(R_{i} \cdot y^2)-|R_{j}|(R_{j} \cdot y^2)\right) \nonumber \\&\quad =\left( \frac{|R_i|^2|R_j|^2}{|R_{ij}|} \left( \frac{R_i \cdot x}{|R_j|}+\frac{R_j \cdot x}{|R_i|}\right) ^2 -|R_{i}|(R_{i} \cdot x^2)-|R_{j}|(R_{j} \cdot x^2) \right) \nonumber \\&\qquad +\left( \frac{|R_i|^2|R_j|^2}{|R_{ij}|} \left( \frac{R_i \cdot y}{|R_j|}+\frac{R_j \cdot y}{|R_i|}\right) ^2 -|R_{i}|(R_{i} \cdot y^2)-|R_{j}|(R_{j} \cdot y^2) \right) \nonumber \\&\quad =\frac{|R_i|^2|R_j|^2}{|R_{ij}|} \left( -\frac{R_i \cdot x^2}{(|R_i|)(|R_j|)}-\frac{R_j \cdot x^2}{(|R_i|)(|R_j|)}+\frac{2(R_i \cdot x)(R_j \cdot x)}{(|R_i|)(|R_j|)}\right) \nonumber \\&\qquad +\frac{|R_i|^2|R_j|^2}{|R_{ij}|} \left( -\frac{R_i \cdot y^2}{(|R_i|)(|R_j|)}-\frac{R_j \cdot y^2}{(|R_i|)(|R_j|)}+\frac{2(R_i \cdot y)(R_j \cdot y)}{(|R_i|)(|R_j|)}\right) \nonumber \\&\qquad = -\frac{|R_i||R_j|}{|R_{ij}|} \left( R_i \cdot x-R_j \cdot x\right) ^2 - \frac{|R_i||R_j|}{|R_{ij}|} \left( R_i \cdot y-R_j \cdot y\right) ^2 \nonumber \\&\qquad = -\frac{|R_i||R_j|}{|R_{ij}|} \left( (R_i \cdot x-R_j \cdot x)^2+(R_i \cdot y-R_j \cdot y)^2\right) \nonumber \\&\qquad =-\frac{|R_i||R_j|}{|R_{ij}|} Dist(R_i,R_j)^2 \nonumber \\&\qquad =-\frac{|R_i||R_j|}{|R_i|+|R_j|} Dist(R_i,R_j)^2 \end{aligned}$$
(6)
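
As a small worked check of Eq. (6), the Python sketch below compares the change in the sum of squared distances caused by merging two regions against the closed form \(-\frac{|R_i||R_j|}{|R_i|+|R_j|} Dist(R_i,R_j)^2\). The function names and the sample points are hypothetical, chosen only for illustration.

```python
def centroid(points):
    """Mean x and mean y of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def sse(points):
    """Sum of squared distances from each point to the region centroid."""
    cx, cy = centroid(points)
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)


def merge_sse_change(ri, rj):
    """Left-hand side of Eq. (6): SSE(R_i) + SSE(R_j) - SSE(R_ij)."""
    return sse(ri) + sse(rj) - sse(ri + rj)


def merge_sse_closed_form(ri, rj):
    """Right-hand side of Eq. (6), using only sizes and centroids."""
    (xi, yi), (xj, yj) = centroid(ri), centroid(rj)
    dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2
    return -len(ri) * len(rj) / (len(ri) + len(rj)) * dist_sq


r_i = [(0.0, 0.0), (2.0, 0.0)]  # centroid (1, 0), SSE = 2
r_j = [(4.0, 0.0)]              # centroid (4, 0), SSE = 0
lhs = merge_sse_change(r_i, r_j)       # 2 + 0 - 8
rhs = merge_sse_closed_form(r_i, r_j)  # -(2 * 1 / 3) * 9
```

For these points the merged region has centroid (2, 0) and a sum of squares of 8, so both sides equal -6: merging never decreases the total squared distance.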

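Substituting Eq. (6) into Eq. (5), and using \(|R_{ij}| = |R_i| + |R_j|\) (so that the \(\mathrm{log}|D|\) terms cancel) and \(k|V| - k|V'| = k\), we have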

$$\begin{aligned}&NodeGain(R_i, R_j)\nonumber \\&\quad =\left( \mathrm{log}\frac{|V|}{|V'|} + k \right) - \left( \frac{|R_i| |R_j| Dist(R_i,R_j)^2}{(2\sigma ^2\mathrm{ln}2) (|R_i| + |R_j|)}\right) \nonumber \\&\qquad +\left( |R_{ij}| \mathrm{log}(|R_{ij}|) - |R_i| \mathrm{log}(|R_i|) - |R_j| \mathrm{log}(|R_j|) \right) \end{aligned}$$
(7)

This proves the equation. \(\square \)
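
To illustrate what Eq. (7) buys in practice, the Python sketch below computes the gain of a merge in two ways: by recomputing the full cost before and after the merge, and by the closed form, which needs only the sizes and centroids of the two regions. This is a sketch under stated assumptions, not the authors' implementation: the cost terms in `node_cost` are read off Eq. (5), logarithms are taken to base 2, and the values of `k` and `sigma` as well as all function names are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def sse(points):
    cx, cy = centroid(points)
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)


def node_cost(regions, k, sigma):
    """L(V) + L(D|V), with each term taken from Eq. (5) (assumed form)."""
    num_events = sum(len(r) for r in regions)
    model_bits = math.log2(len(regions)) + k * len(regions)
    assign_bits = sum(len(r) * math.log2(num_events / len(r)) for r in regions)
    spread_bits = sum(sse(r) for r in regions) / (2 * sigma ** 2 * math.log(2))
    return model_bits + assign_bits + spread_bits


def node_gain(n_i, c_i, n_j, c_j, num_regions, k, sigma):
    """Closed form of Eq. (7) for merging regions of sizes n_i and n_j."""
    dist_sq = (c_i[0] - c_j[0]) ** 2 + (c_i[1] - c_j[1]) ** 2
    return (
        math.log2(num_regions / (num_regions - 1)) + k
        - n_i * n_j * dist_sq / (2 * sigma ** 2 * math.log(2) * (n_i + n_j))
        + (n_i + n_j) * math.log2(n_i + n_j)
        - n_i * math.log2(n_i)
        - n_j * math.log2(n_j)
    )


regions = [[(0.0, 0.0), (2.0, 0.0)], [(4.0, 0.0)], [(10.0, 10.0), (11.0, 11.0)]]
merged = [regions[0] + regions[1], regions[2]]  # merge the first two regions

direct = node_cost(regions, k=8.0, sigma=1.5) - node_cost(merged, k=8.0, sigma=1.5)
closed = node_gain(
    len(regions[0]), centroid(regions[0]),
    len(regions[1]), centroid(regions[1]),
    num_regions=len(regions), k=8.0, sigma=1.5,
)
```

The two values agree up to floating-point error, while the closed form avoids touching any individual event.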

1.3 Appendix 3: Proof for Theorem 3

We use the following lemma in the proof of Theorem 3 to show that \(maxNodeGain(R_i,R_j)\) is an upper bound on \(NodeGain(R_i,R)\).

Lemma 1

Let \(N\) be a positive integer and let \(x\) be a real variable. The function \(f:[1,\infty ) \rightarrow [1,\infty )\), \(f(x)=(N+x)\mathrm{log}(N+x)-x\mathrm{log}(x)-N\mathrm{log}N\), is monotonically increasing.

Proof

To prove that \(f\) is monotonically increasing, we show that the first derivative of \(f\) is greater than 0 for all \(x \ge 1\).

$$\begin{aligned} \frac{df(x)}{dx}&= (N+x)\left( \frac{1}{N+x}\right) +\mathrm{log}(N+x)(1)-x\left( \frac{1}{x}\right) -\mathrm{log}(x)\\&= 1+\mathrm{log}(N+x)-1-\mathrm{log}(x) \\&= \mathrm{log}(N+x)-\mathrm{log}(x) \\&= \mathrm{log}\left( \frac{N}{x}+1\right) \\&> 0 \end{aligned}$$

The last step holds because \(N \ge 1\) and \(x \ge 1\) give \(\frac{N}{x}+1 > 1\). (For a logarithm of base 2, each constant 1 above becomes \(\frac{1}{\mathrm{ln}2}\); the two constants still cancel.) This proves the lemma. \(\square \)
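
Lemma 1 can also be checked numerically. The short Python sketch below evaluates \(f\) on a grid of \(x\) values for a few choices of \(N\) and confirms that the values increase. It assumes base-2 logarithms; the function names and the grid are hypothetical.

```python
import math


def f(x, n):
    """f(x) = (N + x) log(N + x) - x log(x) - N log(N), with log base 2."""
    return (n + x) * math.log2(n + x) - x * math.log2(x) - n * math.log2(n)


def is_increasing(n, xs):
    """True if f(., n) is strictly increasing along the sorted grid xs."""
    values = [f(x, n) for x in xs]
    return all(a < b for a, b in zip(values, values[1:]))


xs = [1 + 0.5 * i for i in range(200)]  # grid on [1, 100.5]
checks = {n: is_increasing(n, xs) for n in (1, 2, 10, 1000)}
```

A grid check is not a proof, but it is a quick guard against sign errors when the bound is implemented.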

We now prove Theorem 3, which is discussed in Sect. 5.1.2.

Proof

Recall that \(max\_size\) is the maximum size of any region that appears at or after region \(R_j\) on the orderline. This implies \(max\_size \ge |R|\) for any such region \(R\). From Lemma 1 with \(N = |R_i|\),

$$\begin{aligned}&(|R_i|+max\_size) \mathrm{log}(|R_{i}|+max\_size) \\&\qquad -\left( |R_i| \mathrm{log}(|R_i|) + (max\_size) \mathrm{log}(max\_size)\right) \\&\quad \ge (|R_i|+|R|) \mathrm{log}(|R_{i}|+|R|) - \left( |R_i| \mathrm{log}(|R_i|) + (|R|) \mathrm{log}(|R|)\right) \end{aligned}$$

Since \(min\_size \le |R| \le max\_size\) and \(ndist_{ij} \le Dist(R_i,R)\), we have

$$\begin{aligned} \frac{(ndist_{ij}^2) (min\_size) |R_i|}{(2\sigma ^2 \mathrm{ln}2) (|R_i| + max\_size)} \le \frac{\left( Dist(R_i,R)^2\right) |R||R_i|}{(2\sigma ^2 \mathrm{ln} 2) (|R_i| + |R|)} \end{aligned}$$

Therefore,

$$\begin{aligned}&maxNodeGain(R_i,R_j) = \left( \mathrm{log}\frac{|V|}{|V|-1} + k \right) \\&\qquad +\,(|R_i|+max\_size) \mathrm{log}(|R_{i}|+max\_size) \\&\qquad -\,\left( |R_i| \mathrm{log}(|R_i|) + (max\_size) \mathrm{log}(max\_size)\right) \\&\qquad -\, \frac{(ndist_{ij}^2) (min\_size) |R_i|}{2\sigma ^2 \mathrm{ln} 2 (|R_i| + max\_size) } \\&\quad \ge \, \left( \mathrm{log}\frac{|V|}{|V|-1} + k \right) - \frac{|R||R_i|(Dist(R_i,R)^2)}{(2\sigma ^2 \mathrm{ln} 2) (|R_i| + |R|)} \\&\qquad +\, (|R_i|+|R|) \mathrm{log}(|R_{i}|+|R|) - \left( |R_i| \mathrm{log}(|R_i|) + (|R|) \mathrm{log}(|R|)\right) \\&\quad = NodeGain(R_i,R) \end{aligned}$$

This proves the theorem. \(\square \)
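
The bound of Theorem 3 can be sketched in Python as follows. This is an illustration under assumptions rather than the authors' code: `max_size` and `min_size` are taken to be the largest and smallest sizes among the candidate regions, `ndist` is taken to be a lower bound on the distance from \(R_i\) to any candidate, logarithms are base 2, and the candidate list and parameter values are made up.

```python
import math

LN2 = math.log(2)


def size_term(n_i, n_r):
    """(n_i + n_r) log(n_i + n_r) - n_i log(n_i) - n_r log(n_r), log base 2."""
    return (
        (n_i + n_r) * math.log2(n_i + n_r)
        - n_i * math.log2(n_i)
        - n_r * math.log2(n_r)
    )


def node_gain(n_i, n_r, dist, num_regions, k, sigma):
    """Closed-form gain of Eq. (7) from sizes and centroid distance."""
    return (
        math.log2(num_regions / (num_regions - 1)) + k
        - n_i * n_r * dist ** 2 / (2 * sigma ** 2 * LN2 * (n_i + n_r))
        + size_term(n_i, n_r)
    )


def max_node_gain(n_i, max_size, min_size, ndist, num_regions, k, sigma):
    """Upper bound of Theorem 3 on the gain of merging R_i with any candidate."""
    return (
        math.log2(num_regions / (num_regions - 1)) + k
        + size_term(n_i, max_size)
        - ndist ** 2 * min_size * n_i / (2 * sigma ** 2 * LN2 * (n_i + max_size))
    )


# Hypothetical candidates: (region size, distance from R_i to the region).
candidates = [(3, 2.0), (7, 2.5), (1, 4.0), (12, 6.0)]
n_i = 5
sizes = [size for size, _ in candidates]
bound = max_node_gain(
    n_i, max(sizes), min(sizes), min(d for _, d in candidates),
    num_regions=10, k=8.0, sigma=1.5,
)
gains = [node_gain(n_i, s, d, num_regions=10, k=8.0, sigma=1.5) for s, d in candidates]
```

One plausible use of such a bound is pruning: if `bound` does not exceed the best gain found so far, none of the remaining candidates needs to be evaluated exactly.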

1.4 Appendix 4: Proof for Eq. (3)

Proof

$$\begin{aligned}&EdgeCost(E,D) - EdgeCost(E',D) = \left( L(E) + L(Z|E)\right) - \left( L(E') + L(Z|E')\right) \nonumber \\&\quad = \left( \mathrm{log}{|E|}-\mathrm{log}{|E'|} + 2k|E| - 2k|E'| \right) \nonumber \\&\qquad + \left( |e_i|\mathrm{log}\frac{|E|}{|e_i|}+|e_j|\mathrm{log}\frac{|E|}{|e_j|}-|e_{ij}|\mathrm{log}\frac{|E|}{|e_{ij}|}\right) \nonumber \\&\qquad +\, \frac{1}{2\sigma ^2\mathrm{ln}2}\left( \sum _{z \in e_i} Dist(z,e_i)^2 + \sum _{z \in e_j} Dist(z,e_j)^2 - \sum _{z \in e_{ij}} Dist(z,e_{ij})^2 \right) \end{aligned}$$
(8)

We can simplify the sum of squares of distances in Eq. (8):

$$\begin{aligned} \sum _{z \in e_i} Dist(z,e_i)^2 + \sum _{z \in e_j} Dist(z,e_j)^2 - \sum _{z \in e_{ij}} Dist(z,e_{ij})^2 = -\frac{|e_i| |e_j|}{|e_i| + |e_j|} Dist(e_i,e_j)^2 \end{aligned}$$
(9)

By substituting Eq. (9) into Eq. (8), we have

$$\begin{aligned}&EdgeCost(E,D) - EdgeCost(E',D)\\&\quad = \left( \mathrm{log}\frac{|E|}{|E'|} + 2k\right) + \left( |e_{ij}|\mathrm{log}(|e_{ij}|) - |e_i| \mathrm{log}(|e_i|) - |e_j| \mathrm{log}(|e_j|)\right) \\&\qquad -\left( |e_i| |e_j| \frac{\left( Dist(e_i,e_j)^2\right) }{(2 \sigma ^2 \mathrm{ln}2)(|e_i| + |e_j|) } \right) \end{aligned}$$

This proves the equation. \(\square \)

About this article

Cite this article

Patel, D., Hsu, W. & Lee, M.L. LinkNet: capturing temporal dependencies among spatial regions. Distrib Parallel Databases 33, 165–200 (2015). https://doi.org/10.1007/s10619-014-7147-9

  • DOI: https://doi.org/10.1007/s10619-014-7147-9
