Description and Recognition of Activity Patterns Using Sparse Vector Fields

Portêlo, Ana; Cavallaro, Andrea; Barata, Catarina; Marques, Jorge S.

doi:10.1007/978-3-030-31332-6_21

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11867)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

Far-field activities represented as time series or trajectories can be summarized in compact representations of frequent patterns. Popular representations such as clustering or probabilistic modeling of trajectories often do not convey both the velocity and the direction of motion, which are by definition visually and quantitatively embedded in vector fields. However, common uses of vector fields may discard information about forbidden areas or regions with concurrent activity patterns. To address this problem, we present a non-iterative layered vector field estimation process that yields sparse vector field abstractions of activity patterns from groups of trajectories. The key feature of our approach is the estimate of the probability density function (PDF) of the target positions: it automatically tunes the cost function parameter and serves as weights in the sparse estimation problem. We also propose an algorithm that labels trajectories according to their activity patterns using the vector field abstractions. Experiments on synthetic and real trajectory data show that the proposed estimation approach yields correctly sparse vector fields that are similar to the known generating vector fields, and 5–12% higher labeling accuracy on test trajectories than other generative models. Outlier trajectories are also detected.

This work was partially supported by FCT under project SPARSIS - Sparse Modeling and Estimation of Vector Fields, contract PTDC/EEIPRO/0426/2014 and pluriannual funding UID/EEA/50009/2019 and UID/CEC/50021/2019 (INESC-ID).


1 Introduction

Far-field activities can be described by trajectories [4, 10], their spatial and angular features [1, 9], or by generative models [2, 6, 11, 12]. A high-level description of the spatial distribution of frequent activity patterns can support anomaly detection [8] and accessibility planning [5], and encode semantic regions of the scene [13].

In the literature, both diffusion and probabilistic models have been used to describe frequent activity patterns and to label trajectories according to activity patterns. On the one hand, diffusion models such as optical flow models have been used to describe coherent motions and semantic regions [14], and heat maps based on thermal diffusion processes have been used to capture the temporal motion information of activities [7]. On the other hand, probabilistic models such as Hidden Markov Models have been used to detect the points of interest in a scene [11], and Dirichlet Process Mixture Models have been used to robustly label new trajectories according to their activity patterns [6]. These studies focused either on attributing a single semantic meaning to specific spatial regions [11, 14] or on labeling trajectories according to their activity patterns [6, 7]. However, neither group provides straightforward information about the velocity and direction of the activity patterns, and only the latter can describe different global activity patterns in the same region of the scene. Typically, this type of motion information is associated with vector fields, which embed both the physical meaning of motion (i.e., velocity and direction) and the semantic interpretation of different regions of the scene [5, 12].

We propose to (i) describe multiple, complex activity patterns using layered vector fields and (ii) label test trajectories according to activity patterns using the estimated vector fields. Our approach extends previous studies in that: (a) it imposes data-driven sparsity on the vector field abstractions to prevent erroneous extrapolations in regions with no target data (in contrast to [5, 12]); (b) it is sensitive to concurrent activity patterns (in contrast to [11, 14]); and (c) it provides information on the velocity and direction of activity patterns (in contrast to [6, 7]).

The proposed vector field estimation uses a cost function specifically designed to yield sparse estimates. In contrast to other studies, which induce sparsity of the vector field estimates through the $l_1$-norm [3], this work does so through statistical conditioning on the available data: the estimated spatial Probability Density Function (PDF) of the target positions restricts the vector field estimates to the regions where targets are observed. The proposed cost function further benefits from automatic parameter tuning using target positions and trajectory features. Layered vector field abstractions can be obtained by applying the proposed approach to pre-clustered trajectories with similar activity patterns. For synthetic trajectories, we assess the accuracy of the vector field abstractions by comparing the estimated and generating vector fields, and the correct sparsity by comparing the estimated vector fields with the null vector field in regions with no target data. Vector field comparisons focus on the vector root mean square length (RMSL) and the vector similarity coefficient (R) [16].

Moreover, we propose an algorithm that labels trajectories according to activity patterns. The classification measure is the displacement error between each test trajectory and the trajectories generated using the estimated vector fields. In this way, test trajectories are sorted according to activity patterns or detected as outliers. We assess the accuracy of the trajectory labeling algorithm by comparing the attributed and the observed activity pattern labels of test trajectories.

2 Estimation of Multiple Vector Fields

We first aim to estimate the vector field, $\mathbf {T}$, that describes the activity patterns of a set of S trajectories, $\mathcal {X}= \{ \mathbf {x}_{1}, \dots , \mathbf {x}_{S} \}$. Let $t=1, \dots , L_s$ and the target position, $\mathbf {x}_s(t)$, in the image plane of a camera be driven by $\mathbf {T}$ according to

$$\begin{aligned} \mathbf {x}_s(t)= \mathbf {x}_s(t-1) + \mathbf {T}(\mathbf {x}_s(t-1)) + \mathbf {w}_s(t), \end{aligned} \tag{1}$$

where $\mathbf {w}_s(t) \sim \mathcal {N}(\mathbf {0}, \sigma ^2 \, \mathbf {I}), \forall \,t$, is a white random perturbation. Let the image plane be normalized, thus $\mathbf {x}_s(t) \in [0, 1]^2\, \forall \,t$.

The vector field, $\mathbf {T}:[0, 1]^2 \rightarrow \mathbb {R}^2$, is defined only at the nodes of a regular, uniform grid, $\mathcal {G}= \{ \mathbf {g}_n \in [0, 1]^2, n= 1, 2, \dots , N \}$, superimposed on the image plane. Because target positions can fall anywhere in the image plane, not only at grid nodes ($\mathbf {x}_s(t) \notin \mathcal {G}$), we use bilinear interpolation to represent the vector field that drives the target at any such position:

$$\begin{aligned} \mathbf {T}(\mathbf {x}_s(t)) = \sum _{n=1}^{N} \mathbf {\phi }_n(\mathbf {x}_s(t))\, \mathbf {t}_n , \end{aligned} \tag{2}$$

where $\mathbf {\phi }_n(\mathbf {x}_s(t))$ are the interpolation coefficients of the velocity vectors, $\mathbf {t}_n$, at the grid nodes. The matrix of interpolation coefficients for set $\mathcal {X}$ is $\mathbf {\varPhi }$.
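The interpolation in (2) can be illustrated with a minimal sketch. It assumes a square grid with `grid_side` nodes per axis, evenly spaced over $[0, 1]^2$ and indexed row by row; the function names and this node ordering are hypothetical choices for illustration and are not specified in the text.

```python
import numpy as np

def bilinear_coefficients(x, grid_side):
    """Return the length-N vector of coefficients phi_n(x) for x in [0, 1]^2.

    Assumption: N = grid_side**2 nodes, node n = row * grid_side + col
    is located at (col * h, row * h) with spacing h = 1 / (grid_side - 1).
    """
    h = 1.0 / (grid_side - 1)
    u, v = np.asarray(x, dtype=float) / h           # position in grid units
    col = min(int(np.floor(u)), grid_side - 2)      # lower-left node of the cell
    row = min(int(np.floor(v)), grid_side - 2)
    a, b = u - col, v - row                         # offsets inside the cell
    phi = np.zeros(grid_side * grid_side)
    phi[row * grid_side + col] = (1 - a) * (1 - b)
    phi[row * grid_side + col + 1] = a * (1 - b)
    phi[(row + 1) * grid_side + col] = (1 - a) * b
    phi[(row + 1) * grid_side + col + 1] = a * b
    return phi

def interpolate_field(T, x, grid_side):
    """Evaluate Eq. (2): T(x) = sum_n phi_n(x) t_n, with T of shape (2, N)."""
    return T @ bilinear_coefficients(x, grid_side)
```

At most four coefficients are non-zero for any position, so stacking these vectors column by column gives a sparse matrix $\mathbf {\varPhi }$.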

Vector field estimation corresponds to an optimization problem where $\mathbf {T}$ is the minimizer of a given cost function that has to induce data-driven sparsity. To impose sparsity on the vector field estimates in the regions where target data does not exist, the velocity vectors in $\mathbf {T}$ are weighted by one minus the spatial probability density function (PDF) of the target positions, i.e., $\mathcal {D}=\mathbbm {1}-\varGamma _p$, $\varGamma _p \in \mathbb {R}^{N}$. $\varGamma _p$ is estimated with the Parzen window algorithm over set $\mathcal {X}$ and then sampled at the grid nodes (Fig. 1).
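A minimal sketch of how the weights could be computed is given below. The Gaussian kernel, the `bandwidth` value, and the rescaling of the density so that its maximum equals 1 are all assumptions made for illustration; the text only states that a Parzen window estimate is evaluated at the grid nodes.

```python
import numpy as np

def parzen_weights(positions, grid_nodes, bandwidth=0.05):
    """Estimate Gamma_p at the grid nodes and return (Gamma_p, D = 1 - Gamma_p).

    positions:  (M, 2) array of target positions in [0, 1]^2
    grid_nodes: (N, 2) array of grid node coordinates
    bandwidth:  hypothetical Gaussian kernel width (not given in the text)
    """
    diff = grid_nodes[:, None, :] - positions[None, :, :]     # (N, M, 2)
    sq_dist = np.sum(diff ** 2, axis=-1)                      # (N, M)
    kernel = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    density = kernel.mean(axis=1) / (2 * np.pi * bandwidth ** 2)
    # ASSUMPTION: rescale so the values lie in [0, 1] and 1 - Gamma_p is a valid weight
    gamma_p = density / density.max()
    return gamma_p, 1.0 - gamma_p
```

With this choice, nodes far from any observed target receive a weight close to 1 and are penalized most strongly, while the most visited node receives a weight of 0.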

The cost function is therefore defined as

$$\begin{aligned} f(\mathbf {T}) = \Vert \mathbf {V} - \mathbf {T}\,\mathbf {\varPhi } \Vert _2^2 + \alpha \, \Vert \mathbf {T}\circ \mathbbm {1}\mathcal {D}^{\top } \Vert _2^2, \end{aligned} \tag{3}$$

where $\mathbbm {1}$ is of size $[2 \times 1]$, $\Vert \,.\, \Vert _2$ denotes the $l_2$-norm, "$\circ $" represents the Hadamard product, and $\mathbf {T} \in \mathbb {R}^{2\times N}$, $\mathbf {V} \in \mathbb {R}^{2\times M}$, $\mathbf {\varPhi } \in \mathbb {R}^{N\times M}$, with $M= \sum _{s=1}^{S}(L_s - 1)$, are given by

$$\begin{aligned} \mathbf {T}&= \begin{bmatrix} \mathbf {t}_1&\;\dots&\;\mathbf {t}_N \end{bmatrix}, \end{aligned} \tag{4}$$

$$\begin{aligned} \mathbf {V}&= \begin{bmatrix} \mathbf {v}_1(2) \dots \mathbf {v}_1(L_1)\;&|&\dots&|&\mathbf {v}_S(2) \dots \mathbf {v}_S(L_S) \end{bmatrix}, \end{aligned} \tag{5}$$

$$\begin{aligned} \mathbf {\varPhi }&= \begin{bmatrix} \begin{array}{c} \mathbf {\phi }_1(\mathbf {x}_1(1)) \dots \mathbf {\phi }_1(\mathbf {x}_1(L_1 -1)) \\ \vdots \\ \mathbf {\phi }_N(\mathbf {x}_1(1)) \dots \mathbf {\phi }_N(\mathbf {x}_1(L_1 -1)) \end{array} & \Bigg | & \dots & \Bigg | & \begin{array}{c} \mathbf {\phi }_1(\mathbf {x}_S(1)) \dots \mathbf {\phi }_1(\mathbf {x}_S(L_S -1)) \\ \vdots \\ \mathbf {\phi }_N(\mathbf {x}_S(1)) \dots \mathbf {\phi }_N(\mathbf {x}_S(L_S -1)) \end{array} \end{bmatrix}, \end{aligned} \tag{6}$$

with matrix $\mathbf {V}$ composed of the velocity vectors between consecutive target positions, i.e., $\mathbf {v}_s(t)= \mathbf {x}_s(t) - \mathbf {x}_s(t-1)$.
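Because the cost in (3) is quadratic in $\mathbf {T}$, its minimizer can be obtained without iterations. The sketch below is one possible way to do this, assuming the norms in (3) are taken over all matrix entries (Frobenius norms): setting the gradient to zero gives $\mathbf {T}\,(\mathbf {\varPhi }\mathbf {\varPhi }^{\top } + \alpha \,\text {diag}(\mathcal {D})^2) = \mathbf {V}\mathbf {\varPhi }^{\top }$. The function name `estimate_field` is hypothetical, and this is not necessarily the solver used by the authors.

```python
import numpy as np

def estimate_field(V, Phi, D, alpha):
    """Minimize ||V - T Phi||_F^2 + alpha * ||T diag(D)||_F^2 over T.

    V:   (2, M) velocity vectors
    Phi: (N, M) interpolation coefficients
    D:   (N,)   weights, D = 1 - Gamma_p
    Returns T of shape (2, N).
    """
    A = Phi @ Phi.T + alpha * np.diag(D ** 2)    # (N, N), symmetric
    B = V @ Phi.T                                # (2, N)
    # Solve T A = B, i.e. A T^T = B^T; lstsq also handles a singular A
    T_transposed, *_ = np.linalg.lstsq(A, B.T, rcond=None)
    return T_transposed.T
```

For nodes that receive no data, the corresponding rows of $\mathbf {\varPhi }$ are zero and the penalty term alone drives the estimated vectors to zero, which is the sparsity behavior described above.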

In Eq. (3), $\alpha $ is correlated with the grid resolution N. To avoid manual parameter input in every estimation procedure, we propose an automatic tuning based on the cardinality of the non-zero elements (i.e., $|\;\cdot \; |_{\ne 0}$), the expected value (i.e., $\mathbb {E}[\;\cdot \;]$), and the standard deviation (i.e., $\sigma (\,\cdot \,)$) of the estimated PDFs of target positions and trajectory features,

$$\begin{aligned} \alpha&= 1- \frac{|\varGamma _p |_{\ne 0}}{N}, \end{aligned}$$

(7)

$$\begin{aligned} N&= \max \big \{N_{\text {min}}, |\varGamma _c |_{\ne 0} > \mathbb {E}[\varGamma _c]+ 1.5\, \sigma (\varGamma _c)\big \}, \end{aligned}$$

(8)

where $\varGamma _p$ is the spatial PDF of the target positions as before; $\varGamma _c \in \mathbb {R}^N$ is the average curvature of the trajectories at the grid nodes, which is estimated using the velocity angles $\theta (t)= \tan ^{-1}\Big (\frac{y(t)-y(t-1)}{x(t)-x(t-1)}\Big )$ [1, 8]; and $N_{\text {min}}$ is the minimum grid resolution selected by the user. In (8), highly curved trajectories (i.e., extreme values of the distribution of $\varGamma _c$) define the grid resolution.
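As a small illustration of (7) and of the velocity angles, consider the sketch below. The tolerance `tol` used to decide which PDF values count as non-zero is a hypothetical parameter, and `np.arctan2` is substituted for the plain inverse tangent so that vertical steps and all four quadrants are handled; neither detail is specified in the text.

```python
import numpy as np

def tune_alpha(gamma_p, tol=1e-6):
    """Eq. (7): alpha = 1 - |Gamma_p|_{!=0} / N.

    tol is a hypothetical cutoff below which a PDF value is treated as zero.
    """
    gamma_p = np.asarray(gamma_p, dtype=float)
    return 1.0 - np.count_nonzero(gamma_p > tol) / gamma_p.size

def velocity_angles(trajectory):
    """Angle theta(t) of each step between consecutive positions, in radians."""
    steps = np.diff(np.asarray(trajectory, dtype=float), axis=0)
    return np.arctan2(steps[:, 1], steps[:, 0])
```

Under (7), a trajectory set that covers only a small fraction of the grid yields $\alpha $ close to 1, so the sparsity term carries more weight.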

Multiple vector fields can be estimated using (3) if it is applied to each set of pre-clustered trajectories ($\mathcal {X}_k$) with similar activity patterns, e.g. using multiple features [1]. In the following, we assume that the pre-clustering step has taken place and that we have access to the sets $\mathcal {X}_k$.

3 Activity Pattern Labeling

Our second aim is to label trajectories according to their activity patterns. To achieve this aim, we first estimate each vector field $\mathbf {T}_k$ following the approach above and using only trajectories from the training sets $\mathcal {X}_k$. Then, we propose the following labeling algorithm:

1. Trajectory labeling:
   (a) Generate trajectories from the starting point of a given test trajectory using the estimated $\mathbf {T}_k$ and (1);
   (b) Compute the displacement error as the Euclidean distance between the generated and the test trajectories;
   (c) Label each test trajectory with the activity pattern (vector field abstraction) that yields the smallest displacement error.
2. Outlier detection using threshold:
   (a) Compute the cutoff threshold as the sum of the median and the median absolute deviation (MAD) of the displacement errors obtained from steps 1.(a) and 1.(b) applied on a set of validation trajectories (Fig. 2);
   (b) Label test trajectories as outliers of the labeled activity pattern from step 1.(c) if their displacement error is above the threshold.

The cutoff threshold uses the median and the median absolute deviation, rather than the mean and the standard deviation, because the distribution of displacement errors is right-skewed.
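The labeling and outlier detection steps can be sketched as follows. Several details are assumptions made for illustration: each vector field is passed as a callable that returns the velocity at a position (for example, an interpolated $\mathbf {T}_k$), trajectories are generated without the noise term of (1), the displacement error is averaged over time, and a single threshold is shared by all activity patterns. All function names are hypothetical.

```python
import numpy as np

def generate_trajectory(start, field, length):
    """Noise-free rollout of Eq. (1): x(t) = x(t-1) + T(x(t-1))."""
    points = [np.asarray(start, dtype=float)]
    for _ in range(length - 1):
        points.append(points[-1] + field(points[-1]))
    return np.array(points)

def displacement_error(test_traj, field):
    """Mean Euclidean distance between the test and the generated trajectory."""
    test_traj = np.asarray(test_traj, dtype=float)
    generated = generate_trajectory(test_traj[0], field, len(test_traj))
    return float(np.mean(np.linalg.norm(generated - test_traj, axis=1)))

def cutoff_threshold(validation_errors):
    """Median plus median absolute deviation (MAD) of the validation errors."""
    errors = np.asarray(validation_errors, dtype=float)
    median = np.median(errors)
    return float(median + np.median(np.abs(errors - median)))

def label_trajectory(test_traj, fields, threshold):
    """Return (label, is_outlier): the best-matching field and the outlier flag."""
    errors = [displacement_error(test_traj, f) for f in fields]
    k = int(np.argmin(errors))
    return k, bool(errors[k] > threshold)
```

A trajectory always receives the label of the closest vector field; the threshold only decides whether that assignment is additionally flagged as an outlier.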

4 Experimental Results

4.1 Synthetic Data

Assessment Measures. Estimates of vector fields using synthetic data ($\mathbf {T}_{\text {est}}$) are assessed regarding both their accuracy when compared to the known generating vector field ($\mathbf {T}_{\text {ref}}$) and their correct sparsity when compared to the null vector field ($\mathbf {T}_{0}$) in regions where no target data is observed. Let each node of the superimposed grid be labeled according to its proximity to a given trajectory: it is an active node if it belongs to a square of nodes containing part of the trajectory, and a non-active node otherwise. Thus, the region where no target data is observed is defined as the set of non-active nodes in the image plane, $\mathcal {Z}$, with respect to a given trajectory set.

The assessment measures compare pairs of vectors regarding the vector similarity coefficient (R), i.e., the mean of the inner product of normalized vector pairs from 2 vector fields A and B, defined as [16],

$$\begin{aligned} R = \frac{1}{|\mathcal {P} |} \sum _{i \in \mathcal {P}} \hat{\mathbf {t}}^A_i \cdot \hat{\mathbf {t}}^B_i\,; \end{aligned} \tag{9}$$

and the vector root mean square length (RMSL), i.e., the systematic difference in the mean vector length, defined as [16],

$$\begin{aligned} \text {RMSL} = L_V^2 = \frac{1}{|\mathcal {P} |} \sum _{i \in \mathcal {P}} \Big \Vert \, \mathbf {t}^A_i - \mathbf {t}^B_i \, \Big \Vert _2^2\,; \end{aligned} \tag{10}$$

where "$\cdot $" is the inner product, $\hat{\mathbf {t}} = \frac{\mathbf {t}}{\sqrt{\Vert \mathbf {t} \Vert _2^2}}$, $\Vert \,.\, \Vert _2$ represents the $l_2$-norm of a vector, and $|\,.\, |$ represents the cardinality of a set. For the accuracy assessment, $A= \mathbf {T}_{\text {ref}}$, $B= \mathbf {T}_{\text {est}}$, and $\mathcal {P} = \mathcal {G}$, the set of grid nodes. For the correct sparsity assessment, $A= \mathbf {T}_{0}$, $B= \mathbf {T}_{\text {est}}$, and $\mathcal {P} = \mathcal {Z}$, the set of non-active nodes. The optimal values for these measures are (R, RMSL)$= (1,1)$ for the accuracy assessment and RMSL$=0$ for the sparsity assessment.
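The two measures can be computed directly from (9) and (10) as written. In the sketch below, both fields are stored as arrays of shape (2, P), one column per node in $\mathcal {P}$; the small constant `eps` that guards the normalization of zero-length vectors is an assumption and is not part of the definitions.

```python
import numpy as np

def similarity_coefficient(A, B, eps=1e-12):
    """Eq. (9): mean inner product of the unit-normalized vector pairs."""
    A_hat = A / np.maximum(np.linalg.norm(A, axis=0), eps)
    B_hat = B / np.maximum(np.linalg.norm(B, axis=0), eps)
    return float(np.mean(np.sum(A_hat * B_hat, axis=0)))

def rmsl(A, B):
    """Eq. (10) as written: mean squared l2-norm of the vector differences."""
    return float(np.mean(np.sum((A - B) ** 2, axis=0)))
```

For the sparsity assessment, `A` would be an array of zeros restricted to the non-active nodes, so `rmsl` reduces to the mean squared length of the estimated vectors there.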

Data Set. The synthetic data set ($D_0$) has 300 trajectories generated using 6 different activity patterns. We use $D_0$ as a proof of concept for the assessment of accuracy and correct sparsity of the estimated vector fields.

Results. Figure 3 shows that the vector field estimates are very similar to the generating vector fields, not only in terms of magnitude (RMSL) and direction (R) but also regarding sparsity in the regions with no data, as expected. The accuracy (R, RMSL) of the estimated vector fields for activity patterns 1 to 6 is, respectively, 1: (0.920, 0.663); 2: (0.952, 2.305); 3: (0.982, 0.979); 4: (0.973, 0.723); 5: (0.954, 9.787); 6: (0.968, 0.763). In the sparsity assessment, the RMSL values of the vector field estimates are all below 2.94e−04, which corresponds to sparse vector fields.

4.2 Real Data

Assessment Measures. Activity pattern labeling accuracy is computed by comparing attributed and known trajectory labels taking into account the cutoff threshold as

$$\begin{aligned} Acc= \frac{\sum \text {diag}(\mathcal {M})}{\sum _{ij} \mathcal {M}_{ij}}\,, \end{aligned} \tag{11}$$

where $\mathcal {M}$ is the confusion matrix in a problem with multiple activity patterns.
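As a brief illustration of (11), the accuracy is the trace of the confusion matrix divided by the sum of all its entries. The function name and the example counts below are hypothetical.

```python
import numpy as np

def labeling_accuracy(confusion):
    """Eq. (11): sum of the diagonal of M divided by the sum of all entries."""
    confusion = np.asarray(confusion, dtype=float)
    return float(np.trace(confusion) / confusion.sum())

# Hypothetical 2-pattern example: 17 of 20 trajectories are labeled correctly
example_accuracy = labeling_accuracy([[8, 2], [1, 9]])
```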

Data Set. The real data sets we used are: $D_1$ (Hu), containing 1500 trajectories with 15 activity patterns [6]; $D_2$ (Wang), containing 220 trajectories with 11 activity patterns [14]; $D_3$ (Morris), containing 1900 trajectories with 19 activity patterns [11]. We use these data sets to assess activity pattern labeling and outlier detection.

Table 1. Overall accuracy of trajectory labeling for the proposed approach and comparison with literature results [6, 11, 14].

Results. Table 1 shows that, overall, the proposed algorithm labels trajectories according to their activity patterns with an accuracy above that reported in the literature. More specifically, its accuracy is higher than that of Heat-map, HMM, and DPMM, which are comparable generative models used to describe activity patterns [6, 11, 14, 15].

Regarding outlier detection, note that the proposed algorithm always assigns an activity pattern to a trajectory: the attributed activity pattern is the one that generates trajectories with the smallest displacement error relative to the test trajectory. However, if the displacement error is above the threshold, the respective trajectory is plotted in a different color from the others and tagged as an outlier. Figure 4 shows examples of 2 activity patterns for each real data set, which have similar motion patterns but different semantic meanings. Concerning Activity Pattern I, only $D_1$ has outlier trajectories, which come from two other activity patterns (shown in different colors, Fig. 4 middle row). Concerning Activity Pattern II, all the data sets have outlier trajectories from Activity Pattern I, and $D_1$ also has outlier trajectories from one additional activity pattern (Fig. 4 bottom row).

The proposed approach yields vector field abstractions that can distinguish between similar activity patterns with different underlying semantics: for each data set, trajectories that were wrongly labeled as having one activity pattern were correctly detected as outliers of that activity pattern. For example, the green trajectories from $D_1$ (Fig. 4 bottom left panel) are detected as outliers because, whereas the vector field of interest describes a left turn into the primary road, these trajectories correspond to targets that instead performed a left turn into a secondary road. Similar examples for the other two data sets are shown in the bottom row of Fig. 4.

5 Conclusion

We proposed a vector field estimation approach that copes with dense trajectory data and yields compact abstractions of frequent activity patterns. The proposed approach abstracts frequently observed activity patterns and embeds data-driven sparsity through the estimated spatial Probability Density Function (PDF) of the target positions. Moreover, it conveys the physical and semantic meaning of the observed activity patterns. Finally, the estimated vector fields can be used to label new trajectories and detect outliers according to their activity pattern, with an improvement of about 5–12% in trajectory labeling accuracy when compared to other generative models.

References

1. Anjum, N., Cavallaro, A.: Multifeature object trajectory clustering for video analysis. IEEE Trans. Circuits Syst. Video Technol. 18(11), 1555–1564 (2008)
2. Barão, M., Marques, J.S.: Gaussian random vector fields in trajectory modelling. In: Proceedings of the 19th Irish Machine Vision and Image Processing Conference (IMVIP), pp. 211–216 (2017)
3. Barata, C., Nascimento, J.C., Marques, J.S.: A sparse approach to pedestrian trajectory modeling using multiple motion fields. In: IEEE International Conference on Image Processing (ICIP), pp. 2538–2542 (2017)
4. Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 743–761 (2012)
5. Ferreira, N., Klosowski, J.T., Scheidegger, C.E., Silva, C.T.: Vector field k-means: clustering trajectories by fitting multiple vector fields. In: Eurographics Conference on Visualization (EuroVis), vol. 32 (2013)
6. Hu, W., Li, X., Tian, G., Maybank, S., Zhang, Z.: An incremental DPMM-based method for trajectory clustering, modeling, and retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 35(5), 1051–1065 (2013)
7. Lin, W., Chu, H., Wu, J., Sheng, B., Chen, Z.: A heat-map-based algorithm for recognizing group activities in videos. IEEE Trans. Circuits Syst. Video Technol. 23(11), 1980–1992 (2013)
8. Marques, J.S., Figueiredo, M.A.T.: Fast estimation of multiple vector fields: application to video surveillance. In: 7th International Symposium on Image and Signal Processing and Analysis (ISPA) (2011)
9. Mirge, V., Verma, K., Gupta, S.: Dense traffic flow patterns mining in bi-directional road networks using density based trajectory clustering. Adv. Data Anal. Classification 11(3), 547–561 (2017)
10. Morris, B.T., Trivedi, M.M.: A survey of vision-based trajectory learning and analysis for surveillance. IEEE Trans. Circuits Syst. Video Technol. 18(8), 1114–1127 (2008)
11. Morris, B.T., Trivedi, M.M.: Trajectory learning for activity understanding: unsupervised, multilevel, and long-term adaptive approach. IEEE Trans. Pattern Anal. Mach. Intell. 33(11), 2287–2301 (2011)
12. Nascimento, J.C., Figueiredo, M.A.T., Marques, J.S.: Activity recognition using a mixture of vector fields. IEEE Trans. Image Process. 22(5), 1712–1725 (2013)
13. Robicquet, A., Sadeghian, A., Alahi, A., Savarese, S.: Learning social etiquette: human trajectory understanding in crowded scenes. In: European Conference on Computer Vision (ECCV) (2016)
14. Wang, W., Lin, W., Chen, Y., Wu, J., Wang, J., Sheng, B.: Finding coherent motions and semantic regions in crowd scenes: a diffusion and clustering approach. In: European Conference on Computer Vision (ECCV), pp. 756–771 (2014)
15. Xu, H., Zhou, Y., Lin, W., Zha, H.: Unsupervised trajectory clustering via adaptive multi-kernel-based shrinkage. In: The IEEE International Conference on Computer Vision (ICCV) (2015)
16. Xu, Z., Hou, Z., Han, Y., Guo, W.: A diagram for evaluating multiple aspects of model performance in simulating vector fields. Geosci. Model. Dev. 9(12), 4365–4380 (2016)

Author information

Authors and Affiliations

INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
Ana Portêlo
Centre for Intelligent Sensing, Queen Mary University of London, London, UK
Andrea Cavallaro
Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
Catarina Barata & Jorge S. Marques

Corresponding author

Correspondence to Ana Portêlo.

Editor information

Editors and Affiliations

Universidad Autónoma de Madrid, Madrid, Spain
Aythami Morales
Universidad Autónoma de Madrid, Madrid, Spain
Julian Fierrez
Universitat Jaume I, Castellón de la Plana, Spain
José Salvador Sánchez
University of Coimbra, Coimbra, Portugal
Bernardete Ribeiro

Cite this paper

Portêlo, A., Cavallaro, A., Barata, C., Marques, J.S. (2019). Description and Recognition of Activity Patterns Using Sparse Vector Fields. In: Morales, A., Fierrez, J., Sánchez, J., Ribeiro, B. (eds) Pattern Recognition and Image Analysis. IbPRIA 2019. Lecture Notes in Computer Science, vol 11867. Springer, Cham. https://doi.org/10.1007/978-3-030-31332-6_21

DOI: https://doi.org/10.1007/978-3-030-31332-6_21
Published: 22 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31331-9
Online ISBN: 978-3-030-31332-6
eBook Packages: Computer Science, Computer Science (R0)
