Interpreting RFID tracking data for simultaneously moving objects: An offline sampling-based approach
Introduction
A possible way of exploiting RFID technology for tracking moving objects is the so-called tag tracking paradigm: the target objects are equipped with a tag (emitting radio signals encoding identifying information), while the locations of the premises where the objects move are equipped with readers (whose antennas can detect the tags’ signals). Then, given a set of objects simultaneously monitored over a time interval I, the result of the tracking is a set of sequences, one per object, where each Θi is the sequence of readings collected for the object oi. That is, for each time point τ of I, Θi reports the (possibly empty) set of readers that detected oi at τ.
Currently, RFID technology is one of the most used infrastructure-based solutions adopted in indoor tracking systems. The reason for its popularity is its versatility: once the infrastructure is installed, RFID technology can be used both for: 1) monitoring objects, such as products on supermarket shelves, medical devices and tools in hospitals, and fixed assets (e.g., laptops, smart devices, books, furniture, lab equipment) in every kind of organization; 2) monitoring people, such as customers in supermarkets and malls, patients and personnel in hospitals, employees and visitors in organizations. In fact, there are several scenarios (like that of a mall, which we will consider in our experiments) where the RFID infrastructure is first installed to monitor fixed assets and shelves, and then exploited also to track people. Once the readers are installed (possibly for different purposes), tracking people becomes easy and cheap: it suffices to equip each person with a passive tag, whose cost does not exceed a few cents. This versatility and adaptability is not shared by other technologies: for instance, smartphone-based tracking can be used for people, but is not suitable for objects. This explains the interest of research and industry in investigating RFID-based solutions for people tracking: GAO RFID and Flexiray are examples of companies providing RFID systems for people and personnel tracking. The interest in RFID-based tracking systems is also confirmed by the MarketsandMarkets report on the Indoor Location Market (2019), where the market of indoor positioning (of which RFID-based solutions constitute a large portion) has been estimated to be worth $41 billion by 2022.
This has led researchers to explicitly address the management of the data collected by indoor RFID tracking systems, as done in Bai, Wang, Liu, Zaniolo, and Liu (2007); Chawathe, Krishnamurthy, Ramachandran, and Sarma (2004); Fazzinga, Flesca, Furfaro, and Masciari (2013); Fazzinga, Flesca, Masciari, and Furfaro (2009); Gonzalez et al. (2010); Gonzalez, Han, Li, and Klabjan (2006); Lee and Chung (2008), and in particular the issue of interpreting these data.
The interpretation problem. The set of readings generally admits different interpretations, i.e., ways of being matched to the locations of the map. Formally, an interpretation for the readings is a hypothesis on where the objects were during I, i.e., it is a list of trajectories, one per object, where each ti is a sequence of locations representing a trajectory for oi. The point is that we cannot determine the position of oi at τ only on the basis of the readers that detected oi at τ. In fact, a one-to-one correspondence between locations and readers is infrequent: the same location may contain zones covered by different readers, and the same reader may cover different locations. Moreover, false negatives may occur: an object close to a reader may not be detected, owing to malfunctions. For instance, in Fig. 1, an object in the zone of l0 covered by r0 and r4 may be detected by both r0 and r4, or by only one of them, or by neither.
However, not all the possible interpretations of the readings are “realistic”: some trajectories, though compatible with the readings, cannot have been simultaneously followed by the objects. Example 1 shows how this can happen, and how integrity constraints can help detect the unrealistic interpretations.
Example 1. Consider Fig. 1 and two people o1, o2 monitored over a time interval I, with collected readings Θ1 and Θ2. Assume that, for each τ ∈ I, o1 and o2 were detected by the same readers. Each set of readers is associated with the set Loc of its possible interpretations, i.e., the set of locations that are at least partially covered by all the readers in it. For each oi, the trajectories compatible with Θi are the 16 combinations of the locations in Loc for τ ∈ [1..3]. However, we have the constraint that there is no direct access between the pairs l2, l3, and l2, l8, and l2, l9, and l3, l8, and l3, l9. Hence, among these 16 trajectories, only 5 (denoted A, B, C, etc.) can have been followed by oi. In turn, jointly considering o1, o2, the interpretations for the readings are the 25 combinations of these 5 trajectories (such as ⟨A, A⟩, ⟨A, B⟩, ⟨A, C⟩, etc.). Now, assume the constraint that l2 cannot contain multiple people simultaneously, and the same for l3. This entails, for instance, that the two objects cannot simultaneously follow A and B, otherwise they would both be in l2 at the same time point. Thus, among the 25 interpretations, only six are valid.
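The filtering performed in Example 1 can be sketched in a few lines of Python. This is only an illustration: the candidate sets, the forbidden pairs and the single-occupancy locations below are hypothetical (they are not those of Example 1), so the resulting counts differ from the 16, 5, 25 and 6 of the example. The names `reachable` and `respects_capacity` are ours.

```python
from itertools import product

# Hypothetical candidate locations for time points 1, 2, 3 (same for both objects).
candidates = [["l2", "l3"], ["l2", "l3", "l8"], ["l8", "l9"]]

# Hypothetical pairs of locations with no direct access between them.
no_direct_access = {frozenset(p) for p in [("l2", "l3"), ("l2", "l9"), ("l3", "l9")]}

# Hypothetical locations that cannot host more than one person at a time.
single_occupancy = {"l2", "l3"}


def reachable(trajectory):
    """True if every pair of consecutive locations is the same or directly connected."""
    return all(
        frozenset((a, b)) not in no_direct_access
        for a, b in zip(trajectory, trajectory[1:])
    )


def respects_capacity(interpretation):
    """True if no single-occupancy location hosts two objects at the same time point."""
    for positions in zip(*interpretation):  # positions of all objects at one time point
        if any(positions.count(loc) > 1 for loc in single_occupancy):
            return False
    return True


# All trajectories compatible with the readings of one object.
all_trajectories = list(product(*candidates))
# Trajectories that also satisfy the direct-access constraint.
valid_trajectories = [t for t in all_trajectories if reachable(t)]
# Joint interpretations for two objects that also satisfy the capacity constraint.
valid_interpretations = [
    pair for pair in product(valid_trajectories, repeat=2) if respects_capacity(pair)
]

print(len(all_trajectories), len(valid_trajectories), len(valid_interpretations))
```

With these made-up inputs, the per-object constraint prunes the 12 compatible trajectories down to 6, and the capacity constraint then prunes the 36 joint combinations down to 18, mirroring the two pruning stages of the example.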
A possible way to detect and discard unrealistic interpretations of the readings is offered by the integrity constraints that describe the movements of objects in indoor spaces, such as:
- capacity constraints (CC), limiting the number of objects that can occupy the same position (such as: “no more than 6 people can be simultaneously in l0”). They are implied by the area of the locations, and/or by their typical usage (for instance, ATM rooms can be used by one person at a time);
- traveling-time constraints (TT), stating the min/max amount of time to reach a location from another one (such as: “10 sec are required to reach l5 from l1”). They are typically implied by the speed of the objects and the indoor distances between locations (where the “indoor distance” takes into account walls and obstacles);
- latency constraints (LT), stating the minimum/maximum durations of the stay in a location. They take into account the physical inertia of objects, as well as the time required for accomplishing the tasks (if any) that the objects in the locations are due to perform.
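To make the three kinds of constraints concrete, the following sketch represents them as plain Python records and counts how many times an interpretation violates them. It is a simplified picture under our own assumptions: time is discretized into time points, only the minimum traveling time and minimum stay are modeled, and every stay (including those cut by the start or end of the monitoring interval) is checked. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Capacity:      # CC: at most `limit` objects in `location` at any time point
    location: str
    limit: int


@dataclass(frozen=True)
class TravelTime:    # TT: at least `min_steps` time points to reach `target` from `source`
    source: str
    target: str
    min_steps: int


@dataclass(frozen=True)
class Latency:       # LT: every stay in `location` lasts at least `min_stay` time points
    location: str
    min_stay: int


def count_violations(interpretation, constraints):
    """Count constraint violations of a list of trajectories (one per object)."""
    total = 0
    for c in constraints:
        if isinstance(c, Capacity):
            for positions in zip(*interpretation):   # all objects at one time point
                if positions.count(c.location) > c.limit:
                    total += 1
        elif isinstance(c, TravelTime):
            for traj in interpretation:
                for i, loc in enumerate(traj):
                    if loc == c.source:
                        # Time points reached in fewer than min_steps steps from i.
                        too_early = traj[i + 1 : i + c.min_steps]
                        total += too_early.count(c.target)
        elif isinstance(c, Latency):
            for traj in interpretation:
                for loc, run in groupby(traj):        # maximal stays in one location
                    if loc == c.location and len(list(run)) < c.min_stay:
                        total += 1
    return total


constraints = [Capacity("l0", 1), TravelTime("l1", "l5", 3), Latency("l2", 2)]
consistent = [["l1", "l1", "l4", "l4", "l5"], ["l0", "l0", "l2", "l2", "l2"]]
inconsistent = [["l1", "l5", "l0", "l2", "l0"], ["l0", "l0", "l0", "l2", "l2"]]
print(count_violations(consistent, constraints))
print(count_violations(inconsistent, constraints))
```

Note that the capacity constraint is the only one that needs to look at several trajectories at once, whereas TT and LT are checked on each trajectory separately.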
These constraints cover different aspects (i.e., objects’ motility, presence of obstacles, size of the locations), and they have already been used (but never jointly) in the literature on reasoning over spatio-temporal Knowledge Bases, as done in Grant, Molinaro, and Parisi (2018); Grant, Parisi, Parker, and Subrahmanian (2010); Parisi and Grant (2016), and on target tracking. In particular, LT and TT constraints have been used for cleaning the sequences of readings generated by a single object moving in indoor spaces in Fazzinga, Flesca, Furfaro, and Parisi (2016) and in Baba, Lu, Pedersen, and Xie (2014) and Baba, Lu, Xie, and Pedersen (2013) (where they have been encoded in a graph structure, as discussed in Section 6). CC constraints have been used in Chen, Ku, Wang, and Sun (2010) for interpreting the RFID readings describing the positions of multiple still objects at a single time point, and in Zhao and Ng (2012) for interpreting the trajectories of multiple objects, but without considering the presence of walls and obstacles. The extension of our work to deal with other forms of constraints is discussed in Section 7.
Generally, even a large set of constraints encoding a deep knowledge of the domain may not suffice to locate the right interpretation. For instance, in Example 1, six interpretations are valid, thus leaving some uncertainty on the actual objects’ positions.
We deal with this uncertainty by addressing the interpretation of the readings from a probabilistic standpoint: we aim at providing a probability distribution function (pdf) that, based on the constraints, assigns to every list of trajectories a probability of being the right interpretation for the readings. In particular, we devise a Metropolis-Hastings (MH) sampler (see Doucet, de Freitas, & Gordon, 2001 for a survey) that constructs this pdf as a multiset of samples. These samples are interpretations of the readings that satisfy the constraints, and they are generated in a chain: each sample is obtained by applying some perturbation mechanism to the previous sample of the chain (see Fig. 2, where the generic sample si consists of a candidate interpretation).
Challenges and contribution. In general, devising an MH sampler means devising a proper perturbation mechanism, and what makes this challenging is the fact that the perturbation mechanism must be tailored to the characteristics and the semantics of the samples to be collected; otherwise, the sampler may be unable to fairly explore the sample space. This means that there is no perturbation mechanism that works in every context. In fact, our investigation starts by analyzing the perturbation mechanism proposed in Chen et al. (2010) to deal with the much simpler scenario where the monitored objects are still. We show that the strategy guiding the MH sampler of Chen et al. (2010) is inadequate for our much more complex scenario of moving objects. The reason for this undesired behavior is that the perturbation mechanism of Chen et al. (2010) is “impeded” by the presence of our complex integrity constraints: under LT, TT, and CC constraints several moves over the sampling space are not allowed, since they are wrongly regarded as yielding invalid samples. This makes the sampler inaccurate, as it cannot fairly explore the space of valid interpretations. Starting from this observation, we devise a new MH sampler with the following features:
- it uses a novel perturbation mechanism tailored to the challenging aspects introduced by the movement of multiple objects in the presence of capacity, traveling-time, and latency constraints;
- it allows constraints to be specified with both a hard and a weak semantics. On the one hand, constraints marked as “hard” can be used to encode mandatory conditions, and they are taken into account by forbidding the collection of samples inconsistent with them (meaning that the pdf assigns zero probability to every interpretation inconsistent with any hard constraint). On the other hand, constraints marked as “weak” can be used to specify “desiderata” (i.e., conditions whose fulfillment is recommended, though not mandatory), and they are taken into account as follows. The number of violations raised by an interpretation is translated into a factor that lowers the probability that the interpretation is picked by the sampler. This way, the more violations of weak constraints occur in an interpretation, the lower the estimate of the target probability for that interpretation.
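The difference between the two semantics can be sketched as follows. The text above only states that the number of weak-constraint violations is translated into a factor that lowers the probability of picking an interpretation; the exponential form used here, the `penalty` parameter and both function names are assumptions made purely for illustration.

```python
def weight(hard_violations, weak_violations, penalty=0.5):
    """Unnormalized score of an interpretation (hypothetical form).

    hard_violations: number of violated hard constraints
    weak_violations: number of violated weak constraints
    penalty:         assumed factor in (0, 1) applied once per weak violation
    """
    if hard_violations > 0:
        return 0.0                      # hard semantics: the sample is never collected
    return penalty ** weak_violations   # weak semantics: each violation lowers the score


def acceptance_probability(current_weight, candidate_weight):
    """Metropolis acceptance rule, assuming a symmetric perturbation
    and a current sample with non-zero weight."""
    return min(1.0, candidate_weight / current_weight)


# A candidate with more weak violations than the current sample is accepted less often.
print(acceptance_probability(weight(0, 1), weight(0, 3)))
# A candidate violating a hard constraint is never accepted.
print(acceptance_probability(weight(0, 1), weight(1, 0)))
```

Under this assumed form, a candidate with fewer weak violations than the current sample is always accepted, while one with more violations is accepted with a probability that shrinks geometrically with the number of extra violations.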
The effectiveness and efficiency of our technique have been experimentally assessed in scenarios with different characteristics (in terms of readers’ deployment and trajectories’ shapes). In this analysis, we have investigated the effects of varying the “depth” of the perturbations by tuning a perturbation factor (which alters the degree of similarity between a sample and its perturbation), and we have compared our technique with several state-of-the-art techniques.
Relevance of the contribution. The major novelty of our research is an effective adaptation of the MH paradigm to the multi-target tracking problem in indoor spaces. In fact, although MH has been a popular sampling paradigm in several research fields for decades, sampling methods based on other Monte Carlo variants, mainly particle filtering (see Doucet et al. (2001); Liao, Fox, Hightower, Kautz, and Schulz (2003); Singh, Kumar, Madhow, Suri, and Cagley (2011); Song and Wang (2014); Tran et al. (2009); Vo et al. (2015); Xie, Yang, Chen, Wang, and Yu (2008); Yu, Ku, Sun, and Lu (2013); Zhao and Ng (2012)), have been preferred in the context of target tracking. The most notable exception is Chen et al. (2010), where, however, MH is used to estimate the positions of still objects and only CC constraints are used. Our work shows that: i. with a non-trivial effort, MH can be extended to be simultaneously guided by TT, LT, and CC constraints, which naturally describe complementary aspects of the indoor setting; ii. this extended MH is significantly more effective than several state-of-the-art approaches not based on sampling, and also more effective than a state-of-the-art particle filtering technique embedding the same constraints.
Interestingly, the simultaneous use of TT, LT and CC constraints is related to a further aspect of novelty of this work. In fact, in the literature, there is the common assumption that tracking multiple distinguishable targets (like our RFID-tagged objects) can be merely addressed by resorting to multiple instances of the single-target tracking problem. Indeed, research dealing with the multi-target tracking problem typically focuses on indistinguishable targets, and on mechanisms for discerning the single targets or for monitoring them as groups (see Section 6). In this regard, our research shows that the case of multiple distinguishable targets can be solved by exploiting some peculiar aspects that cannot be found in the single-target tracking problem: the estimation of the trajectories can be enhanced by leveraging constraints on the behavior of single objects (i.e., LT and TT constraints) jointly with constraints that would have no effect in solving the single-target tracking problem (i.e., CC constraints).
Plan of the paper. This article is organized as follows. In Section 2, we introduce the notions and notations used in the rest of the work. In Section 3, we review the MH paradigm and describe MH’s naive application to our interpretation problem (which means applying the variant of MH proposed in Chen et al. (2010) to our scenario). Then, in Section 4, we introduce our proposal, by first explaining the limits of the naive approach, and then presenting our solutions for overcoming these limits and for enlarging the scope of the technique (making it capable of dealing with weak constraints). In the remaining sections, we present our experimental results (Section 5) and discuss the related work (Section 6). Finally, in Section 7, we draw our conclusions by discussing the applicability of our proposal to technologies other than RFID and the limitations of the proposed framework, and we explain how these limitations can serve as starting points for future work.
Preliminaries
We denote the monitored objects as o1, o2, ..., the monitoring time interval as I, the set of readers as {r1, ..., rm}, and the locations into which the map is partitioned as l0, l1, .... Basically, a location is a portion of the map that, in the analyst’s perspective, has to be distinguished from the others. For instance, in the case of buildings hosting offices or apartments, locations are rooms (“Restroom”, “Office”, etc.) or portions of “large” rooms (“Hallway - North”, “Hallway - South”,
Before our approach: MH and its naive application to RFID data
The Metropolis-Hastings algorithm (MH) approximates a target pdf whose exact formulation is not known, by exploiting the knowledge of a function (called proposal distribution) that is directly computable and proportional to the target pdf. MH generates a sequence S of samples having this property: the number of occurrences of a sample in S is proportional to the value of this function on the sample and, thus, to the probability that the target pdf assigns to it. Hence, viewed as a histogram, S has the same “shape” as the target pdf.
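The histogram property can be observed on a toy problem with the generic textbook form of MH. The sketch below is not the sampler of this paper: the four-state space, the `weights` and the ring-shaped perturbation are made up, and the perturbation is assumed symmetric, so that the acceptance test reduces to a ratio of weights. The directly computable function described above is called `weight` here.

```python
import random
from collections import Counter


def metropolis_hastings(weight, perturb, initial, steps, rng):
    """Generic MH chain with a symmetric perturbation.

    weight:  computable function proportional to the target pdf
    perturb: returns a candidate obtained by perturbing the current sample
    initial: starting sample, assumed to have non-zero weight
    """
    current = initial
    samples = []
    for _ in range(steps):
        candidate = perturb(current, rng)
        ratio = weight(candidate) / weight(current)
        if rng.random() < min(1.0, ratio):
            current = candidate      # accept the perturbation
        samples.append(current)      # a rejected move repeats the current sample
    return samples


# Toy target over the states 0..3, known only up to a normalizing constant.
weights = [1.0, 2.0, 3.0, 4.0]
steps = 50_000
rng = random.Random(0)
samples = metropolis_hastings(
    weight=lambda s: weights[s],
    perturb=lambda s, r: (s + r.choice((-1, 1))) % 4,  # step left or right on a ring
    initial=0,
    steps=steps,
    rng=rng,
)
frequencies = {s: n / steps for s, n in sorted(Counter(samples).items())}
print(frequencies)
```

The relative frequencies of the four states should approach 0.1, 0.2, 0.3 and 0.4, i.e., the weights divided by their sum, even though the sampler never uses that sum.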
MH (Algorithm 1) generates, at each step,
Our approach: a new MH sampler
Our approach will be introduced gradually. First, we focus on hard constraints, and show that the combination of our constraints with the pointwise perturbation makes the naive sampler discussed in Section 3.1 ineffective. Then, we introduce a new perturbation mechanism (called blockwise), and explain how it overcomes this limit. Finally, we consider the weak constraints, and discuss some ways of tuning the sampling process to make it more effective.
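A toy case gives the intuition for why changing one time point at a time can get stuck under our constraints. The sketch below is a simplified picture, not the blockwise mechanism of this paper: it assumes a single object, two hypothetical locations `a` and `b`, a latency constraint requiring every stay to last at least 3 time points, a pointwise perturbation that changes the location at one time point, and a block perturbation that overwrites a run of consecutive time points with one location.

```python
from itertools import groupby

MIN_STAY = 3  # hypothetical latency constraint: each stay lasts at least 3 time points


def satisfies_latency(trajectory):
    """True if every maximal stay in a location lasts at least MIN_STAY time points."""
    return all(len(list(run)) >= MIN_STAY for _, run in groupby(trajectory))


def pointwise_neighbours(trajectory, locations):
    """All trajectories obtained by changing the location at a single time point."""
    for i, current in enumerate(trajectory):
        for loc in locations:
            if loc != current:
                yield trajectory[:i] + (loc,) + trajectory[i + 1:]


def block_neighbours(trajectory, locations, block_len):
    """All trajectories obtained by overwriting block_len consecutive time points."""
    for start in range(len(trajectory) - block_len + 1):
        for loc in locations:
            candidate = (
                trajectory[:start] + (loc,) * block_len + trajectory[start + block_len:]
            )
            if candidate != trajectory:
                yield candidate


start = ("a", "a", "a", "b", "b", "b")
valid_pointwise = [t for t in pointwise_neighbours(start, "ab") if satisfies_latency(t)]
valid_block = [t for t in block_neighbours(start, "ab", MIN_STAY) if satisfies_latency(t)]
print(valid_pointwise)
print(valid_block)
```

In this toy case every single-point change of the valid trajectory creates a stay shorter than the minimum, so a chain restricted to such moves can never leave its starting sample, whereas overwriting a block of 3 time points reaches the other valid trajectories.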
Datasets
Real data. We considered a real-life dataset (namely, real) containing the readings collected every second for 20 people simultaneously moving for about 20 min (i.e., 1200 time points) in a furnished apartment (real contains 24 000 readings). The apartment has an area of about 60 m² and consists of 3 rooms and 1 corridor, each equipped with an RFID antenna with detection range m. No limits on people’s movements were imposed.
Large scale synthetic data. We considered synthetic datasets
Related work
The problem of interpreting RFID readings has been investigated in the literature at two abstraction levels. At a low abstraction level, it has been stated as the problem of cleaning the readings, i.e., replacing each reading with the indication of the reader that best represents the position of the object oi at τ. In some sense, this can be viewed as considering locations indistinguishable from readers. We will refer to the techniques belonging to this family as cleaning techniques.
Conclusions and future work
We have tackled the problem of interpreting RFID tracking data, an important and challenging step of the pre-processing phase of trajectory-mining frameworks (such as Feng and Zhu, 2016; Zheng, 2015). In particular, a sampling technique for interpreting RFID data has been introduced, where a sequence of readings generated by a set of objects that simultaneously moved for a time interval over a map is interpreted by providing a pdf over the candidate sets of trajectories that may have generated the
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Title of the paper: Interpreting RFID tracking data for simultaneously moving objects: an offline sampling-based approach Authors: Bettina Fazzinga, Sergio Flesca, Filippo Furfaro, Francesco Parisi
CRediT authorship contribution statement
Bettina Fazzinga: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. Sergio Flesca: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. Filippo Furfaro: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. Francesco
References (65)
- et al. (2010). On collaborative tracking of a target group using binary proximity sensors. Journal of Parallel and Distributed Computing.
- et al. (2004). Managing RFID data. Proc. Int. Conf. on Very Large Data Bases (VLDB), Toronto, Canada.
- et al. (2018). Probabilistic spatio-temporal knowledge bases: capacity constraints, count queries, and consistency checking. International Journal of Approximate Reasoning.
- et al. (2010). An AGM-style belief revision mechanism for probabilistic spatio-temporal logics. Artificial Intelligence.
- et al. (2011). An approach to process continuous location-dependent queries on moving objects with support for location granules. Journal of Systems and Software.
- et al. (2018). Orientation-aware RFID tracking with centimeter-level accuracy. Proc. Int. Conf. on Information Processing in Sensor Networks (IPSN), Porto, Portugal.
- et al. (2018). Automatic detection of false positive RFID readings using machine learning algorithms. Expert Systems with Applications.
- et al. (2014). Overview of Bayesian sequential Monte Carlo methods for group and extended object tracking. Digital Signal Processing.
- et al. (2014). Comparative survey of indoor positioning technologies, techniques, and algorithms. Proc. Int. Conf. on Cyberworlds (CW), Santander, Spain.
- et al. (2012). Coverage-based placement in RFID networks: An overview. Proc. Int. Conf. on Mobile, Ubiquitous, and Intelligent Computing (MUSIC), Vancouver, Canada.
- Learning-based cleansing for indoor RFID data. Proc. Int. Conf. on Management of Data (SIGMOD), San Francisco, CA, USA.
- Handling false negatives in indoor RFID data. Int. Conf. on Mobile Data Management (MDM), Brisbane, Australia.
- Spatiotemporal data cleansing for indoor RFID tracking data. Int. Conf. on Mobile Data Management (MDM), Milan, Italy.
- RFID data processing with a data stream query language. Proc. Int. Conf. on Data Engineering (ICDE), Istanbul, Turkey.
- Leveraging spatio-temporal redundancy for RFID data cleansing. Proc. Int. Conf. on Management of Data (SIGMOD), Indianapolis, Indiana, USA.
- Inferring mobile trajectories using a network of binary proximity sensors. Proc. Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), Salt Lake City, UT, USA.
- Sequential Monte Carlo methods in practice.
- RFID-data compression for supporting aggregate queries. ACM Transactions on Database Systems.
- Cleaning trajectory data of RFID-monitored objects through conditioning under integrity constraints. Proc. Int. Conf. on Extending Database Technology (EDBT), Athens, Greece.
- Offline cleaning of RFID trajectory data. Int. Conf. on Scientific and Statistical Database Management (SSDBM), Aalborg, Denmark.
- Exploiting integrity constraints for cleaning trajectories of RFID-monitored objects. ACM Transactions on Database Systems.
- Efficient and effective RFID data warehousing.
- Ptrack: A RFID-based tracking algorithm for indoor randomly moving targets. Proc. Int. Conf. on Smart Computing and Communication, Shenzhen, China.
- A survey on trajectory data mining: Techniques and applications. IEEE Access.
- Indoor performance analysis of LF-RFID based positioning system: Comparison with UHF-RFID and UWB. Proc. Int. Conf. on Indoor Positioning and Indoor Navigation (IPIN), Sapporo, Japan.
- Modeling massive RFID data sets: A gateway-based movement graph approach. IEEE Transactions on Knowledge and Data Engineering.
- Warehousing and analyzing massive RFID data sets. Proc. Int. Conf. on Data Engineering (ICDE), Atlanta, GA, USA.
- Reasoning about RFID-tracked moving objects in symbolic indoor spaces. Proc. Int. Conf. on Scientific and Statistical Database Management (SSDBM), Baltimore, MD, USA.