Interpreting RFID tracking data for simultaneously moving objects: An offline sampling-based approach

https://doi.org/10.1016/j.eswa.2020.113368

Highlights

  • The problem of interpreting RFID tracking data is addressed.

  • Collected readings are mapped to sequences of semantic locations.

  • A sampling technique for interpreting RFID data is introduced.

  • A novel MH sampler guided by integrity constraints is proposed.

  • A thorough experimental analysis is provided.

Abstract

We consider the scenario of multiple RFID-tagged objects that simultaneously move across an indoor space where several RFID antennas are placed. We assume that a logical partition of the indoor space into a set of locations is given, along with a set of hard and weak integrity constraints describing both the valid movements of the objects and the capacity of the locations. In this setting, we address the problem of matching the collected readings to the trajectories (namely, the sequences of locations) followed by the target objects. We model this problem as estimating a probability distribution function over the possible matchings of the readings to the locations. The core of our approach is a novel Metropolis Hastings sampler that is guided by the integrity constraints to distinguish between likely and unlikely ways of interpreting the readings. The challenges of integrating the constraints into the sampler are discussed, and a thorough experimental analysis, in which the proposed approach is compared with the state of the art, is provided.

Introduction

A possible way of exploiting RFID technology for tracking moving objects is the so-called tag tracking paradigm: the target objects are equipped with a tag (emitting radio signals encoding identifying information), while the locations of the premises where the objects move are equipped with readers (whose antennas can detect the tags’ signals). Then, given a set of objects $o = o_1, \dots, o_n$ simultaneously monitored over a time interval $I = [1..T]$, the result of the tracking is a set $\Theta = \Theta_1, \dots, \Theta_n$, where each $\Theta_i$ is the sequence of readings collected for $o_i$. That is, $\Theta_i = R^i_1, \dots, R^i_T$, where each $R^i_\tau$ is the (possibly empty) set of readers that detected $o_i$ at time point $\tau$.

Currently, RFID technology is one of the most used infrastructure-based solutions adopted in indoor tracking systems. The reason for its popularity is its versatility: once the infrastructure is installed, RFID technology can be used both for: 1) monitoring objects, such as products on supermarket shelves, medical devices and tools in hospitals, and fixed assets (e.g., laptops, smart devices, books, furniture, lab equipment) in every kind of organization; 2) monitoring people, such as customers in supermarkets and malls, patients and personnel in hospitals, employees and visitors in organizations. In fact, there are several scenarios (like that of a mall, which we will consider in our experiments) where the RFID infrastructure is first installed to monitor fixed assets and shelves, and then exploited also to track people. Once the readers are installed (possibly for different purposes), tracking people becomes easy and cheap: it suffices to equip each person with a passive tag, whose cost does not exceed a few cents. This versatility and adaptability is not shared by other technologies: for instance, smartphone-based tracking can be used for people, but is not suitable for objects. This explains the interest of research and industry in investigating RFID-based solutions for people tracking: GAO RFID and Flexiray are examples of companies providing RFID systems for people and personnel tracking. The interest in RFID-based tracking systems is also confirmed by the recent MarketsandMarkets analysis report in Indoor Location Market (2019), where the market of indoor positioning (of which RFID-based solutions constitute a large portion) is estimated to be worth $41 billion by 2022.
This has led researchers to explicitly address the management of the data collected by indoor RFID tracking systems, as done in Bai, Wang, Liu, Zaniolo, and Liu (2007); Chawathe, Krishnamurthy, Ramachandran, and Sarma (2004); Fazzinga, Flesca, Furfaro, and Masciari (2013); Fazzinga, Flesca, Masciari, and Furfaro (2009); Gonzalez et al. (2010); Gonzalez, Han, Li, and Klabjan (2006); Lee and Chung (2008), and in particular the issue of interpreting these data.

The interpretation problem. The set of readings $\Theta$ generally admits different interpretations, i.e., ways of being matched to the locations of the map. Formally, an interpretation for $\Theta$ is a hypothesis on where the objects were during $I$, i.e., it is a list $t = t_1, \dots, t_n$, where each $t_i$ is a sequence of locations $L^i_1, \dots, L^i_T$ representing a trajectory for $o_i$. The point is that we cannot determine each $L^i_\tau$ (the position of $o_i$ at $\tau$) only on the basis of $R^i_\tau$ (the readers that detected $o_i$ at $\tau$). In fact, a one-to-one correspondence between locations and readers is infrequent: the same location may contain zones covered by different readers, and the same reader may cover different locations. Moreover, false negatives may occur: an object close to a reader may not be detected, owing to malfunctions. For instance, in Fig. 1, an object in the zone of $l_0$ covered by $r_0$ and $r_4$ may be detected by both $r_0$ and $r_4$, or by only one of them, or by neither.

However, not all the possible interpretations of Θ are “realistic”: some trajectories, though compatible with the readings, cannot have been simultaneously followed by the objects. Example 1 shows how this can happen, and how integrity constraints can help detect the unrealistic interpretations.

Example 1

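Consider Fig. 1 and two people $o = o_1, o_2$ monitored over $I = [1\,\text{s}..3\,\text{s}]$. The collected readings are $\Theta = \Theta_1, \Theta_2$, where $\Theta_1 = R^1_1, R^1_2, R^1_3$ and $\Theta_2 = R^2_1, R^2_2, R^2_3$. Assume that, for each $\tau \in I$, $R^1_\tau = R^2_\tau$ (i.e., $o_1$ and $o_2$ were detected by the same readers at each time point). Each $R^i_\tau$ is reported below, along with the set $Loc(R^i_\tau)$ of its possible interpretations (thus, $Loc(R^i_\tau)$ is the set of locations that are even partially covered by all the readers in $R^i_\tau$):

| $\tau$                          | 1              | 2                        | 3              |
|---------------------------------|----------------|--------------------------|----------------|
| $R^1_\tau = R^2_\tau$           | $\{r_2, r_3\}$ | $\{r_6\}$                | $\{r_2, r_3\}$ |
| $Loc(R^1_\tau) = Loc(R^2_\tau)$ | $\{l_2, l_3\}$ | $\{l_3, l_4, l_8, l_9\}$ | $\{l_2, l_3\}$ |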

For each $o_i$, the trajectories compatible with $\Theta_i$ are the 16 combinations of the locations in $Loc(R^i_\tau)$ for $\tau \in [1..3]$. However, we have the constraint that there is no direct access between the locations in each of the pairs $(l_2, l_3)$, $(l_2, l_8)$, $(l_2, l_9)$, $(l_3, l_8)$, and $(l_3, l_9)$. Hence, among these 16 trajectories, only the following 5 can have been followed by $o_i$:

$A = l_2\, l_4\, l_2$, $B = l_2\, l_4\, l_3$, $C = l_3\, l_3\, l_3$, $D = l_3\, l_4\, l_2$, $E = l_3\, l_4\, l_3$.

In turn, jointly considering $o_1$ and $o_2$, the interpretations for $\Theta$ are the 25 combinations of these 5 trajectories (such as ⟨A, A⟩, ⟨A, B⟩, ⟨A, C⟩, etc.). Now, assume the constraint that $l_2$ cannot contain multiple people simultaneously, and the same for $l_3$. This entails, for instance, that the two objects cannot simultaneously follow A and B; otherwise they would both be in $l_2$ at $\tau = 1$. Thus, among the 25 interpretations, the valid ones are: $t_{i} = \langle A, C \rangle$; $t_{ii} = \langle A, E \rangle$; $t_{iii} = \langle B, D \rangle$; $t_{iv} = \langle C, A \rangle$; $t_{v} = \langle D, B \rangle$; $t_{vi} = \langle E, A \rangle$.
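The counts in Example 1 can be checked by brute-force enumeration. The sketch below is only illustrative: it assumes that staying in the same location is always allowed and that any move not explicitly listed as forbidden is allowed (the actual map of Fig. 1 is not reproduced here), and all function and variable names are hypothetical.

```python
from itertools import product

# Candidate locations per time point, taken from the table of Example 1.
candidates = [("l2", "l3"), ("l3", "l4", "l8", "l9"), ("l2", "l3")]

# Pairs of locations with no direct access between them (from Example 1).
no_access = {frozenset(p) for p in [("l2", "l3"), ("l2", "l8"), ("l2", "l9"),
                                    ("l3", "l8"), ("l3", "l9")]}

# Locations that cannot contain multiple people simultaneously (from Example 1).
single_occupancy = {"l2", "l3"}


def reachable(a, b):
    # Assumption: staying put is allowed; a move is allowed unless listed in no_access.
    return a == b or frozenset((a, b)) not in no_access


def valid_trajectory(traj):
    # Every pair of consecutive locations must be directly connected.
    return all(reachable(a, b) for a, b in zip(traj, traj[1:]))


def valid_pair(t1, t2):
    # The two objects may never share a single-occupancy location at the same time point.
    return all(not (a == b and a in single_occupancy) for a, b in zip(t1, t2))


all_trajectories = list(product(*candidates))
trajectories = [t for t in all_trajectories if valid_trajectory(t)]
interpretations = [(t1, t2) for t1, t2 in product(trajectories, repeat=2)
                   if valid_pair(t1, t2)]

print(len(all_trajectories), len(trajectories), len(interpretations))
```

Under these assumptions the enumeration yields the 16 compatible trajectories, the 5 trajectories that respect the access constraints, and the 6 valid joint interpretations listed above.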

A possible way to detect and discard unrealistic interpretations of the readings is offered by the integrity constraints that describe the movements of objects in indoor spaces, such as:

  • capacity constraints (CC), limiting the number of objects that can occupy the same position (such as: “no more than 6 people can be simultaneously in l0”). They are implied by the area of the locations, and/or by their typical usage (for instance, ATM rooms can be used by one person at a time);

  • traveling-time constraints (TT), stating the min/max amount of time to reach a location from another one (such as: “10 sec are required to reach l5 from l1”). They are typically implied by the speed of the objects and the indoor distances between locations (where the “indoor distance” takes into account walls and obstacles);

  • latency constraints (LT), stating the minimum/maximum durations of the stay in a location. They take into account the physical inertia of objects, as well as the time required for accomplishing the tasks (if any) that the objects in the locations are due to perform.
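As a rough illustration of how the three kinds of constraints could be checked on candidate trajectories, consider the sketch below. It is not the encoding used in the paper: it assumes that trajectories are lists of location names with one entry per time point, that only minimum travel times and minimum stays are specified, and that stays cut by the borders of the monitoring window are not checked. All names (`violates_cc`, `violates_tt`, `violates_lt`, `hall`) are hypothetical.

```python
from collections import Counter
from itertools import groupby


def violates_cc(trajectories, capacity):
    """CC: at some time point a location hosts more objects than its capacity."""
    for positions in zip(*trajectories):  # positions of all objects at one time point
        counts = Counter(positions)
        if any(n > capacity.get(loc, float("inf")) for loc, n in counts.items()):
            return True
    return False


def violates_tt(trajectory, min_travel):
    """TT: location b is reached from a in fewer time points than min_travel[(a, b)]."""
    for i, a in enumerate(trajectory):
        for j in range(i + 1, len(trajectory)):
            needed = min_travel.get((a, trajectory[j]))
            if needed is not None and j - i < needed:
                return True
    return False


def violates_lt(trajectory, min_stay):
    """LT: a stay is shorter than min_stay[loc]; first and last stays are skipped (assumption)."""
    stays = [(loc, len(list(group))) for loc, group in groupby(trajectory)]
    return any(n < min_stay.get(loc, 1) for loc, n in stays[1:-1])


# "10 sec are required to reach l5 from l1", with one time point per second.
tt = {("l1", "l5"): 10}
too_fast = ["l1", "l1", "l5"]
slow_enough = ["l1"] + ["hall"] * 9 + ["l5"]
print(violates_tt(too_fast, tt), violates_tt(slow_enough, tt))
```

Maximum travel times and maximum stays could be handled symmetrically; they are omitted here to keep the sketch short.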

These constraints cover different aspects (i.e., objects’ motility, presence of obstacles, size of the locations), and they have already been used (but never jointly) in the literature on reasoning over spatio-temporal Knowledge Bases, as done in Grant, Molinaro, and Parisi (2018); Grant, Parisi, Parker, and Subrahmanian (2010); Parisi and Grant (2016), and on target tracking. In particular, LT and TT constraints have been used for cleaning the sequences of readings generated by a single object moving in indoor spaces in Fazzinga, Flesca, Furfaro, and Parisi (2016) and in Baba, Lu, Pedersen, and Xie (2014) and Baba, Lu, Xie, and Pedersen (2013) (where they have been encoded in a graph structure, as discussed in Section 6). CC constraints have been used in Chen, Ku, Wang, and Sun (2010) for interpreting the RFID readings describing the positions of multiple still objects at a single time point, and in Zhao and Ng (2012) for interpreting the trajectories of multiple objects, but without considering the presence of walls and obstacles. The extension of our work to deal with other forms of constraints is discussed in Section 7.

Generally, even a large set of constraints encoding a deep knowledge of the domain may not suffice to locate the right interpretation. For instance, in Example 1, six interpretations are valid, thus leaving some uncertainty on the actual objects’ positions.

We deal with this uncertainty by addressing the interpretation of the readings from a probabilistic standpoint: we aim at providing a probability distribution function (pdf) p(t|Θ) that, based on the constraints, assigns to every list of trajectories t a probability of being the right interpretation for Θ. In particular, we devise a Metropolis Hastings (MH) sampler (see Doucet, de Freitas, & Gordon, 2001 for a survey) that constructs p(t|Θ) as a multiset of samples. These samples are interpretations of Θ satisfying the constraints and are generated in a chain: each sample is obtained by applying some perturbation mechanism to the previous sample of the chain (see Fig. 2, where the generic sample $s_i$ consists of a candidate interpretation t).

Challenges and contribution. In general, devising an MH sampler means devising a proper perturbation mechanism, and what makes this challenging is the fact that the perturbation mechanism must be tailored to the characteristics and the semantics of the samples to be collected; otherwise, the sampler may be unable to fairly explore the sample space. This means that there is no perturbation mechanism that works in every context. In fact, our investigation starts by analyzing the perturbation mechanism proposed in Chen et al. (2010) to deal with the much simpler scenario in which the monitored objects are still. We show that the strategy guiding the MH sampler of Chen et al. (2010) is inadequate for our much more complex scenario of moving objects. The reason for this undesired behavior is that the perturbation mechanism of Chen et al. (2010) is “impeded” by the presence of our complex integrity constraints: under LT, TT, and CC constraints several moves over the sampling space are not allowed, since they are wrongly regarded as yielding invalid samples. This makes the sampler inaccurate, as it cannot fairly explore the space of valid interpretations. Starting from this observation, we devise a new MH sampler with the following features:

  • it uses a novel perturbation mechanism tailored to the challenging aspects introduced by the movement of multiple objects in the presence of capacity, traveling-time, and latency constraints;

  • it allows constraints to be specified with both a hard and a weak semantics. On the one hand, constraints marked as “hard” can be used to encode mandatory conditions, and they are taken into account by forbidding the collection of samples inconsistent with them (meaning that p(t|Θ) is assigned zero over every t inconsistent with any hard constraint). On the other hand, constraints marked as “weak” can be used to specify “desiderata” (i.e., conditions whose fulfillment is recommended, though not mandatory) and they are taken into account as follows. The number of violations raised by an interpretation t is translated into a factor that lowers the probability that t is picked by the sampler. This way, the more the violations of weak-constraints occurring in t, the lower the estimate of the target probability over t.
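The hard/weak distinction can be pictured with a small weighting function. The text above does not give the exact factor into which weak-constraint violations are translated, so the geometric penalty used below (`penalty ** weak_violations`) is purely an assumption made for illustration, as are the names `unnormalized_weight`, `normalize`, and the three toy interpretations.

```python
def unnormalized_weight(hard_violations, weak_violations, penalty=0.5):
    """Weight of an interpretation: zero if any hard constraint is violated,
    otherwise lowered by one factor `penalty` per weak violation (assumed form)."""
    if not 0.0 < penalty < 1.0:
        raise ValueError("penalty must lie strictly between 0 and 1")
    if hard_violations > 0:
        return 0.0
    return penalty ** weak_violations


def normalize(weights):
    """Turn unnormalized weights into a probability distribution."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Toy interpretations described by (hard violations, weak violations).
toy = {"t_a": (0, 0), "t_b": (0, 2), "t_c": (1, 0)}
pdf = normalize({name: unnormalized_weight(h, w) for name, (h, w) in toy.items()})
print(pdf)
```

Whatever the actual factor is, the two properties stated above are what matters: any hard violation gives probability zero, and the probability decreases as the number of weak violations grows.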

The effectiveness and efficiency of our technique have been experimentally assessed in scenarios with different characteristics (in terms of readers’ deployment and trajectories’ shapes). In this analysis, we investigated the effects of varying the “depth” of the perturbations by tuning a perturbation factor (which can alter the degree of similarity between a sample and its perturbation), and we compared our technique with several state-of-the-art techniques.

Relevance of the contribution. The major novelty of our research is an effective adaptation of the MH paradigm to the multi-target tracking problem in indoor spaces. In fact, although MH has been a popular sampling paradigm in several research fields for decades, sampling methods based on other Monte Carlo variants, mainly particle filtering (see Doucet et al. (2001); Liao, Fox, Hightower, Kautz, and Schulz (2003); Singh, Kumar, Madhow, Suri, and Cagley (2011); Song and Wang (2014); Tran et al. (2009); Vo et al. (2015); Xie, Yang, Chen, Wang, and Yu (2008); Yu, Ku, Sun, and Lu (2013); Zhao and Ng (2012)), have been preferred in the context of target tracking. The most notable exception is Chen et al. (2010), where, however, MH is used to estimate the positions of still objects and only CC constraints are used. Our work shows that: i. with a non-trivial effort, MH can be extended to be simultaneously guided by TT, LT, and CC constraints, which naturally describe complementary aspects of the indoor setting; ii. this extended MH is significantly more effective than several state-of-the-art approaches not based on sampling, and more effective than embedding the same constraints into a state-of-the-art particle filtering technique.

Interestingly, the simultaneous use of TT, LT and CC constraints is related to a further aspect of novelty of this work. In fact, in the literature, there is the common assumption that tracking multiple distinguished targets (like our RFID-tagged objects) can be merely addressed by resorting to multiple instances of the single target tracking problem. Indeed, studies dealing with the multi-tracking problem typically focus on indistinguishable targets, and on mechanisms for discerning the single targets or for monitoring them as groups (see Section 6). In this regard, our research shows that the case of multiple distinguishable targets can be solved by exploiting some peculiar aspects that cannot be found in the single target tracking problem: the estimation of the trajectories can be enhanced by leveraging constraints on the behavior of single objects (i.e., LT and TT constraints) jointly with constraints that would have no effect in solving the single tracking problem (i.e., CC constraints).

Plan of the paper. This article is organized as follows. In Section 2, we introduce the notions and notations used in the rest of the work. In Section 3, we review the MH paradigm and describe MH’s naive application to our interpretation problem (which means applying the variant of MH proposed in Chen et al. (2010) to our scenario). Then, in Section 4, we introduce our proposal, by first explaining the limits of the naive approach, and then presenting our solutions for overcoming these limits and for enlarging the scope of the technique (making it capable of dealing with weak constraints). In the remaining sections, we present our experimental results (Section 5) and discuss the related work (Section 6). Finally, in Section 7, we draw our conclusions by discussing the applicability of our proposal to technologies other than RFID and the limitations of the proposed framework, and we explain how these limitations will serve as starting points for future work.

Section snippets

Preliminaries

We denote the monitored objects as $o = o_1, \dots, o_n$, the monitoring time interval as $I = [1..T]$, the set of readers as $R = \{r_1, \dots, r_m\}$, and the set of locations into which the map is partitioned as $L$. Basically, a location is a portion of the map that, in the analyst’s perspective, has to be distinguished from the others. For instance, in the case of buildings hosting offices or apartments, locations are rooms (“Restroom”, “Office”, etc.) or portions of “large” rooms (“Hallway - North”, “Hallway - South”,

Before our approach: MH and its naive application to RFID data

The Metropolis Hastings algorithm (MH) approximates a target pdf F(x), whose exact formulation is not known, by exploiting the knowledge of a function P(x) (called proposal distribution) that is directly computable and proportional to F(x). MH generates a sequence S of samples having this property: the number of occurrences of a sample s in S is proportional to P(s), and, thus, to F(s). Hence, viewed as a histogram, S has the same “shape” as F(x).
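The behavior described above can be reproduced on a toy example. The sketch below is a generic MH loop, not the paper's Algorithm 1: it assumes a small finite set of states, takes `P` to be the directly computable function of the text (proportional to the target), and uses a uniform, hence symmetric, perturbation so that the acceptance probability reduces to min(1, P(candidate)/P(current)). The names `mh_sample` and `weights` are hypothetical.

```python
import random
from collections import Counter


def mh_sample(states, P, n_samples, seed=0):
    """Generate a chain of samples whose frequencies approximate P up to normalization."""
    rng = random.Random(seed)
    current = next(s for s in states if P(s) > 0)  # start from a state with nonzero weight
    samples = []
    for _ in range(n_samples):
        candidate = rng.choice(states)  # uniform, symmetric perturbation (assumption)
        if rng.random() < min(1.0, P(candidate) / P(current)):
            current = candidate  # accept the perturbed sample
        samples.append(current)  # on rejection the current sample is repeated
    return samples


# Toy target: "y" should appear about twice as often as "x" or "z".
weights = {"x": 1.0, "y": 2.0, "z": 1.0}
samples = mh_sample(list(weights), weights.get, n_samples=20000)
freq = {s: c / len(samples) for s, c in Counter(samples).items()}
print(freq)
```

Viewed as a histogram, `freq` has approximately the same shape as the normalized weights, which is the property of the sequence S stated above.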

MH (Algorithm 1) generates, at each step,

Our approach: a new MH sampler

Our approach will be introduced gradually. First, we focus on hard constraints, and show that the combination of our constraints with the pointwise perturbation makes the naive sampler discussed in Section 3.1 ineffective. Then, we introduce a new perturbation mechanism (called blockwise), and explain how it overcomes this limit. Finally, we consider the weak constraints, and discuss some ways for tuning the sampling process and make it more effective.

Datasets

Real data. We considered a real-life dataset (namely, real) containing the readings collected every Δ = 1 sec for 20 people simultaneously moving for about 20 min (i.e., 1200 time points) in a furnished apartment (real contains 24 000 readings). The apartment has an area of about 60 m² and consists of 3 rooms and 1 corridor, each equipped with an RFID antenna with detection range ρ = 2.5 m. No limits on people’s movements were imposed.

Large scale synthetic data. We considered synthetic datasets

Related work

The problem of interpreting RFID readings has been investigated in the literature at two abstraction levels. At a low abstraction level, it has been stated as the problem of cleaning the readings, i.e., replacing each reading $R^i_\tau$ with the indication of the reader that best represents the position of the object $o_i$ at $\tau$. In some sense, this can be viewed as considering locations indistinguishable from readers. We will refer to the techniques belonging to this family as cleaning techniques.

Conclusions and future work

We have tackled the problem of interpreting RFID tracking data, an important and challenging step of the pre-processing phase of trajectory-mining frameworks (such as Feng and Zhu, 2016; Zheng, 2015). In particular, a sampling technique for interpreting RFID data has been introduced, where a sequence of readings generated by a set of objects that simultaneously moved for a time interval over a map is interpreted by providing a pdf over the candidate sets of trajectories that may have generated the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Title of the paper: Interpreting RFID tracking data for simultaneously moving objects: an offline sampling-based approach Authors: Bettina Fazzinga, Sergio Flesca, Filippo Furfaro, Francesco Parisi

CRediT authorship contribution statement

Bettina Fazzinga: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. Sergio Flesca: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. Filippo Furfaro: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing - original draft, Writing - review & editing. Francesco

References (65)

  • A.I. Baba et al.

    Learning-based cleansing for indoor RFID data

    Proc. Int. Conf. on Management of Data (SIGMOD), San Francisco, CA, USA

    (2016)
  • A.I. Baba et al.

    Handling false negatives in indoor RFID data

    Int. Conf. on Mobile Data Management (MDM), Brisbane, Australia

    (2014)
  • A.I. Baba et al.

    Spatiotemporal data cleansing for indoor RFID tracking data

    Int. Conf. on Mobile Data Management (MDM), Milan, Italy

    (2013)
  • Y. Bai et al.

    RFID data processing with a data stream query language

    Proc. Int. Conf. on Data Engineering (ICDE), Istanbul, Turkey

    (2007)
  • H. Chen et al.

    Leveraging spatio-temporal redundancy for RFID data cleansing

    Proc. Int. Conf. on Management of Data (SIGMOD), Indianapolis, Indiana, USA

    (2010)
  • E. Cho et al.

    Inferring mobile trajectories using a network of binary proximity sensors

    Proc. Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), Salt Lake City, UT, USA

    (2011)
  • A. Doucet et al.

    Sequential Monte Carlo methods in practice

    (2001)
  • B. Fazzinga et al.

    RFID-data compression for supporting aggregate queries

    ACM Transactions on Database Systems

    (2013)
  • B. Fazzinga et al.

    Cleaning trajectory data of RFID-monitored objects through conditioning under integrity constraints

    Proc. Int. Conf. on Extending Database Technology (EDBT), Athens, Greece

    (2014)
  • B. Fazzinga et al.

    Offline cleaning of RFID trajectory data

    Int. Conf. on Scientific and Statistical Database Management (SSDBM), Aalborg, Denmark

    (2014)
  • B. Fazzinga et al.

    Exploiting integrity constraints for cleaning trajectories of RFID-monitored objects

    ACM Transactions on Database Systems

    (2016)
  • B. Fazzinga et al.

    Efficient and effective RFID data warehousing

  • G. Feng et al.

    Ptrack: A RFID-based tracking algorithm for indoor randomly moving targets

    Proc. Int. Conf. on Smart Computing and Communication, Shenzhen, China

    (2016)
  • Z. Feng et al.

    A survey on trajectory data mining: Techniques and applications

    IEEE Access

    (2016)
  • V. Gharat et al.

    Indoor performance analysis of LF-RFID based positioning system: Comparison with UHF-RFID and UWB

    Proc. Int. Conf. on Indoor Positioning and Indoor Navigation (IPIN), Sapporo, Japan

    (2017)
  • GiPStech (2019). www.gipstech.com. Accessed 14 October...
  • H. Gonzalez et al.

    Modeling massive RFID data sets: A gateway-based movement graph approach

    IEEE Transactions on Knowledge and Data Engineering

    (2010)
  • H. Gonzalez et al.

    Warehousing and analyzing massive RFID data sets

    Proc. Int. Conf. on Data Engineering (ICDE), Atlanta, GA, USA

    (2006)
  • S.H. Hussein et al.

    Reasoning about RFID-tracked moving objects in symbolic indoor spaces

    Proc. Int. Conf. on Scientific and Statistical Database Management (SSDBM), Baltimore, MD, USA

    (2013)
  • Indoor Location Market (2019). https://www.marketsandmarkets.com/Market-Reports/indoor-location-market-989.html....
  • IndoorAtlas (2019). www.indooratlas.com. Accessed 14 October...
  • indoo.rs (2019). www.indoo.rs. Accessed 14 October...