Data Association for Semantic World Modeling from Partial Views

Wong, Lawson L. S.; Kaelbling, Leslie Pack; Lozano-Pérez, Tomás

doi:10.1007/978-3-319-28872-7_25

Part of the book series: Springer Tracts in Advanced Robotics (STAR, volume 114)

Abstract

Autonomous mobile-manipulation robots need to sense and interact with objects to accomplish high-level tasks such as preparing meals and searching for objects. To achieve such tasks, robots need semantic world models, defined as object-based representations of the world involving task-level attributes. In this work, we address the problem of estimating world models from semantic perception modules that provide noisy observations of attributes. Because attribute detections are sparse, ambiguous, and aggregated across different viewpoints, it is unclear which attribute measurements are produced by the same object, so data association issues are prevalent. We present novel clustering-based approaches to this problem, which are more efficient and require less severe approximations compared to existing tracking-based approaches. These approaches are applied to data containing object type-and-pose detections from multiple viewpoints, and demonstrate comparable quality to the existing approach using a fraction of the computation time.

Notes

1. Indices have been dropped to reduce clutter; please refer to two paragraphs above for indices.
2. The correct Bayesian approach is to integrate over the posterior distribution of each light’s location, which is intractable. This can be approximated by sampling the locations, then averaging the subsequent computations. In practice we found that using the posterior mean was sufficient.
3. For simplicity, we assume that the error covariance is axis-aligned and use an independent normal-gamma prior for each dimension, but it is straightforward to extend to general covariances.
4. The typical interpretation of normal-gamma hyperparameters is that the mean is estimated from $\lambda$ observations with mean $\nu$, and the precision from $2\alpha$ observations with mean $\nu$ and variance $\frac{\beta}{\alpha}$.

References

1. Anati, R., Scaramuzza, D., Derpanis, K., Daniilidis, K.: Robot localization using soft object detection. In: IEEE International Conference Robotics and Automation (2012)
2. Antoniak, C.: Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Ann. Stat. 2(6), 1152–1174 (1974)
3. Atanasov, N., Sankaran, B., Ny, J.L., Koletschka, T., Pappas, G., Daniilidis, K.: Hypothesis testing framework for active object detection. In: International Conference Robotics and Automation (2013)
4. Bar-Shalom, Y., Fortmann, T.: Tracking and Data Association. Academic Press, New York (1988)
5. Bernardo, J., Smith, A.: Bayesian Theory. John Wiley, New York (1994)
6. Cox, I., Hingorani, S.: An efficient implementation of Reid’s multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking. IEEE Trans. Pattern Anal. Mach. Intell. 18(2), 138–150 (1996)
7. Cox, I., Leonard, J.: Modeling a dynamic environment using a Bayesian multiple hypothesis approach. AI J. 66(2), 311–344 (1994)
8. Cox, I.J.: A review of statistical data association techniques for motion correspondence. Int. J. Comput. Vis. 10(1), 53–66 (1993)
9. Dellaert, F., Seitz, S., Thorpe, C., Thrun, S.: EM, MCMC, and chain flipping for structure from motion with unknown correspondence. Mach. Learn. 50(1–2), 45–71 (2003)
10. Eidenberger, R., Scharinger, J.: Active perception and scene modeling by planning with probabilistic 6D object poses. In: IEEE/RSJ Intl. Conf. Intelligent Robots and Systems (2010)
11. Elfring, J., van den Dries, S., van de Molengraft, M., Steinbuch, M.: Semantic world modeling using probabilistic multiple hypothesis anchoring. Robot. Auton. Syst. 61(2), 95–105 (2013)
12. Glover, J., Popovic, S.: Bingham Procrustean alignment for object detection in clutter. In: IEEE/RSJ Intl. Conf. Intelligent Robots and Systems (2013)
13. Hager, G., Wegbreit, B.: Scene parsing using a prior world model. Int. J. Robot. Res. 30(12), 1477–1507 (2011)
14. Kurien, T.: Issues in the design of practical multitarget tracking algorithms. In: Y. Bar-Shalom (ed.) Multitarget-Multisensor Tracking: Advanced Applications, pp. 43–84. Artech House (1990)
15. Neal, R.: Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9(2), 249–265 (2000)
16. Oh, S., Russell, S., Sastry, S.: Markov chain Monte Carlo data association for multi-target tracking. IEEE Trans. Autom. Control 54(3), 481–497 (2009)
17. Ranganathan, A., Dellaert, F.: Semantic modeling of places using objects. In: Robotics: Science and Systems (2007)
18. Reid, D.: An algorithm for tracking multiple targets. IEEE Trans. Autom. Control 24(6), 843–854 (1979)
19. Sethuraman, J.: A constructive definition of Dirichlet priors. Stat. Sin. 4, 639–650 (1994)

Acknowledgments

This work was supported in part by the NSF under Grant No. 1117325. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. We also gratefully acknowledge support from ONR MURI grant N00014-09-1-1051, from AFOSR grant FA2386-10-1-4135, and from the Singapore Ministry of Education under a grant to the Singapore-MIT International Design Center.

Author information

Authors and Affiliations

CSAIL, MIT, Cambridge, MA, 02139, USA
Lawson L. S. Wong, Leslie Pack Kaelbling & Tomás Lozano-Pérez

Corresponding author

Correspondence to Lawson L. S. Wong.

Editor information

Editors and Affiliations

Creative Informatics, The University of Tokyo, Tokyo, Japan
Masayuki Inaba
School of Electrical Engineering and Com, Queensland Univ of Technology, Brisbane, Queensland, Australia
Peter Corke

Appendix: Posterior and Predictive Distributions for a Single Light

In this appendix, we verify the claim from Sect. 2 that finding the posterior and predictive distributions on color and location for a single light is straightforward, given that we know which observations were generated by that light. Let $\left\{ (o, x) \right\}$ denote the set of light color-location detections that correspond to a light with unknown parameters $(c, l)$. Color and location measurements are assumed to be independent given $(c, l)$ and will be considered separately. We assume a known discrete prior distribution $\pi \in \varDelta^{(T-1)}$ on the $T$ colors, reflecting their relative prevalence. Using the color noise model (Eq. 1), the posterior distribution on $c$ and the predictive distribution on the next color observation $o'$ are:

$$\begin{aligned} \mathbb {P}\left( c \,\big |\, \left\{ o \right\} \right) \propto \left[ \prod _o \phi ^c_o \right] \times \pi _c ; \quad \mathbb {P}\left( o' \,\big |\, \left\{ o \right\} \right)&= \sum _{c=1}^T \mathbb {P}\left( o' \big | c \right) \; \mathbb {P}\left( c \big | \left\{ o \right\} \right) \nonumber \\&= \sum _{c=1}^T \phi ^c_{o'} \; \mathbb {P}\left( c \,\big |\, \left\{ o \right\} \right) . \end{aligned}$$

(15)

We can use this to find the light’s probability of detection:

$$\begin{aligned} p_\text {D} \triangleq 1 - \mathbb {P}\left( o'=0 \,\big |\, \left\{ o \right\} \right) = 1 - \sum _{c=1}^T \phi ^{c}_0 \; \mathbb {P}\left( c \,\big |\, \left\{ o \right\} \right) . \end{aligned}$$

(16)

Unlike the constant false positive rate $p_{\text {FP}}$, the detection (and false negative) rate is dependent on the light’s color posterior.
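
Eqs. 15 and 16 can be sketched in a few lines of NumPy. The sketch below is illustrative only and is not the implementation used in this work: the array layout `phi[c, o]` (rows are true colors, column 0 is the "no detection" outcome), the function names, and all numerical values are assumptions made for the example. Colors are indexed from 0 in the code, and `phi` is assumed strictly positive so that the logarithms are finite.

```python
import numpy as np

def color_posterior(phi, prior, observations):
    """Posterior over the T colors given observations {o} (first part of Eq. 15).

    phi[c, o] is assumed to hold the noise-model probability of observing o
    when the true color is c; prior[c] is the color prevalence pi_c.
    """
    obs = np.asarray(observations, dtype=int)
    # Sum of log-likelihoods per color, plus the log prior.
    log_post = np.log(prior) + np.log(phi[:, obs]).sum(axis=1)
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

def color_predictive(phi, posterior):
    """Predictive distribution over the next observation o' (second part of Eq. 15)."""
    return posterior @ phi

def detection_probability(phi, posterior):
    """p_D = 1 - P(o' = 0 | {o}) (Eq. 16); column 0 is assumed to mean 'not detected'."""
    return 1.0 - color_predictive(phi, posterior)[0]

# Hypothetical noise model with T = 2 colors. Columns are observations:
# 0 = missed detection, 1 = first color observed, 2 = second color observed.
phi = np.array([[0.2, 0.7, 0.1],
                [0.4, 0.1, 0.5]])
prior = np.array([0.5, 0.5])

post = color_posterior(phi, prior, observations=[1, 1, 2])
p_d = detection_probability(phi, post)
print(post, p_d)
```

With this hypothetical noise model, two detections of the first color and one of the second concentrate the posterior on the first color, and $p_\text{D}$ is then read off from the predictive probability of a missed detection.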

For location measurements, we emphasize that both the mean $\mu$ and the precision $\tau = \frac{1}{\sigma^2}$ of the Gaussian noise model are unknown. Modeling the variance as unknown allows us to attain a better representation of the location estimate’s empirical uncertainty, rather than naïvely assuming that repeated measurements give a known fixed reduction in uncertainty each time. We use a standard conjugate prior, the distribution $\text{NormalGamma}(\mu, \tau; \lambda, \nu, \alpha, \beta)$ (see Note 4). It is well known (e.g., [5]) that after $n$ observations with sample mean $\hat{\mu}$ and sample variance $\hat{s}^2$ (normalized by $n$, so that $n\hat{s}^2$ is the sum of squared deviations from $\hat{\mu}$), the posterior is a normal-gamma distribution with parameters:

$$\begin{aligned} \lambda' &= \lambda + n ; \quad \nu' = \frac{\lambda}{\lambda+n} \nu + \frac{n}{\lambda+n} \hat{\mu} ; \quad \alpha' = \alpha + \frac{n}{2} ; \\ \beta' &= \beta + \frac{1}{2} \left( n\hat{s}^2 + \frac{\lambda n}{\lambda+n} \left( \hat{\mu} - \nu \right)^2 \right) . \end{aligned}$$

(17)
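
As a minimal sketch of the update in Eq. 17 for a single dimension, the following uses a hypothetical `NormalGamma` container and `update` function (neither name comes from this work). It assumes that $\hat{s}^2$ is the variance normalized by $n$, which is what NumPy's default `var()` computes, and the prior values in the example are arbitrary.

```python
from typing import NamedTuple

import numpy as np

class NormalGamma(NamedTuple):
    lam: float    # lambda: pseudo-count behind the mean estimate
    nu: float     # nu: current estimate of the mean
    alpha: float  # alpha: shape of the gamma distribution on the precision
    beta: float   # beta: rate of the gamma distribution on the precision

def update(prior: NormalGamma, xs) -> NormalGamma:
    """Return the normal-gamma posterior after observing xs (Eq. 17)."""
    xs = np.asarray(xs, dtype=float)
    n = xs.size
    if n == 0:
        return prior  # nothing observed, so the posterior equals the prior
    mean = xs.mean()
    var = xs.var()  # ddof=0, so n * var is the sum of squared deviations
    lam = prior.lam + n
    nu = (prior.lam * prior.nu + n * mean) / lam
    alpha = prior.alpha + n / 2
    beta = prior.beta + 0.5 * (
        n * var + prior.lam * n / lam * (mean - prior.nu) ** 2
    )
    return NormalGamma(lam, nu, alpha, beta)

# Arbitrary prior and data, for illustration only.
prior = NormalGamma(lam=1.0, nu=0.0, alpha=1.0, beta=1.0)
batch = update(prior, [1.0, 2.0, 3.0])
sequential = update(update(prior, [1.0, 2.0]), [3.0])
print(batch)
print(sequential)
```

Processing the observations in one batch or one group at a time yields the same hyperparameters, so a light's location estimate can be maintained incrementally as detections are assigned to it.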

The upshot of using a conjugate prior for location measurements is that the marginal likelihood of location observations has a closed-form expression. The posterior predictive distribution for the next location observation $x'$ is obtained by integrating out the latent parameters $\mu , \tau $, and has the following expression:

$$\begin{aligned} \mathbb{P}\left( x' \,\big|\, \left\{ x \right\} \,;\, \lambda, \nu, \alpha, \beta \right) &= \int_{(\mu, \tau)} \mathbb{P}\left( x' \,\big|\, \mu, \tau \right) \mathbb{P}\left( \mu, \tau \,\big|\, \left\{ x \right\} \right) \, d\mu \, d\tau \\ &= \frac{1}{\sqrt{2 \pi}} \frac{{\beta'}^{\alpha'}}{{\beta^+}^{\alpha^+}} \frac{\sqrt{\lambda'}}{\sqrt{\lambda^+}} \frac{\varGamma(\alpha^+)}{\varGamma(\alpha')} , \end{aligned}$$

(18)

where the hyperparameters with ‘$'$’ superscripts are updated according to Eq. 17 using the empirical statistics of $\left\{ x \right\} $ only (excluding $x'$), and the ones with ‘$+$’ superscripts are likewise updated but including $x'$. The ratio in Eq. 18 assesses the fit of $x'$ with the existing observations $\left\{ x \right\} $ associated with the light.
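
The ratio in Eq. 18 can be evaluated in log space to avoid overflow in the gamma functions and powers. The sketch below is one possible way to do this under stated assumptions, not the implementation used in this work: the function names `ng_update` and `log_predictive` are hypothetical, the prior values and data are arbitrary, and the ‘$+$’ hyperparameters are obtained by re-running the Eq. 17 update on $\left\{ x \right\}$ together with $x'$.

```python
import math

def ng_update(lam, nu, alpha, beta, xs):
    """Eq. 17: normal-gamma hyperparameters after observing the list xs."""
    n = len(xs)
    if n == 0:
        return lam, nu, alpha, beta
    mean = sum(xs) / n
    sq_dev = sum((x - mean) ** 2 for x in xs)  # equals n * s^2
    lam_n = lam + n
    nu_n = (lam * nu + n * mean) / lam_n
    alpha_n = alpha + n / 2
    beta_n = beta + 0.5 * (sq_dev + lam * n / lam_n * (mean - nu) ** 2)
    return lam_n, nu_n, alpha_n, beta_n

def log_predictive(x_new, xs, lam, nu, alpha, beta):
    """Logarithm of the posterior predictive density in Eq. 18."""
    # Primed hyperparameters: updated with {x} only.
    lam_p, _, alpha_p, beta_p = ng_update(lam, nu, alpha, beta, list(xs))
    # Plus hyperparameters: updated with {x} and x'.
    lam_q, _, alpha_q, beta_q = ng_update(lam, nu, alpha, beta, list(xs) + [x_new])
    return (
        -0.5 * math.log(2 * math.pi)
        + alpha_p * math.log(beta_p) - alpha_q * math.log(beta_q)
        + 0.5 * (math.log(lam_p) - math.log(lam_q))
        + math.lgamma(alpha_q) - math.lgamma(alpha_p)
    )

# Arbitrary example: three location measurements already associated with a light.
xs = [1.0, 1.2, 0.9]
near = log_predictive(1.05, xs, lam=1.0, nu=0.0, alpha=1.0, beta=1.0)
far = log_predictive(5.0, xs, lam=1.0, nu=0.0, alpha=1.0, beta=1.0)
print(near, far)
```

A candidate measurement close to the existing ones scores a higher log-density than a distant one, which is how the ratio expresses the fit of $x'$ to the light.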

About this chapter

Cite this chapter

Wong, L.L.S., Kaelbling, L.P., Lozano-Pérez, T. (2016). Data Association for Semantic World Modeling from Partial Views. In: Inaba, M., Corke, P. (eds) Robotics Research. Springer Tracts in Advanced Robotics, vol 114. Springer, Cham. https://doi.org/10.1007/978-3-319-28872-7_25

DOI: https://doi.org/10.1007/978-3-319-28872-7_25
Published: 23 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28870-3
Online ISBN: 978-3-319-28872-7
eBook Packages: Engineering, Engineering (R0)
