Application of Large-Scale Classification Techniques for Simple Location Estimation Experiments

Lei, Howard; Choi, Jaeyoung; Friedland, Gerald

doi:10.1007/978-3-319-09861-6_6


Abstract

This chapter describes an application of established classification techniques to simple multimodal location estimation experiments. It demonstrates the use of Gaussian Mixture Model (GMM)-based and language model-based approaches for verifying the cities in which Flickr videos were taken, based on the videos' audio and textual metadata. The methods used in most of the approaches are described in detail, allowing people with no background in location estimation to perform simple experiments. The city-verification results are modest but above random, and they present opportunities for future work on better approaches. The techniques may also be suitable for class projects for students who wish to gain hands-on experience in location estimation.
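
To make the idea of language model-based city verification concrete, the following is a minimal sketch and not the authors' implementation. It assumes a smoothed unigram model over textual tags and scores a video by the difference between its average log-likelihood under a city model and under a background model trained on all cities. The toy tags, the function names, and the smoothing constant `alpha` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def train_unigram(tag_lists, vocab, alpha=1.0):
    """Additive-smoothed unigram log-probabilities over a fixed vocabulary."""
    counts = Counter(t for tags in tag_lists for t in tags if t in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def avg_log_likelihood(tags, model):
    """Mean log-probability of the in-vocabulary tags (0.0 if there are none)."""
    known = [t for t in tags if t in model]
    if not known:
        return 0.0  # no usable tags: contributes no evidence either way
    return sum(model[t] for t in known) / len(known)


def verification_score(tags, city_model, background_model):
    """Log-likelihood ratio: positive values favor the claimed city."""
    return avg_log_likelihood(tags, city_model) - avg_log_likelihood(
        tags, background_model
    )


# Hypothetical toy training data: one list of tags per video.
train = {
    "paris": [["eiffel", "tower", "seine"], ["louvre", "paris", "seine"]],
    "tokyo": [["shibuya", "tokyo", "sushi"], ["tokyo", "tower", "temple"]],
}
all_videos = [tags for videos in train.values() for tags in videos]
vocab = {t for tags in all_videos for t in tags}

# One model per city, plus a background model trained on every video.
models = {city: train_unigram(videos, vocab) for city, videos in train.items()}
background = train_unigram(all_videos, vocab)

test_tags = ["eiffel", "seine"]
for city in sorted(models):
    score = verification_score(test_tags, models[city], background)
    print(f"{city}: {score:+.3f}")
```

In a verification setting, a trial is accepted when the score exceeds a threshold tuned on held-out data. The same likelihood-ratio scoring pattern applies to the audio side if the unigram models are replaced by GMMs over acoustic features.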



Acknowledgments

The experiments described in this work were supported by NGA NURI grant number HM11582-10-1-0008, NSF EAGER grant IIS-1138599, and NSF Award CNS-1065240. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the sponsors.

Author information

Authors and Affiliations

International Computer Science Institute, Berkeley, CA, USA
Howard Lei
California State University, East Bay, CA, USA
Howard Lei, Jaeyoung Choi & Gerald Friedland

Corresponding author

Correspondence to Howard Lei.

Editor information

Editors and Affiliations

International Computer Science Institute, Berkeley, California, USA
Jaeyoung Choi
International Computer Science Institute, Berkeley, California, USA
Gerald Friedland

About this chapter

Cite this chapter

Lei, H., Choi, J., Friedland, G. (2015). Application of Large-Scale Classification Techniques for Simple Location Estimation Experiments. In: Choi, J., Friedland, G. (eds) Multimodal Location Estimation of Videos and Images. Springer, Cham. https://doi.org/10.1007/978-3-319-09861-6_6

DOI: https://doi.org/10.1007/978-3-319-09861-6_6
Published: 05 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09860-9
Online ISBN: 978-3-319-09861-6
eBook Packages: Engineering, Engineering (R0)
