research-article

Data mining of maps and their automatic region-time-theme classification

Published: 01 March 2009

Abstract

The goal of this research is to organize maps mined from journal articles into categories for hierarchical browsing within region, time and theme facets. A manually collected training set of 150 maps was used to develop classifiers. Metadata pertinent to the maps were harvested and then run separately through knowledge sources and our classifiers for region, time and theme. Evaluation of the system on a test set of 54 unseen maps showed 69%-93% classification accuracy when compared with two human classifications of the same maps. The data mining and semantic analysis methods used here could support systems that index other types of article components, such as diagrams or charts, by region, time and theme.
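As a rough illustration of the pipeline described above, the following minimal Python sketch sends a map's harvested text through three separate facet classifiers and then scores each facet against a human labelling. It is not the authors' system: the lookup tables, the year-to-period rule, the sample captions and every function name here are hypothetical stand-ins for the knowledge sources and classifiers the abstract mentions.

```python
import re

# Hypothetical, tiny stand-ins for the "knowledge sources" in the abstract.
REGION_GAZETTEER = {"nile": "Africa", "egypt": "Africa",
                    "rhine": "Europe", "andes": "South America"}
THEME_KEYWORDS = {"rainfall": "Climate", "temperature": "Climate",
                  "trade": "Economy", "population": "Demography"}


def _tokens(text):
    """Lowercase alphabetic tokens of the harvested metadata text."""
    return re.findall(r"[a-z]+", text.lower())


def _first_match(text, table):
    """Return the category of the first token found in the lookup table."""
    for token in _tokens(text):
        if token in table:
            return table[token]
    return None


def classify_region(text):
    return _first_match(text, REGION_GAZETTEER)


def classify_theme(text):
    return _first_match(text, THEME_KEYWORDS)


def classify_time(text):
    """Map the first four-digit year (1000-2099) to a coarse period label."""
    match = re.search(r"\b(1[0-9]{3}|20[0-9]{2})\b", text)
    if match is None:
        return None
    year = int(match.group(1))
    return f"{(year // 100) * 100}s"


def classify_map(text):
    """Run the three facet classifiers separately on the same metadata."""
    return {"region": classify_region(text),
            "time": classify_time(text),
            "theme": classify_theme(text)}


def accuracy(predicted, reference, facet):
    """Fraction of maps whose predicted facet equals the human label."""
    hits = sum(1 for p, r in zip(predicted, reference) if p[facet] == r[facet])
    return hits / len(reference)


# Invented example captions and human labels, for illustration only.
captions = ["Annual rainfall in the Nile basin, 1850",
            "Trade routes along the Rhine"]
human = [{"region": "Africa", "time": "1800s", "theme": "Climate"},
         {"region": "Europe", "time": "1300s", "theme": "Economy"}]
predictions = [classify_map(c) for c in captions]
```

Calling `accuracy(predictions, human, "time")` on these invented captions gives 0.5, because the second caption contains no year. Scoring each facet separately, as here, is one plausible way to arrive at a range of accuracies rather than a single figure.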

Cited By

  • (2010) A location based text mining method using ANN for geospatial KDD process. Proceedings of the 7th international conference on Advances in Neural Networks - Volume Part II, 292-301. DOI: 10.1007/978-3-642-13318-3_37. Online publication date: 6-Jun-2010
  • (2009) Classification of raster maps for automatic feature extraction. Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 138-147. DOI: 10.1145/1653771.1653793. Online publication date: 4-Nov-2009
  • (2009) A Location Based Text Mining Approach for Geospatial Data Mining. Proceedings of the 2009 Fourth International Conference on Innovative Computing, Information and Control, 1172-1175. DOI: 10.1109/ICICIC.2009.23. Online publication date: 7-Dec-2009

Published In

SIGSPATIAL Special  Volume 1, Issue 1
March 2009
49 pages
EISSN:1946-7729
DOI:10.1145/1517463
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. algorithms
  2. classifiers
  3. faceted classification
  4. geographic information retrieval
  5. geospatial data
  6. indexing
  7. knowledge extraction
  8. metadata harvesting
  9. text mining

Qualifiers

  • Research-article
