Elsevier

Big Data Research

Volume 5, September 2016, Pages 16-21

ELAN: An Efficient Location-Aware Analytics System

https://doi.org/10.1016/j.bdr.2016.08.001

Abstract

We demonstrate an Efficient Location-Aware aNalytics system (ELAN) that provides users with location-aware data analytics services. For each user-selected spatial region, ELAN instantly identifies the most important functionality features of the region (e.g., business zones and residential areas) by efficiently analyzing the user-generated content (UGC) within the region. For each feature, ELAN efficiently calculates the spatial boundary of the functional zone (represented by a convex hull) to help users better understand the feature, and it can also identify the influential range of a given feature. ELAN has many real-world applications, e.g., choosing business locations and discovering popular regions. There are two main challenges in designing a location-aware data analytics system. The first is to achieve high performance, as the region may contain a large amount of location-based UGC data. The second is to support continuous queries, as users may continuously change the region by zooming in, zooming out, and panning the map. To address these challenges, we propose effective spatio-textual indexes and efficient incremental algorithms to support instant location-aware data analytics. We have implemented and deployed the system, which has been commonly used and widely accepted.

Introduction

The rapid growth of the mobile Internet has led to the widespread use of location-aware services, e.g., Google map search and Yelp local search. Meanwhile, with the popularity of location-based social networks (LBSN), e.g., Foursquare and Qzone, users are generating more and more user-generated content (UGC) with location information, e.g., check-ins and reviews at venues. Location-based UGC data plays an important role in location-based services, and we can take advantage of it to provide users with various location-aware analytics services. Although traditional applications can improve the functionality of the online map, such as region-based keyword queries [1] and finding places of a specific category within user-selected regions [2], they cannot fully support online location-aware data analytics. By efficiently analyzing the UGC data, we can instead instantly identify the most important functionality features of a user's region of interest, e.g., business zones and residential areas, to help users better understand the selected region. To this end, in this paper we demonstrate an Efficient Location-Aware aNalytics system (called ELAN), which can instantly analyze location-based UGC data.

Fig. 1 shows a screenshot of our ELAN system. Given a user-selected spatial region, ELAN efficiently analyzes the location-based UGC data in this region and shows the important functionality features (e.g., attorney and restaurants) to the user using a word-cloud-style interface. For each feature, e.g., attorney, the system calculates the spatial boundary of the functional zone for the feature, as illustrated in Fig. 1. ELAN provides a new strategy for exploring the map: it is not only a simple tool to find location-based information, but also a way to explore the aggregated features of different regions on the map. ELAN has many real-world applications. (1) Map Navigation: Suppose a user is visiting an unfamiliar place and wants to know the most important functionalities in the region. It is difficult for her to get such information using traditional map applications. In contrast, ELAN can show the functionalities on the map, and the user can browse and compare different regions and choose the most interesting one. (2) Business Location Selection: If a business owner wants to open a new supermarket in a region, she can use our ELAN system to browse the map and check whether there are competing supermarkets in the region and whether there are potential customers close to the region. (3) Public Facility Planning: ELAN can also provide suggestions to the government for public facility planning. Suppose the government wants to build a sports center. The government can use ELAN to find the regions that lack sports centers and have a large population.
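The abstract states that the boundary of a functional zone is represented by a convex hull. As a minimal sketch of that step, the following Python function computes the convex hull of a set of POI coordinates using Andrew's monotone chain method. This is a standard technique chosen here for illustration and is not necessarily the algorithm used in ELAN; the function names and the tuple-based point format are assumptions.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain).

    `points` is an iterable of (x, y) tuples, e.g. the coordinates of all
    POIs in the region that carry a given feature keyword (an assumption
    made for this illustration).
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it makes a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

Interior points and points lying on a hull edge are discarded, so the result contains only the corner vertices needed to draw the zone boundary on the map.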

It is rather challenging to design an efficient location-aware analytics system because (1) the user-selected region can be rather large and there may be a large amount of location-aware UGC data in the region; and (2) the user may continuously change the region (e.g., zooming in, zooming out, or panning the map) to compare different regions and find the most interesting one. To address these two challenges, we propose effective index structures and efficient incremental algorithms to support instant location-aware analytics. We extend the R-Tree [3] by incorporating textual information into the R-Tree nodes and propose the KR-Tree. Unlike existing spatio-textual indexes, the KR-Tree is designed to support location-aware analytics rather than location-based search; it maintains only statistical spatio-textual information and therefore has a much smaller index size than existing spatio-textual indexes. When analyzing the features of each region, we calculate the importance of the features using the KR-Tree index instead of extracting all UGC data in the region, which filters out a large amount of irrelevant data. To support continuous queries, we propose efficient incremental algorithms that avoid many unnecessary computations. To demonstrate the effectiveness of our algorithms, we compare them with baseline algorithms, and the experimental results show that our method significantly outperforms the baseline approaches.
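The idea behind incremental processing of continuous queries can be illustrated with a small sketch. When the user pans or zooms, the keyword counts of the new region can be derived from those of the old region by adding POIs that entered the view and subtracting POIs that left it. The Python code below is an illustrative assumption about how such an update could work, not the paper's incremental algorithm. For brevity it scans a flat list of POIs, whereas a real system would use the index to fetch only the POIs in the changed area. All names and the `(x, y, keywords)` POI format are hypothetical.

```python
from collections import Counter


def in_rect(x, y, rect):
    """rect is (xmin, ymin, xmax, ymax); borders are inclusive."""
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def count_keywords(pois, rect):
    """Baseline: count keywords of all POIs inside rect from scratch."""
    counts = Counter()
    for x, y, words in pois:
        if in_rect(x, y, rect):
            counts.update(words)
    return counts


def update_counts(counts, pois, old_rect, new_rect):
    """Derive the counts for new_rect from the counts for old_rect.

    Only POIs that lie in exactly one of the two rectangles change the
    result; POIs in the overlap are left untouched.
    """
    updated = Counter(counts)
    for x, y, words in pois:
        was_in = in_rect(x, y, old_rect)
        is_in = in_rect(x, y, new_rect)
        if is_in and not was_in:
            updated.update(words)      # POI entered the view
        elif was_in and not is_in:
            updated.subtract(words)    # POI left the view
    return +updated  # unary plus drops keywords whose count fell to zero
```

The benefit grows with the overlap between consecutive regions: for a small pan, most POIs stay in view and contribute no work to the update.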

We have implemented and deployed the system on real-world datasets, and it has been commonly used and widely accepted.

Section snippets

System overview

In this section, we first define the location-aware analytics query and a related problem, functional region analysis, and then introduce the system architecture.

Basic algorithm

Given a user-selected region R, the basic algorithm first searches for all the leaf nodes in the R-Tree within the region, then scans the keywords of all locations in the leaf nodes, and sorts the keywords by the word score (Equation (1)) and finally returns top-k keywords.

$$\mathrm{Score} = \frac{\mathrm{TF}_{w \text{ in } R}}{\mathrm{TF}_{\text{all words in } R}} \cdot \log \frac{\mathrm{Count}_{\text{all POIs}}}{\mathrm{Count}_{\text{POIs containing } w}} \tag{1}$$

Equation (1) is similar to the term frequency and inverse document frequency (TF–IDF), but we treat the region as a document in the TF part, and treat the
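The scoring step of the basic algorithm can be sketched in a few lines of Python. The sketch below makes two assumptions that are not confirmed by the text above: the R-Tree lookup is replaced by a linear scan over a list of `(x, y, keywords)` tuples, and "all POIs" in Equation (1) is taken to mean every POI in the dataset. The function name `top_k_keywords` is hypothetical.

```python
import math
from collections import Counter


def top_k_keywords(pois, region, k):
    """Rank the keywords in `region` by the score of Equation (1).

    pois   : list of (x, y, keywords) tuples (assumed input format)
    region : (xmin, ymin, xmax, ymax), borders inclusive
    """
    xmin, ymin, xmax, ymax = region

    # IDF part: number of POIs in the whole dataset containing each word.
    total_pois = len(pois)
    containing = Counter()
    for _, _, words in pois:
        containing.update(set(words))

    # TF part: the region is treated as one document.
    tf = Counter()
    for x, y, words in pois:
        if xmin <= x <= xmax and ymin <= y <= ymax:
            tf.update(words)
    total_tf = sum(tf.values())
    if total_tf == 0:
        return []  # no keywords inside the region

    scores = {
        w: (count / total_tf) * math.log(total_pois / containing[w])
        for w, count in tf.items()
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

Because of the logarithmic factor, a keyword that is frequent inside the region but also common across the whole dataset can rank below a rarer keyword that is concentrated in the region.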

Experiment

We used millions of POIs in Beijing (1,228,736) and the USA (9,700,676) to evaluate our methods. Each POI contains the point's latitude, longitude, address, category, description, and UGC. We conducted a user study to evaluate the effectiveness of ELAN and an experiment to verify the efficiency of our basic, improved, and incremental algorithms.

Effectiveness. To evaluate the effectiveness of ELAN, we first analyze the keywords of a particular region. We select a rectangular region

Applications

In this section, we discuss the applications of our system.

Business location selection. Businesses regularly conduct detailed investigations to select locations for new outlets, and traditional methods consume a lot of manpower and financial resources. For instance, assume a business owner wants to open a new bicycle store. Existing location-based systems cannot help her find a good location. Instead, she can use our ELAN system. She first selects several regions and compares the

Conclusion

In this demonstration paper, we demonstrated an efficient location-aware analytics system, which has many real-world map applications. We described in detail the structure of the index, the improved algorithm, and the incremental algorithm. Our algorithm is not only efficient, but also accurately extracts keywords that reflect the characteristics of a region. We proposed the functional zone visualization algorithm, which can automatically discover important features and identify the

References (8)

  • R.L. Graham et al.

    Finding the convex hull of a simple polygon

    J. Algorithms

    (1983)
  • R. Zhong et al.

    Location-aware instant search

  • J. Yuan et al.

    Discovering regions of different functions in a city using human mobility and POIs

  • N. Beckmann et al.

    The R*-tree: an efficient and robust access method for points and rectangles

There are more references available in the full text version of this article.

Cited by (10)

  • Developing a data analytics platform to support decision making in emergency and security management

    2019, Expert Systems with Applications
    Citation Excerpt :

    In the case of medical incidents, the knowledge about their distribution can help to efficiently schedule resource allocation across the different municipalities and study the optimal location of medical centres close to urban areas and rural villages with higher frequencies of these types of emergencies. Among others, see the works of Huang, Kim, and Menezes (2010), Liu et al. (2016), Sakr and Elgammal (2016), Aringhieri, Carello, and Morale (2013) and Fogue et al. (2016). These articles helped to design the analysis of incidents according to the geographical distribution by municipalities, which provides to the emergency and security services with the information required to improve these processes.

  • Geospatial analytics for COVID-19 active case detection

    2021, Computers, Materials and Continua
  • Guidance in the Visual Analytics of Cartographic Images in the Decision-Making Process

    2020, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
  • Location-based social network data for tourism destinations

    2019, Big Data and Innovation in Tourism, Travel, and Hospitality: Managerial Approaches, Techniques, and Applications

This article belongs to Analytics and Applications.
