1 Introduction

Transportation planning evaluates the current transportation situation, predicts future travel demand, and assesses the effect of facility investment accordingly. Transportation policy diagnoses the current transportation situation, addresses its problems, and uses the given resources efficiently for public convenience. This process requires microscopic travel patterns, precise demand prediction, and accurate analysis.

Recent developments in science and technology, such as the 4th Industrial Revolution, AI, IoT, advanced ICT, ITS, and Big Data, have had many implications for transportation planners. Despite the development of data collection technology and the accumulation of various big data, transportation planning still relies on the traditional four-step travel demand estimation method. This method applies four independent steps in sequence (trip generation, trip distribution, mode choice, and trip assignment) and is widely used because it is easy to understand and logically structured. However, it is premised on an aggregate spatial unit (city district or administrative district) called the Traffic Analysis Zone and an aggregate time unit (daily traffic volume). As a result, it is limited in analyzing effects that depend on the demographic and social characteristics of individuals. It reflects neither the interrelationships between the steps nor the changes in travel behavior induced by transport policies. In addition, the temporal analysis unit is O/D demand aggregated by day, which neglects the direction of travel and does not consider the trip chain [1]. It is therefore difficult to analyze changes in travel patterns arising from changes in land use and population characteristics.

On the other hand, the activity-based model, which has been actively studied recently [25], assumes that travel demand is not generated for the sake of the trip itself but that trips are derived from each individual's activities. The activity-based model makes it possible to use a microscopic analysis unit rather than the aggregated analysis unit of the existing four-step demand forecasting technique [6]. It can use socioeconomic indicators at the more detailed county level (about 1/25 of the size of the village), together with 24 h activity schedules that reflect individual characteristics and trip chains, for travel impact analysis, micro-transportation planning, and traffic management services.

The activity-based traveler analyzer uses a variety of data, such as household travel data, mobile phone big data, socioeconomic data, and land use data.

2 Traveler Analyzer

2.1 Concept of Traveler Analyzer

The Activity-BAsed Traveler Analyzer (ABATA), based on Big Data, is a system that estimates the activity population and the travel demand derived from activities, taking into account individual schedules and activity categories (home, work, shopping, and leisure) with respect to land use types (Fig. 1).

The ABATA system estimates the existing population by time and space (aggregate district or administrative unit) of the analysis area based on the statistics of the National Statistical Office (NSO), the survey data on the actual condition of households, and other microscopic spatial data, and then establishes the O/D trips. In particular, the purpose of this study is to develop a system that analyzes changes in travel behavior caused by changes in the socioeconomic population, land use, and travel-related policies. Examples include analyzing changes in travel behavior and travel patterns when the population composition of a specific city changes due to aging; analyzing changes in the transportation system due to the implementation of transportation policies such as time lag; analyzing travel behavior following a change of school hours or the construction of a department store; and analyzing changes in, and impacts on, the transportation system following land use change.

The ABATA system utilizes a variety of basic data. We use household travel survey data to establish individual activity schedules. The household travel survey is conducted every five years on a nationwide basis. Among the survey data, household status, characteristics of household members, and individual travel characteristics are important for establishing a 24-h activity schedule. However, because the survey records individual trips per household, the trips must be converted into activity schedules. Household travel survey data contain individual characteristics and travel information, but they are limited by a small sample (about 3% to 5%). Therefore, we also use mobile phone data, which represent the entire population engaged in activities in the study area. The mobile phone data are defined as the number of people in each age group per 50 m × 50 m grid cell. To protect personal information, the mobile phone data do not include individual records; they are nevertheless valuable because they identify the existing population by time and space.

Socioeconomic data from the National Statistical Office (total population by household composition, number of workers by industrial classification, etc.) provide reliable information on the various socioeconomic populations of micro spaces, although they have the limitation of carrying no information by time frame. The ABATA system also uses building association area data and the student number data of the Nice National Service.

Fig. 1. Concept of activity-based traveler analyzer.

2.2 Description of Mobile Phone Data

In this study, we used mobile phone data from SKT. The original mobile phone records were collected and preprocessed by SKT, one of the three mobile telecommunication operators in Korea. Every mobile phone signal is regularly received by a nearby cellular tower, and the location and time information is stored on a server. SKT has consistently held about half of the country's total mobile phone subscriptions. Before providing the dataset, SKT scaled up its records to represent all mobile phone users in Korea, using its market share in the country. This is a great advantage for population and mobility analysis because the dataset represents the total population of mobile phone users.

The daily records consist of 16 columns (Table 1). The data include the number of users in each cell grouped by age and gender, the coordinates of the cell location, and the users' home locations. It should be noted that the records are daily, and a user is not counted more than once in the same cell on the same day. To protect privacy, the data do not store individual logs. For this study, we analyzed two weeks of data from two years (March 16–22, 2015 and March 14–20, 2016). The dataset includes about 160 million records per day on average.

Table 1. Structure of daily mobile phone data.

3 System Structure

3.1 Total Activity Population

The ABATA system first calculates the total activity population from the mobile phone data recorded in the study area. The mobile phone data provide the actual number of people present in the study area at each time. The existing population is the population present in a specific space regardless of the purpose of activity, whereas the activity population is the population performing a specific activity. Because the mobile phone data provide the total existing population by time, they can be used directly to calculate the activity population when they are available. When they are not, the statistics of the National Statistical Office (NSO) are used to estimate the total activity population in the ABATA system.

3.2 Construct Activity Schedule

The ABATA system constructs individual activity schedules from household travel survey data. Seven activity categories (home, work, shopping, leisure, school, education, and others) are defined to construct each activity profile. To do this, all household trip data are converted into individual activity schedules in 10-min increments, and an activity schedule and an activity profile are developed for each hour. The activity profile represents the composition ratio of the activity population by activity purpose. Figure 2 compares the activity profiles when the share of the elderly population increases from 6% to 30%. The hourly total activity population is then combined with the activity profile to calculate the hourly population of each activity.
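The conversion described above can be illustrated with a minimal Python sketch. It is not the ABATA implementation: the trip record layout `(depart_min, arrive_min, purpose)`, the function names, the default starting activity, and the rule that travel time is attributed to the activity just left are all assumptions made for illustration. Only the 10-min resolution, the hourly profile, and the combination with the hourly total come from the text.

```python
from collections import Counter

SLOT_MIN = 10                         # schedule resolution stated in the text
SLOTS_PER_DAY = 24 * 60 // SLOT_MIN   # 144 slots per day
SLOTS_PER_HOUR = 60 // SLOT_MIN       # 6 slots per hour


def trips_to_schedule(trips, start_activity="home"):
    """Convert one person's trips into a 144-slot activity schedule.

    trips: list of (depart_min, arrive_min, purpose) sorted by time, in
    minutes after midnight (hypothetical layout). Travel time is attributed
    to the activity just left, a simplifying assumption of this sketch.
    """
    schedule = [start_activity] * SLOTS_PER_DAY
    for _depart, arrive, purpose in trips:
        # From arrival onward the person performs the trip purpose,
        # until a later trip overwrites the remaining slots.
        for slot in range(arrive // SLOT_MIN, SLOTS_PER_DAY):
            schedule[slot] = purpose
    return schedule


def hourly_profile(schedules):
    """Share of person-slots per activity for each hour of the day."""
    profile = []
    for hour in range(24):
        start = hour * SLOTS_PER_HOUR
        counts = Counter()
        for schedule in schedules:
            counts.update(schedule[start:start + SLOTS_PER_HOUR])
        total = sum(counts.values())
        profile.append(
            {act: n / total for act, n in counts.items()} if total else {}
        )
    return profile


def activity_population(hourly_total, profile):
    """Split the hourly total activity population by activity share."""
    return [
        {act: hourly_total[h] * share for act, share in profile[h].items()}
        for h in range(24)
    ]


# Two hypothetical survey respondents.
person_a = trips_to_schedule([(480, 540, "work"), (1080, 1140, "home")])
person_b = trips_to_schedule([(600, 630, "shopping"), (720, 750, "home")])

profile = hourly_profile([person_a, person_b])
# Hypothetical total of 1000 people present in every hour.
population = activity_population([1000] * 24, profile)
```

In this toy input, respondent A is at work from 09:00 and respondent B starts shopping at 10:30, so the 10:00 profile splits into work, home, and shopping shares that are then scaled by the hourly total.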

Fig. 2. Comparison of activity profiles between the proportions of the older people.

3.3 Develop Activity Attractiveness

The hourly activity population needs to be combined with land use data, since a given activity occurs at a given place or area. For example, shopping activity occurs at a market or department store. The land use data are readily obtained from the statistics of the National Statistical Office (NSO). Land use types and job categories are linked to activity types. Based on these data, multiple regression models are constructed and estimated to obtain activity attractiveness. From the results, the activity attractiveness of each block is estimated relative to the total activity attractiveness in the study area: the ratio of an individual block to the whole study area, for each activity, represents the block's attractiveness for that activity type.
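The normalization step can be sketched as follows. The sketch assumes that regression coefficients have already been estimated. The variable names, coefficient values, and block data are hypothetical, and clipping negative predictions to zero is a choice of this illustration rather than something stated in the text.

```python
def attractiveness_shares(blocks, coefficients, intercept=0.0):
    """Return each block's share of total attractiveness for one activity.

    blocks: {block_id: {variable_name: value}} land use / job indicators.
    coefficients: {variable_name: weight}, e.g. from a fitted regression.
    """
    scores = {}
    for block_id, features in blocks.items():
        score = intercept + sum(
            weight * features.get(name, 0.0)
            for name, weight in coefficients.items()
        )
        # Assumption of this sketch: a negative prediction means no attraction.
        scores[block_id] = max(score, 0.0)
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no attractiveness in the study area")
    return {block_id: score / total for block_id, score in scores.items()}


# Hypothetical coefficients for the shopping activity.
shopping_coef = {"retail_floor_area": 0.25, "retail_workers": 2.0}
blocks = {
    "A": {"retail_floor_area": 400, "retail_workers": 100},  # score 300
    "B": {"retail_floor_area": 400, "retail_workers": 0},    # score 100
    "C": {"retail_floor_area": 0, "retail_workers": 0},      # score 0
}
shares = attractiveness_shares(blocks, shopping_coef)
```

The shares sum to one over the study area, so they can be used directly as weights when distributing an activity population or generated trips over blocks.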

3.4 Estimate Activity Population

The ABATA system calculates the activity population and distributes it over space and by hour (Fig. 3). To develop the O/D, the amount of travel and the destinations are calculated first. Some of the people engaged in a given activity at the present time make a trip to their next activity. To identify them, we extract conditional travel probabilities by activity type and group from the household travel data by time and activity, apply these trip occurrence probabilities to each time and activity, and calculate the amount of travel generated per hour. The destination of each generated trip is chosen according to the spatiotemporal activity attraction. Finally, the O/D is developed for 24 h, 7 activities, and counties (Fig. 3). Based on the estimated O/D, the mode choice model accounts for differences in travel mode choice across demographic characteristics: the ABATA system examines the explanatory variables (trip purpose, travel time, and population characteristics such as sex, age, and occupation) and runs the decision tree model.
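The trip generation and destination choice steps for a single hour can be sketched as below. This is an illustration under stated assumptions, not the ABATA code: the zone names, probabilities, and data structures are hypothetical, and allocating generated trips to destinations in proportion to attraction shares is one simple reading of "decided from the spatiotemporal activity attraction".

```python
def build_od(activity_pop, trip_prob, attraction):
    """Build O/D flows for one hour.

    activity_pop: {(zone, activity): people currently doing the activity}
    trip_prob: {activity: {next_activity: P(trip to next | activity)}}
    attraction: {next_activity: {zone: share of attractiveness}}
    Returns {(origin, destination, next_activity): trips}.
    """
    od = {}
    for (origin, activity), people in activity_pop.items():
        for next_activity, prob in trip_prob.get(activity, {}).items():
            generated = people * prob  # trips leaving this zone and activity
            for dest, share in attraction[next_activity].items():
                key = (origin, dest, next_activity)
                # Proportional allocation; intrazonal trips are kept.
                od[key] = od.get(key, 0.0) + generated * share
    return od


# Hypothetical inputs for one morning hour and two zones.
activity_pop = {("Z1", "home"): 1000, ("Z2", "work"): 400}
trip_prob = {"home": {"work": 0.25}, "work": {"home": 0.5}}
attraction = {
    "work": {"Z1": 0.25, "Z2": 0.75},
    "home": {"Z1": 0.5, "Z2": 0.5},
}
od = build_od(activity_pop, trip_prob, attraction)
```

Repeating this for each of the 24 hours, with hour-specific probabilities and attraction shares, would give O/D tables by hour and activity in the sense described above.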

Fig. 3. Activity population and origin-destination flows.

3.5 Simulation Activity Population

The ABATA system is currently being developed for Gangnam in Seoul. The system can visualize the results of various scenarios. It lets the user select scenarios and compare the resulting changes, and it supports scenario-specific simulations (such as land use change, population composition change, and policy change). Figure 4 shows maps of the simulation results for the scenario in which a new skyscraper, the Hyundai GBC (Global Business Center), is constructed. The figure compares the number of people in work activity by time of day, making it easy to see where and when the activity population changes.

Fig. 4. Activity population changes from the Hyundai Global Business Center Construction.

4 Conclusion

The Activity-BAsed Traveler Analyzer (ABATA), based on Big Data, is expected to be a reliable system for providing information on changes in population, activity schedules, and land use. It will therefore help transportation decision makers introduce new transport systems and make current systems more efficient.

All human beings perform essential activities and generate movements in order to perform them. Therefore, the developed traveler analysis system can be applied not only to travel demand forecasting and transportation policy analysis but also to environmental impact analysis (e.g., micro dust exposure), location analysis, emergency evacuation planning, and tourism-related system planning.