Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

A street crime threatens safety and security our daily life. It is a major concern for every local government to prevent citizens from getting involved in the crimes. There are many incidents reported every day, including murder, robbery, snatching, groping, and observation of dangerous animals. In order for citizens to protect themselves, it is important to understand when, where, and what kind of incidents can occur around living areas.

There have been social studies that investigate relevant factors of street crimes. For instance, Hipp and Kane showed that cities with more population will experience larger increases in crime [2]. They also mentioned that cities with increasing population will experience larger decreases in crime.

Although their findings are interesting, the investigation is basically conducted at the magnification of city or ward level. Thus, the information is too coarse-grained for citizens to know the situation of individual living area.

In recent years, due to the emerging trend of smart cities, motivated local governments are publishing fine-grained demographic data and information of street crimes as open data. Kobe, our local city, is one of such motivated cities in Japan.

The goal of this paper is to conduct micro-level analysis of street crimes in Kobe, using such fine-grained open data. Specifically, at a microscopic range of town blocks, we aim to answer the following three research questions:

  • RQ1: Does a town block with increasing population experience the decrease of street crimes?

  • RQ2: Does a town block with more population experience the increase of street crimes?

  • RQ3: How are occurrences of street crimes distributed and related among micro town blocks within the city?

In order to answer RQ1 and RQ2, we extensively use demographic open data published by Kobe city. The data include fine-grained population data of every ward, every town in the ward, every town block in the town, each of which is grouped by gender and age. Therefore, we can view the population in the region from various viewpoints. In addition, since the demographic open data is updated every quarter of the year, we can investigate the temporal transition of the population.

However, the original demographic data consists of a huge number of data items over multiple tables, which causes artificial complexity in retrieving necessary data. Therefore, we develop API, called Kobe Demographics API, which allows external tools and applications to easily access the large and complex demographic data. Using the API, one can efficiently query demographic data at various viewpoints and granularity, such as period, location, gender, and age.

As for the information of street crimes, we use data published by Hyogo Bouhan Net, which is a security information service operated by Hyogo prefectural Police. The service delivers recognized incident information within Hyogo prefecture (i.e., prefecture where Kobe city exists) by Web or e-mail.

In our previous research, we developed a security information service, called PRISM [3], which visualizes street crimes on a personalized security map. In developing PRISM, we implemented incident API to query the incident information from Hyogo Bouhan Net. Using the incident API, one can retrieve incident information by keywords, type, time, location, distance from central coordinates.

Combining the proposed Kobe demographics API, and the previous incident API, we can easily integrate fine-grained demographic data and incident information. Thus, we can conduct the micro-level analysis of population and street crimes to answer RQ1 and RQ2.

To address RQ3, we develop a new tool that visualizes fine-grained statistics of incidents on a map. The tool divides the whole region into multiple small square areas and calculates the frequency of street crimes in each area. The frequency is represented by a color of ten kinds, according to the number of crimes occurred within the area. The higher the frequency is, the deeper red is displayed. The lower the frequency is, the darker green is displayed. By using the tool, it is possible to see the distribution over micro town blocks, to know where street crimes frequently occur and its surroundings.

Analyzing the street crimes from a micro viewpoint, a local government can take precise policy actions based on the trends of individual town blocks. A police department can also benefit the micro-level analysis for efficient patrols within the area. Compared the conventional coarse-grained statistics, the fine-grained incident information give much more reality and impact to citizens, because the information of neighboring towns or town blocks directly influence to their daily life. For example, in areas where groping and exposure frequently occurs, a woman must be careful when she walks alone. Thus, we believe that the proposed micro-level analysis significantly contributes to citizens to achieve security and safety by themselves.

2 Preliminaries

2.1 Demographic Open Data of Kobe City

Open data refers to data that anyone can freely use and share without the permission of copyright. Recently, various governments and municipal bodies have published their data as open data. They aim to improve the quality of life of people, as well as the performance of business activities of enterprises [6].

Kobe city is operating a portal site of open data, where the city is disclosing various municipal data such as population, information of facilities, and time table of subway [5]. Among the various open data, demographic open data is quite interesting and informative. The data contains fine-grained population data for every town block within Kobe city, which is much more detailed than the magnitude of town or ward. Moreover, the population is grouped by gender and age for each ward, town, town block. Kobe city updates these data every quarter of the year. Thus, we can also see the transition of the fine-grained population for every quarter.

Although the open data is promising for digital innovation, it is not yet understandable by machines. The current major formats of open data are CSV (Comma-Separated Value), XLS format (Microsoft Excel), or PDF (Portable Document Format), which do not define machine-readable semantics for data items. Hence, to retrieve necessary data items, an application developer has to understand data structure and semantics, and implement proprietary program for parsing the data. This causes the artificial complexity in using open data, which is a barrier for private companies and general citizens to create new applications and services.

2.2 Security Information Service PRISM

In Japan, many local governments provide security information service for citizens. The service usually distributes information of crimes and incidents to residents using the Internet. Residents can make use of the information for avoiding crimes. For example, Hyogo Prefectural Police provides “Hyogo Bouhan Net”, which distributes incident information recognized by the Police on the Web and e-mail. In these existing security information services, however, every incident information is uniformly delivered to all users. Hence, when much information is delivered in a day, a user may miss important information.

In our previous research, we have proposed a new security information service, called PRISM (Personalized Real-time Information with Security Map), which personalizes the incident information based on living area of individual users. PRISM calculates the severity of each incident based on user’s living area, current time, and type of incident. Then, PRISM visualizes weighted incidents on a heat map. Thus, a personalized and real-time security map is generated.

We used incident information provided by Hyogo Bouhan Net in PRISM. PRISM periodically obtains incident information from Hyogo Bouhan Net. It then analyzes the text written in natural language, structures the text into attributes, and inserts the attributes into the RDB. We also developed Web-API in order to access the DB. External applications can easily retrieve incident data using the API.

2.3 Related Work

Many studies have been conducted to clarify factors relevant to street crimes.

Hipp and Kane focused on the relationship between population and crime [2]. They state that cities with more population will experience larger increases in crime. They also state that cities with increasing population will experience larger decreases in crime. Morenoff and Sampson [7] investigated the relationship between the decrease of population and the violent crime. They also visualized changes in the population. Roncek [8] statistically examined how the characteristics of a region affect places of crimes, using actual data of San Diego and Cleveland. Stults and Hasbrouck [9] focused on the relationship between commuting and crime rates. They showed that a city with many workers commuting from outside tends to have higher crime rate, since the workers often become targets of a crime. Kester [4] analyzed and visualized the patterns of crimes, using Formal Concept Analysis.

Although the above findings are quite interesting, they mainly focus on interesting regions as a whole. In other word, the investigation is basically conducted at the magnification of city or ward level. Thus, the information is too coarse-grained for citizens to know the situation of individual living area.

As for the efficient use of open data, some motivated governments start to provide API (Application Program Interface) for developers. For instance, Japanese Cabinet Office is operating RESAS (Regional Economy Society Analyzing System) [1]. It aggregates and visualizes big data including industrial structure, demographic data, and flow of people. Since 2016, RESAS starts to provide API, with which anyone can easily access various data in RESAS. However, the API is not specifically designed for demographic data. It just returns the whole data of population. Also, the demographic data provided by RESAS is limited to the scale of city, and not grouped by age or gender. This is much coarser than the demographic open data of Kobe city.

3 Kobe Demographics API

3.1 Motivation

As we mentioned in Sect. 2.1, Kobe city is publishing demographic open data. To understand the data, we here explain the following terms, constructing the structure of Kobe city.

  • City: a whole city of Kobe.

  • Ward: a division of a city. There are 9 wards in Kobe city.

  • Town: a populated area in a ward. Each ward consists of many towns.

  • Town Block: an area of a town, usually surrounded by streets. Each town consists of many town blocks. A town block forms the basic unit of urban fabric in Kobe city, where several houses or building are builtFootnote 1.

The demographic open data contains population for every town block within the city. Due to the above structure of the city, we can obtain different granularity of population at town, ward, and city levels. The population is also grouped by gender and ages.

The original demographic data is provided in the XLS format with a number of cells over multiple sheets. Therefore, when using the data for various purposes, we face with the problem of artificial complexity. To cope with the problem, we here develop Kobe Demographics API, which makes it easy to access the demographic data. In the following subsections, we will explain the flow of implementation of the API.

3.2 Inserting Data into Database

In order to develop the API, we first insert demographic data into a relational database (RDB).

To manage the demographic data, we create a database with two tables: population, and town. The population table is defined by the following data schema:

[Population]

  • blockCode: a code identifying a town block

  • year: year of census

  • month: month of census

  • household: the number of household

  • male: the population of men

  • female: the population of women

  • total: the total population in the block

  • 0: population of 0-year-old persons

  • 1: population of 1-year-old persons

  • 2: population of 2-year-old persons

  • ... : (columns for x-year-old persons)

  • 99 population of 99-year-old persons

  • 100 population of over-100-year-old persons

The town table manages information of town blocks, including address, town, and ward. In addition to them, for each town block, we associate latitude and longitude of the block. This is for convenience for external applications. The latitude and longitude are calculated from the address by using geographic information Web services. The town table is defined by the following data schema:

[Town]

  • blockCode: a code identifying a town block

  • address: the address of the town block

  • wardName: the ward which the town block belongs to

  • blockName: the name of town block

  • wardCode: a code of the ward which the town block belongs to

  • longitude: longitude of the town block

  • latitude: latitude of the town block

We have used MySQL to implement the database. We also developed a set of Perl scripts, which automatically parse the original XLS files of demographic data, retrieve data items and structures, and insert the data items into the above tables of the database.

3.3 Developing Kobe Demographics API

We developed two API to access the demographic data stored in the databases.

The first API is getPopulation(), which returns fine-grained population of a designated town block with a set of parameters.

getPopulation(blockCode, wardCode, year, month, gender, from, to)

parameters:

  • blockCode: specifies a town block to query

  • wardCode: specifies a ward to aggregate the population by ward

  • year: specifies a year of census

  • month: specifies a month of census

  • gender: specifies a gender to aggregate the population by gender

  • from, to: specifies a range of age to aggregate the population by age

For example, when one likes to retrieve population from 0 to 10 years old, in the town with blockCode 101001001, in June 2017, executing getPopulation(“101001000”, -, 2017, 6, -, 0, 10) returns the desired result. The resultant JSON is as follows:

figure a

By changing these parameters, one can retrieve demographic data at various viewpoints and granularity such as period, location, gender, and age.

The API getPopulation() requires a block code as a parameter. However, it is not easy for a general user to specify the block code directly. Hence, we implemented searchTown(), which is the second API.

searchTown(address)

Parameters:

  • address: specifies an address to query

This API returns a town block (or a set of town blocks involved) for the designated address. For a given address, searchTown() searches data items in the town table where the address contains the address. It then returns the information in JSON format. For example, when one likes to retrieve block code of “1-chome Uozaki Kita-machi”, executing searchTown(“1-chome Uozaki Kita-machi”) returns the desired result. The resultant JSON is as follows:

figure b

The above two API are deployed as Web service, so that applications running on various languages and platforms can easily consume the data.

3.4 Implementation

Techniques used in developing Kobe Demographics API are as follows:

  • Demographic Data: Kobe Open Data - Basic Resident Register of Kobe city (Population per town blocks, per age)Footnote 2

  • Data Parsing Script: Perl, Yahoo! GeoCoder API

  • Database: MySQL

  • API: Java JDK8, Jersey.

4 Micro-level Analysis of Street Crimes

To address the research questions introduced in Sect. 1, in this section, we investigate the relation between population and street crimes using the developed Kobe Demographics API.

4.1 Analyzing Relationship Between Population Change and Street Crime in Town Blocks

In order to answer RQ1, we here analyze the relationship between street crimes and the change in population for every town block in Kobe city. The procedure of the analysis consists of the following steps:

  • Step1: Calculate security index of each town block

  • Step2: Look up the increase and decrease of street crimes within each town block between two years

  • Step3: Calculate population growth rate of each town block for two years

  • Step4: Analyze the relation between the increase and decrease of crimes and the population growth rate.

In Step1, we define a metric called security index, indicating how often street crimes occurred within a town block compared to other blocks. Inspired by ABC analysis [10] in materials management, the security index divides town blocks into three categories: A (crimes rarely occur), B (crimes sometimes occur), or C (crimes frequently occur).

Let SI(by) denote the security index of a town block b for a year y. To calculate SI(by), we first count the number of incidents occurred (i.e., frequency) within every block \(b_i\) within the city for y, denoted by \(f(b_i, y)\). Next, for each value of frequency \(n (n=1,2,3,\ldots )\), we count the number B(n) of town blocks where \(f(b_i, y) = n\). We then calculate the cumulative composition ratio of B(n) for \(n = [1\ldots max(n)]\). This is to identify two thresholds \(n_1\) and \(n_2\) that determine categories A, B and C. There are no fixed thresholds for \(n_1\) and \(n_2\). However, for example, \(n_1\) is determined at the point where approximately 70% of the town blocks are involved in category A. \(n_2\) is determined at the point 20% of the rest belongs to B. Finally, SI(by) is determined by:

$$ SI(b,y) = \left\{ \begin{array}{l} A\,\cdots \,(0< f(b, y) \le n_1)\\ B\,\cdots \,(n_1< f(b, y) \le n_2)\\ C\,\cdots \,(n_2 < f(b, y)) \end{array} \right. $$
Fig. 1.
figure 1

Relationship between the incident frequency and the number of town blocks in Kobe in 2016

Fig. 2.
figure 2

Relationship between the incident frequency and the number of town blocks in Kobe in 2017

Figures 1 and 2 respectively show the number of town blocks B(n) w.r.t. the crime frequency n in year 2016 and 2017. The curve in the chart represents the cumulative composition ratio.

It can be seen in the figures that more than 70% town blocks with incidents has only one crimes. It is also seen that the number of town blocks exponentially decreases as the frequency increases. Based on the observation, we set \(n_1 = 1\) and \(n_2 = 2\). Thus, the security index divides the town blocks, where blocks with only one incident occurred in a year as A, blocks with two incidents as B, and blocks with three or more incidents as C.

In Step 2, we represent the increase and decrease of street crimes of each town block for two years. More specifically, for a given town block b, and two years y and \(y' (>y)\), we use \(SI(b,y)\,.\,SI(b,y')\) to represent the change of security, where . is a concatenation operator. Since SI() takes a value of A, B, or C, we have totally nine combinations from AA, AB,..., CC. For example, AB represents the change of a town block, where the first year is A and the second year is B. Thus, AB represents that the situation got worse a little bit. Similarly, CA represents that the street crime was dramatically decreased. In this paper, we set \((y, y') = (2016,2017)\) to examine the change of street crimes from 2016 to 2017.

In Step3, we calculate the population growth rate in two years y and \(y'\). Let p(bym) denote the population of a town block b at month m of year y. Then, the population growth rate \(r(b, y, y')\) of b between y and \(y'\) is defined by:

$$\begin{aligned} r(b,y,y') = \frac{p(b, y',m)-p(b, y,m)}{p(b,y,m)}\times 100 \end{aligned}$$

where m is a certain month. p(bym) and \(p(b,y',m)\) can be easily obtained by getPopulation() of Kobe Demographics API described in Sect. 3. In this analysis, we set \((y, y') = (2015,2017)\) and \(m=September\).

To represent the magnitude of population growth, we divide town block into the following seven categories:

  • [stable] \(\cdots -r_0 \le r(b,y,y') < r_0\)

  • [a bit decreased] \(\cdots -r_1 \le r(b,y,y') < -r_0\)

  • [decreased] \(\cdots -r_2 \le r(b,y,y') < -r_1\)

  • [significantly decreased] \(\cdots r(b,y,y') < -r_2\)

  • [a bit increased] \(\cdots r_0 \le r(b,y,y') < r_1\)

  • [increased] \(\cdots r_1 \le r(b,y,y') < r_2\)

  • [significantly increased] \(\cdots r_2 \le r(b,y,y')\)

According to the value of \(r(b, y, y')\), we categorize a town block b into one of the above seven classes. In this paper, we set \(r_0\) as 1.0, \(r_1\) as 5.0, and \(r_2\) as 10.0 respectively.

Finally, in Step 4, we see the relationship between the change of crime and the change of population growth. More specifically, we create a table, where a row represents the change of the security index, and a column represents a class of population growth. Each cell represents the number of corresponding town blocks with respect to the security index and the population growth.

Table 1. Relationship between change of the security index and the population growth

Table 1 summarizes the table. Rows AC, BC, AB represent the city blocks where the number of crimes increased from 2016 to 2017. Rows CC, BB, AA represent city blocks where the situations did not change. Rows BA, CB, CA represent city blocks where the number of crime decreased. From the table, the following facts can be observed:

  • The number of town blocks belonging to BA, CB, CA is larger than the number of those belonging to AC, BC, AB. This indicates that the total number of recognized crimes decreased from 2016 to 2017.

  • The number of town blocks with AA is large. It means that in a city block where the crimes rarely occurred in the previous year, the crimes rarely occur in the next year.

  • In the town blocks where the population significantly increased (see the last column), the number of town blocks where crimes decreased is larger than those of increased.

4.2 Analyzing Relationship Between Population Size and Street Crime in Town Blocks

In order to answer RQ2, we analyze the relationship between the street crimes and population size for every town block in Kobe city. The procedure of the analysis consists of the following steps:

  • Step 1: Calculate the security index of each town block in Kobe city

  • Step 2: Look up the increase and decrease of street crimes within each town block between two years

  • Step 3: Examine the population of each town block at a certain period

  • Step 4: Analyze the relation between crimes and the population

Step 1 and Step 2 are same as Sect. 4.1. We calculate the security index of each town block in Kobe city, and represent changes in the security index by two letters of the alphabet.

In Step 3, we examine the population of each town block at month m of year y. Let p(bym) denote the population of a town block b at month m of year y. p(bym) can be easily obtained by getPopulation() of Kobe Demographics API described in Sect. 3. In this analysis, we set \(y = 2017\) and \(m = September\). Moreover, we divide town blocks into the following three categories:

  • [large population] \(\cdots p_0 \le p(b,y,m)\)

  • [middle population] \(\cdots p_1 \le p(b,y,m) < p_0\)

  • [small population] \(\cdots p(b,y,m) < p_1\)

According to the value of p(bym), we categorize a town block b into one of the above three classes. In this paper, we set \(p_0\) as 514, \(p_1\) as 187 so that the number of town blocks included in the three classes becomes same.

Finally, in Step 4, we see the relationship between the change of crime and the population. In the same way as Sect. 4.1, we create a table, where a row represents the change of the security index, and a column represents a class of population. Each cell represents the number of corresponding town blocks with respect to the security index and the population. Table 2 summarizes the table.

Table 2. Relationship between security index and population in town blocks

From the table, the following facts are observed:

  • Overall, more town blocks with large population appear in the table than town blocks with small population. It turns out that crimes rarely occur in town blocks with small population.

  • In town blocks with large population, there are more town blocks belongings to BA, CB, CA than AC, BC, AB.

  • All of town blocks belongings to AC have large population.

4.3 Analyzing Distributed and Related Among Micro Town Blocks About Street Crimes

We think about RQ3 mentioned in Sect. 1. In order to answer to RQ3, we develop a system that divides Hyogo Prefecture into mesh-like areas and visualizes based on the frequency of street crimes in the area. We show a screenshot of the system in Fig. 3.

Fig. 3.
figure 3

A screenshot of prototype tool; showing occurrence of street crimes in Kobe area, since 2016-01-01 until 2016-12-31 (Color figure online)

Fig. 4.
figure 4

Showing occurrence of street crimes in Kobe area, since 2016-01-01 until 2016-12-31 (Color figure online)

We explain the flow of visualization by the system. First, the system divides Hyogo Prefecture into mesh-like areas and counts the number of incident within each area. Next, the system colors each area with 10 colors according to the frequency. The area displayed in deep red has the high frequency in Fig. 3. The number of incident decreases in the order that the color changes to orange, yellow, yellowish green, and green. Incidents hardly occur in the uncolored areas. In the system, the user can specify aggregation period of the incident and the length of the side of areas.

For example, by changing the aggregation period, it is possible to visually analyze where incidents frequently occurred in each period. The frequency in the center of Kobe city in 2016 is visualized in Fig. 3. Similarly, the frequency in 2017 is shown in Fig. 4. From Figs. 3 and 4, the following facts are observed:

  • Incidents occurred intensively in one area in 2016

  • There are multiple areas where many incidents occurred in 2017

  • In 2017, not only in the vicinity of the station in the center of the figure, but also in some slightly distant areas are colored in yellow.

5 Discussion

We answer RQ1 to RQ3 mentioned in Sect. 1.

Regarding RQ1, from Table 1, the number of street crimes often decreases or does not change in the town blocks with increasing population. This result is the same as the city or ward level analysis. Regarding RQ2, the number of street crimes has a tendency to decrease in the streets with large population from Table 2. It is the opposite result to RQ2. In analysis we conducted in this paper, we used incident data only in 2016 and 2017 only. We also use population data in two years. We would like to consider analysis using more data in the future.

Regarding RQ3, it is found that many street crimes occurred around the central station from Figs. 3 and 4. On the other hand, there are a few areas where many street crimes occurred even if the area is a little far from the station. Also, the location where many incidents occurred are quite different when we compare the two figures. However, we also used data only in 2016 and 2017 in this analysis. Therefore, it is necessary to use more data and conduct analysis by changing the aggregation period in the future.

6 Conclusion

In this paper, we analyzed the street crimes from a micro viewpoint by integrating incident data from Hyogo Prefectural Police and population data of Kobe city. Focusing on the number of street crimes in 2016 and 2017, we examined the relation between street crimes and population in town block level. We also develop a system that divides the area into mesh-like areas and visualizes the frequency of street crimes. Analyzing the street crimes from a micro viewpoint, a local government can take precise policy actions. A police department can also benefit the micro-level analysis for efficient patrolling within the area.

We used incident data and population data for two years at this time. We plan to conduct analysis using more data as a future work.