
1 Introduction

Urban planning, building prototype design and interior design are becoming increasingly common in a fast-growing industry. This inspires the authors to build a system that makes prototype design more effective and intuitive by using virtual reality technology and the Leap Motion controller. The system also provides interactive analysis and communication between investors and architects. Since customers are usually not sure what they really want, it is the architects' job to design and propose ideas to them through meetings and presentations, a process that usually takes from weeks to months. The authors propose a system that allows architects to present the prototype to investors, receive their feedback and adjust the prototype instantly in real time without having to leave their offices. This saves a great deal of time in the analysis procedure needed for investors and architects to reach an agreement.

Furthermore, the system allows collaboration between architects: each architect or team contributes to a separate area to avoid conflicts between members and can commit their work easily in real time; as they commit their work, the whole virtual environment is updated for everyone. This feature is innovative because it provides an intuitive perspective for designing construction prototypes. Hence, the team can easily compromise during design and reach an agreement on the construction prototype in real time.

There are many visual and computational tools and applications for city planning, and there are even tools for collaboration and teamwork across multiple computers. However, despite all the visual technology a computer screen can display, it would be more productive and innovative if architects could see the whole project in a true 3-dimensional environment and make adjustments using hand gestures instead of mouse manipulation. Gestures give users an interactive experience with the model; in particular, gesture-based manipulation is easier in the virtual world. In this article, we propose a system called City Planning in which architects can visualize an entire city planning project in a virtual 3-D world by applying virtual reality technology and interact with it via gestures. The authors use the Oculus Rift as the virtual reality device and the Leap Motion controller for gesture recognition. The contributions are as follows:

  • Architects can collaborate, discuss and design a city together in real time without having to leave their offices. All actions are synchronized between all users. Each member or team contributes to an area or part of the project; they can commit their work separately to avoid conflicts. The environment is updated once they have committed their changes.

  • The system can also be used to present the project to clients. It simulates a virtual environment of the city project in front of the users, who can communicate with each other through a voice-call application such as Skype or Hangouts. Architect users can modify, insert and delete objects, i.e. they have complete control over the system. Client users such as investors, contractors or venture capitalists are given a limited set of tasks such as rotating, zooming and pointing at objects. Therefore, clients can give feedback on the project being presented, and the architects or design team can adjust the prototype instantly until both sides agree on the design.

In this article, the authors walk through, step by step, the techniques used to build such a system. Section 2 introduces the background concepts and related work. Section 3 introduces the proposed system in general. Section 4 describes the architecture used to develop the application. Section 5 presents the techniques used to recognize the user's hand gestures. Section 6 gives an insightful view of the applications and usage scenarios of the system. Conclusions and future work are discussed in Sect. 7.

2 Background & Related Work

2.1 Virtual Reality Technology

Virtual reality (VR) is described as a 3-dimensional, computer-generated environment which can be explored and interacted with by a person [12]. The technology has been around since the 1990s but was not widely known to the public. Since 2010, with the rapid technological growth of the 21st century, large tech corporations such as Google, Facebook, Sony and Samsung started to invest in this technology and make it available worldwide to ordinary users. In 2014, Facebook bought Oculus VR for $2 billion, Google introduced Cardboard and Sony announced Project Morpheus (the code name for PlayStation VR). In 2015, Samsung released the Gear VR [8], its first portable VR device, which works with Samsung smartphones. Realizing the potential of VR technology over the next few years, the authors decided to integrate it into this project in order to bring immersive graphics and an intuitive experience to users. Furthermore, VR has been applied in many other fields such as education and healthcare [11].

2.2 City Planning

City planning, urban zoning, and the design of buildings and structures have always been time-consuming tasks. There are many phases and procedures to go through, from proposing ideas, seeking investment, presenting and adjusting the designs, before an idea is approved and construction can begin. These procedures usually take months or even years, and many projects fail to get funding because the investors cannot see their potential; in many cases this is due to a lack of visualization and an intuitive picture of how the project would look in reality.

There are also many aspects to consider when planning constructions, such as the physical environment (location, climate, resources, etc.), the social environment (organizing the city into proper areas to satisfy social needs) and the economic environment (support for businesses) [1]. It is useful to inspect these variables in a 3D rendering application. VR has also been applied to urban or city planning several times before, as in [12–14].

There have been many software systems that support city planning. Esri City Engine [5] appears to be one of the most popular and powerful. It is an advanced city planning package with many essential and useful features: the city can be viewed from any angle, with options such as lighting, shadowing and a skybox. The content of the city can go down to fine detail, including flyovers, bridges, parking lots, harbors, etc., and the details are close to photorealistic. Figure 1 shows the main UI of Esri City Engine.

Fig. 1. Esri City Engine overview

Cybercity3D [6] is another popular software package that supports city planning. It focuses on creating realistic 3D models from the real world; these models can be imported into different 3D rendering engines, including Esri City Engine and the Unity engine.

In this paper, we focus only on creating a simple prototype of a city or urban area that still gives a full representation of the city. However, by integrating VR into this project, we want to take city planning to a new level; that is, to give users an immersive experience by letting them interact with the model in a 360-degree visualization, and furthermore to help users inspect their project from a first-person perspective, i.e. to walk around streets and buildings with realistic physics. A real city has many details, from basic ones such as roads, buildings and trees to more complex ones like intersections, roundabouts, flyovers, signs, traffic lights, etc. To keep it simple, we only include the most basic components in the application: roads, buildings and trees.

2.3 Collaboration in Human Computer Interaction

While VR has been widely used, even to support robotic systems for the disabled [11], there would be even more applications if the people in a VR system could collaborate. Collaboration in human-computer interaction can be applied in many fields: learning, designing, simulation, presentation, etc. Huang et al. [2] built a collaborative virtual reality learning system for medical education. Dos Santos et al. [3] proposed a multi-user interactive simulation of the oil and gas workflow to predict the outcome of an experiment before it is carried out in real life. Another interesting application is the collaborative big data visualization and exploration system in virtual reality from Donalek et al. [4], which makes the data mining process easier and more natural (Fig. 2).

Fig. 2. From left to right: the collaborative virtual environment in medical learning, in oil and gas workflow simulation, and in big data visualization and exploration.

Inspired by the idea of a collaborative virtual environment, we wish to help architects and designers develop their ideas together in VR. In addition, by integrating a collaborative contribution feature, we make communication between architects and clients easier. We give clients and investors the ability to experience the city prototype, share their ideas and feedback, and watch the architects adjust the prototype instantly, thereby shortening the project analysis time between architects and investors. There will be fewer presentation slides and pictures in fund-raising meetings, and instead an intuitive way to experience the project in VR.

3 Proposed System

Our idea is to establish a 3-dimensional environment in which users are connected via internetworking protocols. Users are categorized into different classes, and each class is given different permissions and controls over the model. In this paper, we introduce two primary user classes: architect and client. Architect users are able to inspect (rotate, zoom), modify, insert and remove objects; client users are more limited since they are only allowed to inspect the model. We use the Unity 3D game engine as our 3D renderer; the Oculus Rift and Leap Motion provide visualization and gesture recognition, respectively.
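
To make the two user classes concrete, the following minimal C# sketch shows one possible way to encode the roles and their permissions; the type and member names (Permission, UserRole, Roles.For) are illustrative assumptions, not the actual implementation.

```csharp
using System;

// Sketch of the two user classes and their permissions (names are illustrative).
[Flags]
public enum Permission
{
    None    = 0,
    Inspect = 1 << 0,  // rotate, zoom, point at objects
    Modify  = 1 << 1,  // move, scale, change properties
    Insert  = 1 << 2,
    Remove  = 1 << 3
}

public enum UserRole { Architect, Client }

public static class Roles
{
    // Architects get full control; clients may only inspect the model.
    public static Permission For(UserRole role) =>
        role == UserRole.Architect
            ? Permission.Inspect | Permission.Modify | Permission.Insert | Permission.Remove
            : Permission.Inspect;
}
```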

3.1 Unity 3D Engine

We use Unity as our 3D rendering engine since it provides a flexible and powerful development platform for creating 3D environments. We can easily define building models or import models from other 3D platforms. Unity also provides physics and visual effects that we can integrate and inspect in our environment. In addition to its powerful 3D support, Unity allows multiplatform development, which gives us the ability to export our work to multiple operating systems, especially mobile platforms.

3.2 Oculus Rift with Leap Motion Technology

The combination of VR and gesture recognition in our project brings the user experience to a whole new level. We chose the Oculus Rift as our VR device since it is well integrated with Unity and its SDK supports multiple platforms, from Windows and Mac OS to mobile. We use the Leap Motion controller as our primary gesture recognition device, as it provides detailed detection of arm and hand structures such as fingers and joints. Leap Motion also recognizes simple hand gestures such as swipe, screen tap, key tap and circle; these gestures are then translated into control commands like mouse click, mouse drag, or screen zoom.

3.3 Application Overview

Figure 3 shows the UI overview of the application we built. There are four main components:

Fig. 3. The UI overview of the application. Click Item to reveal other modes like selection, manipulation, etc.

  • Item box: appears in Item mode and shows the list of available items (buildings, trees, roads, etc.). When we click on an item, it becomes active (ready to be drawn). We can select several items and mix them together when drawing onto the map.

  • Properties box: appears when we select an item, showing its properties.

  • Manipulation box: appears when we switch to Manipulation mode and shows the list of available controls, including selecting, transforming, scaling, etc.

  • Users list: the list of connected users. When we click on a user, we can see their perspective.

3.4 Usage Workflow

Our goal is to provide a fast way to create a city planning prototype. To do this, we focus on dividing the city into multiple areas. Instead of placing items one by one, each area can be auto-generated with custom properties such as the list of models to generate, their probability of occurrence, etc.

Figure 4 illustrates the normal usage workflow of the application. The four main steps are: design the road system, generate city areas, modify the details (change, add, delete items) and merge the areas together. In this way, multiple users can control different areas independently and then connect their work afterwards.

Fig. 4. Normal usage workflow

Designing the Road System.

Designing a realistic road system is a very difficult problem. In this application, we only consider a simple kind of road, with a fixed width and a straight direction. Intersections are generated automatically when two roads cross each other.
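
As an illustration of this automatic intersection step, the sketch below checks whether the centrelines of two straight roads cross and, if so, returns the crossing point where an intersection could be generated. The RoadSegment type and method names are assumptions made for this sketch, not the application's actual API.

```csharp
using UnityEngine;

// Minimal sketch: roads are straight segments on the ground plane (XZ),
// and an intersection is generated where two centrelines cross.
public struct RoadSegment
{
    public Vector2 Start, End;   // XZ coordinates on the map
    public float Width;          // fixed width for every road in this prototype
}

public static class RoadIntersections
{
    // Returns true and the crossing point if the two road centrelines intersect.
    public static bool TryIntersect(RoadSegment a, RoadSegment b, out Vector2 point)
    {
        Vector2 r = a.End - a.Start;
        Vector2 s = b.End - b.Start;
        float cross = r.x * s.y - r.y * s.x;
        point = Vector2.zero;
        if (Mathf.Approximately(cross, 0f)) return false;   // parallel roads

        Vector2 d = b.Start - a.Start;
        float t = (d.x * s.y - d.y * s.x) / cross;
        float u = (d.x * r.y - d.y * r.x) / cross;
        if (t < 0f || t > 1f || u < 0f || u > 1f) return false;

        point = a.Start + t * r;   // an intersection tile would be placed here
        return true;
    }
}
```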

Generating City Areas.

This is the most important part of using the application. You can auto-fill an area of the city with a custom-defined size, as large as the whole city or as small as a tree zone. The generated items are automatically kept offset from the roads.
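
A minimal sketch of such an auto-fill step is shown below: it scatters a given number of prefabs over a rectangular area, picking each model according to its occurrence probability. The class and field names (AreaFiller, ModelEntry) are illustrative, and the road-clearance check is omitted for brevity.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Illustrative sketch of area auto-fill with weighted model selection.
public class ModelEntry
{
    public GameObject Prefab;      // e.g. a building or tree model
    public float Probability;      // relative chance of being picked
}

public class AreaFiller : MonoBehaviour
{
    public List<ModelEntry> Models;
    public Rect Area;              // area bounds on the XZ plane
    public int Count = 50;         // how many items to place

    public void Fill()
    {
        float total = 0f;
        foreach (var m in Models) total += m.Probability;

        for (int i = 0; i < Count; i++)
        {
            // Pick a model by weighted probability.
            float pick = Random.value * total;
            GameObject prefab = Models[Models.Count - 1].Prefab;
            foreach (var m in Models)
            {
                if (pick < m.Probability) { prefab = m.Prefab; break; }
                pick -= m.Probability;
            }

            // Drop it at a random position inside the area (road clearance omitted here).
            var pos = new Vector3(
                Random.Range(Area.xMin, Area.xMax), 0f,
                Random.Range(Area.yMin, Area.yMax));
            Instantiate(prefab, pos, Quaternion.identity, transform);
        }
    }
}
```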

4 System Architecture

4.1 Architecture Overview

Figure 5 shows the general architecture of the application. There are four main components: the Object Raw API, the Raw Network API, the Event System and the Items System.

Fig. 5. The general architecture of the application

Object Raw API.

This is the abstraction of item manipulation on the map. It provides raw item control, such as placing an item at a specific position with a specific transform and scale, checking for object position violations, etc.

Raw Network API.

This is the abstraction of networking events; it is responsible for sending messages to, and receiving messages from, the server or other clients.

Event System.

This is a higher-level abstraction of networking. It provides methods for manipulating events that can be delivered to the other clients; internally it uses the Raw Network API.

Items System.

This is the abstraction of all items in the city (buildings, trees, roads, areas, the map, etc.). The Items System defines the base and the specific methods and properties of every item (e.g. trees and buildings have different methods and properties). It uses the APIs provided by the Object Raw API and the Event System.
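
The following sketch shows one possible way these four layers could be expressed as C# interfaces and a base class; the member names are assumptions used only to illustrate the responsibilities described above, not the actual code.

```csharp
using UnityEngine;

// Rough sketch of the four layers as interfaces (illustrative names).
public interface IObjectRawApi
{
    // Raw item control on the map: placement, transform, and position-violation checks.
    void Place(GameObject item, Vector3 position, Quaternion rotation, Vector3 scale);
    bool ViolatesPosition(GameObject item, Vector3 position);
}

public interface IRawNetworkApi
{
    // Low-level message transport to the server or directly to other clients.
    void Send(byte[] message);
    event System.Action<byte[]> MessageReceived;
}

public interface IEventSystem
{
    // Higher-level events built on top of IRawNetworkApi and delivered to other clients.
    void Publish(CityEvent e);
    event System.Action<CityEvent> EventReceived;
}

// Base of the Items System; Building, Tree, Road, Area and Map would derive from it
// and use IObjectRawApi and IEventSystem to apply and broadcast their changes.
public abstract class CityItem
{
    public abstract void ApplyEvent(CityEvent e);
}

public class CityEvent { /* serialized change description (see Sect. 4.3) */ }
```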

4.2 Editing System

The Editing System refers to the system for editing and manipulating the city plan, including the sets of items that can be drawn on the map, the selection tool and the manipulation tool.

Figure 6 shows the hierarchy of items. Each kind of item has distinct methods and properties. Map is the overall map (or world); it is also an item, for which we can change the perspective, lighting, etc. An Area can be defined after creating the map, and each area provides an auto-fill method with custom Shapes. Shape is the abstraction of buildings and miscellaneous items such as trees, signs, etc.

Fig. 6. The hierarchy of items

The selection and manipulation tools are also categorized by item kind: we distinguish between Map selection, Area selection, Road selection and Shape selection. This increases the accuracy of picking what we want (e.g., selecting an area versus a shape can otherwise be ambiguous).
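
A minimal sketch of this hierarchy and its per-type selection behaviour is given below, under the assumption that each item kind overrides its own Select method; the class names follow Fig. 6, while the member names are illustrative.

```csharp
using System.Collections.Generic;

// Illustrative sketch of the item hierarchy from Fig. 6.
public abstract class Item
{
    public abstract void Select();     // each item kind has its own selection tool
}

public class Map : Item
{
    // The overall world; perspective and lighting can be changed here.
    public override void Select() { /* map-level selection */ }
}

public class Area : Item
{
    // Defined on top of the map; supports auto-fill with custom Shapes (see Sect. 3.4).
    public void AutoFill(List<Shape> shapes) { /* weighted placement of shapes */ }
    public override void Select() { /* area selection */ }
}

public class Road : Item
{
    public override void Select() { /* road selection */ }
}

public class Shape : Item
{
    // Buildings and miscellaneous things such as trees and signs.
    public override void Select() { /* shape selection */ }
}
```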

4.3 Networking System

There are two kinds of event: DataEvent and ViewingEvent. A DataEvent affects the city data, such as moving an item to a new position, rotating it or changing specific properties. A ViewingEvent is a special event for presentation purposes, sharing one user's perspective with others. We use TCP/IP to transfer DataEvent messages in order to guarantee delivery of committed data. UDP is used to transfer ViewingEvent messages, as they need speed but not necessarily data integrity.

Figure 7 describes the basic event network flow. A DataEvent is sent to the server, where it is checked for conflicts and stored in the database. Meanwhile, a ViewingEvent is sent peer to peer, directly to the other clients, as there is no need to store that information. Both kinds of event, when received, are pushed into a pool of events that each client maintains; these events are then applied sequentially to update the local city data on each client. To check the network status, every second a client sends an acknowledgement message to the server or to the other client, and vice versa. If there is no acknowledgement after 4 s, each side concludes that the other side has lost its connection.

Fig. 7. Event network flow
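
The sketch below illustrates the two channels with standard .NET sockets: DataEvents over a TCP connection to the server and ViewingEvents over UDP directly to a peer. The hosts, ports and string encoding of events are placeholders, not the actual protocol of the system.

```csharp
using System;
using System.Net.Sockets;
using System.Text;

// Hedged sketch of the two event channels described above.
public class EventChannels : IDisposable
{
    private readonly TcpClient tcp;      // DataEvents go to the server
    private readonly UdpClient udp;      // ViewingEvents go directly to peers

    public EventChannels(string serverHost, int tcpPort)
    {
        tcp = new TcpClient(serverHost, tcpPort);
        udp = new UdpClient();
    }

    public void SendDataEvent(string serializedEvent)
    {
        // Reliable delivery: the server checks the event for conflicts and stores it.
        byte[] data = Encoding.UTF8.GetBytes(serializedEvent);
        tcp.GetStream().Write(data, 0, data.Length);
    }

    public void SendViewingEvent(string serializedEvent, string peerHost, int peerPort)
    {
        // Best-effort delivery: perspective sharing does not need to be stored.
        byte[] data = Encoding.UTF8.GetBytes(serializedEvent);
        udp.Send(data, data.Length, peerHost, peerPort);
    }

    // A heartbeat like the one described above would send a small acknowledgement
    // every second and treat the peer as disconnected after about 4 s of silence.

    public void Dispose()
    {
        tcp.Close();
        udp.Close();
    }
}
```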

5 Gesture Recognition and Translation

5.1 Navigating and Controlling Mode

In this section, we introduce the two primary navigation and control modes of the system: view mode and locating mode. In view mode, the user can rotate the model 360 degrees in space and zoom in and out, giving them the ability to view every detail of the model from any angle.

The second mode is locating mode. In this mode, the user can place objects such as buildings, trees, etc. onto the city, as well as adjust their rotations.

5.2 Gesture Recognition and Translation

Gesture Recognition.

The authors define three primary gestures to control and navigate the system: the grab gesture, the circle gesture and the screen-tap gesture. The circle and screen-tap gestures are recognized using the built-in Leap Motion API, while the grab gesture is recognized by calculating the distance between finger joints, as discussed below.

Grab Gesture.

The grab gesture is recognized when the user's hand is closed into a fist. Leap Motion provides robust recognition of the hand and arm structure, including the positions of joints and fingertips. When the hand is closed, the thumb fingertip is close to the joints of the other fingers. Therefore, by traversing the fingers and calculating the distance between the thumb fingertip and the other fingers' joints, we can check whether the user is closing their hand to grab an object.

The grab gesture is translated into pressing and holding the left mouse button (mouse dragging). The user uses this gesture to rotate the city model or to drag objects.
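
A minimal sketch of the grab check is given below, assuming the thumb-tip and finger-joint positions have already been read from the Leap Motion frame and converted to Unity coordinates; the distance threshold is an illustrative value.

```csharp
using UnityEngine;

// Sketch of the fist/grab test based on thumb-to-joint distances.
public static class GrabDetector
{
    // Returns true when the thumb tip is close to at least one joint of every
    // other finger, i.e. the hand is closed into a fist.
    public static bool IsGrabbing(Vector3 thumbTip, Vector3[][] otherFingerJoints,
                                  float threshold = 0.04f /* metres, assumed */)
    {
        foreach (Vector3[] joints in otherFingerJoints)
        {
            bool closeToThisFinger = false;
            foreach (Vector3 joint in joints)
            {
                if (Vector3.Distance(thumbTip, joint) < threshold)
                {
                    closeToThisFinger = true;
                    break;
                }
            }
            if (!closeToThisFinger) return false;
        }
        return true;   // translated as press-and-hold of the left mouse button
    }
}
```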

Circle Gesture.

The circle gesture is recognized using the Leap Motion built-in gesture API [10]. The user triggers this gesture by moving their hand in a circular motion at a sufficiently fast speed.

In view mode, the user can use this gesture to zoom the city model in or out by circling clockwise or counterclockwise, respectively. In locating mode, the user uses this gesture to adjust the rotation of objects: each clockwise circle rotates the object by 90 degrees clockwise, and each counterclockwise circle rotates it by 90 degrees counterclockwise.
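
The following sketch shows how a completed circle gesture could be translated in each mode, assuming the Leap Motion API has already reported whether the circle was clockwise; the field names and zoom step are illustrative placeholders.

```csharp
using UnityEngine;

// Sketch of translating a recognized circle gesture into zoom or rotation.
public class CircleGestureHandler : MonoBehaviour
{
    public enum Mode { View, Locating }
    public Mode CurrentMode;
    public Transform Selected;          // object being placed (locating mode)
    public Camera ViewCamera;           // scene camera (view mode)
    public float ZoomStep = 2f;         // illustrative zoom distance per circle

    public void OnCircleCompleted(bool clockwise)
    {
        if (CurrentMode == Mode.View)
        {
            // Zoom in on clockwise circles, out on counterclockwise ones.
            float direction = clockwise ? 1f : -1f;
            ViewCamera.transform.Translate(Vector3.forward * direction * ZoomStep, Space.Self);
        }
        else if (Selected != null)
        {
            // Rotate the selected object by 90 degrees per completed circle.
            float angle = clockwise ? 90f : -90f;
            Selected.Rotate(Vector3.up, angle, Space.World);
        }
    }
}
```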

Screen-Tap Gesture.

The screen-tap gesture is also provided by the Leap Motion built-in gesture API [10]. It is identified when the tip of a finger pokes forward and then springs back to approximately its original position. The screen-tap gesture is translated into a mouse click; the user uses it to select objects in locating mode or to interact with the application's GUI.
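
One possible way to turn a screen tap into an object selection is to cast a ray from the tap position into the scene, as in the sketch below; how the tap position is obtained from Leap Motion is omitted here, and the names are illustrative rather than the actual implementation.

```csharp
using UnityEngine;

// Sketch: translate a tap at a screen position into picking the item under it.
public class TapSelector : MonoBehaviour
{
    public Camera ViewCamera;

    public GameObject SelectOnTap(Vector2 screenPosition)
    {
        Ray ray = ViewCamera.ScreenPointToRay(screenPosition);
        RaycastHit hit;
        if (Physics.Raycast(ray, out hit, 500f))
        {
            // The hit object would then be handed to the selection tool / Properties box.
            return hit.collider.gameObject;
        }
        return null;   // tap hit empty space
    }
}
```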

6 Scenario of Usage

6.1 Improving Client-Architect Design Development

Figure 8 shows the typical design development procedure of an architect. This process usually takes from weeks to months, since most clients are not sure what they really want from their ideas. Furthermore, in professional practice, this procedure has to go through many layers between the architects and the client.

Fig. 8. Design development procedure

Fig. 9. Design development directly in real time with City Planning

With City Planning, we can shorten the design development time by allowing direct communication and instant modifications between architects and clients; furthermore, we bring an immersive user experience to both the client and the architect. This way, the client gets an intuitive view of the project instead of viewing it through presentation slides and images. This scenario is illustrated in Fig. 9.

6.2 Productive Collaboration Between Architects

Another usage scenario is collaborative architectural design, as discussed above. We are inspired by the idea of multiple software engineers working on the same project. Each member or team works on a different area of the city and commits their work separately. The server checks for conflicts, applies the changes and notifies all of its clients about them; this updates the environment of all users as well. With the integration of VR and gesture control, we hope to build a new kind of working environment that inspires collaboration and innovation between users. Figure 10 illustrates the system as well as the permissions given to each class of user.

Fig. 10. Collaboration between architects and clients
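
As an illustration of the server-side conflict check, the sketch below accepts a commit only if the area it touches is not currently being edited by another user; the ownership rule and the type names are assumptions for this sketch, not the actual implementation.

```csharp
using System.Collections.Generic;

// Hedged sketch: one possible area-ownership rule for rejecting conflicting commits.
public class CommitValidator
{
    private readonly Dictionary<string, string> areaOwner = new Dictionary<string, string>();

    // Claim an area for a user the first time they commit to it.
    public bool TryCommit(string userId, string areaId, out string rejectedBecause)
    {
        string owner;
        if (areaOwner.TryGetValue(areaId, out owner) && owner != userId)
        {
            rejectedBecause = "area " + areaId + " is being edited by " + owner;
            return false;   // conflict: another member owns this area
        }

        areaOwner[areaId] = userId;
        rejectedBecause = null;
        return true;        // accepted: store the change and notify all clients
    }
}
```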

7 Conclusion and Future Work

We believe this system can open a new frontier in modern city zoning and urban planning. By combining virtual reality and motion detection technology, we hope to build a highly productive working environment that inspires collaboration and innovation among users. As future work, the system can be improved to let users inspect the environment from a first-person perspective, or be expanded into an interior design application.