# A Real-Time Multiaperture Omnidirectional Visual Sensor Based on an Interconnected Network of Smart Cameras

Kerem Seyid, *Student Member, IEEE*, Vladan Popovic, *Student Member, IEEE*, Omer Cogal, *Student Member, IEEE*, Abdulkadir Akin, Hossein Afshari, *Student Member, IEEE*, Alexandre Schmid, *Member, IEEE*, and Yusuf Leblebici, *Fellow, IEEE*

*Abstract*—Centralized and multilevel implementations of the Panoptic omnidirectional multiaperture visual system were previously presented by us, relying on the transmission of all camera outputs to a single central processing node for omnidirectional image and video reconstruction. In this paper, a novel distributed and parallel implementation of the omnidirectional vision reconstruction algorithm of the Panoptic system is presented. The parallel approach aims to overcome the scalability problems and memory bandwidth limitations of the centralized approach. The real-time hardware implementation is presented for camera modules with image processing, memory, and interconnectivity features. A methodology is introduced for arranging camera modules with the interconnectivity feature into a target interconnection network topology. A unique custom-made multiple-field-programmable gate array hardware platform is introduced for the implementation of a prototype Panoptic system built as an interconnected network of 49 cameras. A hardware architecture based on the presented hardware platform, enabling the real-time implementation of the blending algorithms, is presented along with the imaging results and resource utilization. The real-time implementation results of the omnivision application on this prototype are demonstrated.

*Index Terms*—Field programmable gate array, image reconstruction, real-time systems, smart cameras.

#### I. INTRODUCTION

A dominant trend in constructing high-end computing systems consists of parallelizing large numbers of processing units.
A similar trend is observed in digital photography, where multiple images of a scene are used to enhance the performance of the capture process. Such a setup is called a multiview imaging system, and it has attracted increasing attention because of the consistently diminishing cost of digital cameras [1]. Novel research themes and applications such as increasing image resolution [2], obtaining high dynamic range images [3], [4], object tracking/recognition, environmental surveillance, industrial inspection, three-dimensional television, and free viewpoint TV [5] are receiving increasing attention.

Manuscript received January 8, 2014; revised June 19, 2014; accepted September 2, 2014. Date of publication September 8, 2014; date of current version February 4, 2015. This work was supported by the Swiss National Science Foundation through the Project entitled An Omnidirectional Multiaperture Visual Sensor under Grant 200021-122037. This paper was recommended by Associate Editor S.-Y. Chien. The authors are with Microelectronic Systems Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland (e-mail: kerem.seyid@epfl.ch). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSVT.2014.2355713

Early systems for capturing multiple views were based on a translating [6] or rotating [7] high-resolution camera for capturing the frames, while rendering was accomplished in postprocessing. The latter concept requires a long acquisition time. These ideas were later extended to dynamic scenes by using a linear array of still cameras [8]. For capturing large data sets, researchers focused on arrays of video cameras. In addition to the synchronization of the cameras, very large data rates present new challenges for the implementation of these systems. The first camera array systems were built only for recording and later postprocessing on personal computers (PC) [9].
Other such systems [10], [11] were built with real-time processing capability for low resolution and low frame rates. A general-purpose camera array system was built at Stanford University [12] with limited local processing at the camera level. This system was developed for recording large amounts of data with intensive offline processing, but not for real-time operation.

Full view or panoramic imaging is likely to find applications in various areas such as autonomous navigation, robotics, telepresence, remote monitoring, and object tracking. Several solutions for acquiring omnidirectional images and their applications have been presented in [13]. In [14] and [15], real-time systems with six cubically arranged cameras are presented. These systems use high-resolution imagers with a low number of cameras. Another six-camera panorama system with high-resolution output is presented in [16]. Google Street View is one example of high resolution combined with an increased number of cameras. The system in [17] is a 360° imaging system comprising 15 cameras with 5 MP resolution, which covers 80% of its surroundings. Lately, a novel system consisting of 44 cameras with 5 MP resolution has been presented in [18], which offers an output resolution over 82 MP with offline processing. Another camera system, which is able to acquire an image frame with more than 1 Gigapixel resolution, has been developed [19]. The system uses a very complex lens system comprising a parallel array of microcameras to acquire the image. Because of the extremely high resolution of the image, it suffers from a very low frame rate, even at low output resolution.

Fig. 1. (a) Network-based Panoptic prototype with five floors and 49 cameras. The sphere diameter of the prototype is $2r_{\odot} = 30$ cm. (b) Top view of the Panoptic Media FPGA-based development platform.

Recently, a method for implementing bioinspired cameras with a hemispherical view has been presented in [20]. However, it is limited to only 180 pixels.
A new approach for creating a multicamera system distributed over a spherical surface is presented in [21]. This new multicamera system is referred to as the Panoptic camera. The Panoptic camera is an omnidirectional imager capable of recording light information from any direction around its center. It is also a polydioptric system where each CMOS camera sensor has a distinct focal plane. The previously built Panoptic system is explained in detail in [22]. That system is implemented with a centralized approach, where data acquisition and data processing reside on the same unit. Fig. 1 shows the new Panoptic Media platform with five floors and 49 cameras. The new prototype and architecture presented in this paper aim to implement the reconstruction algorithm in a parallel and distributed fashion, where image processing applications reside at the camera level.

Detailed explanations of the distributed and parallel implementation of vision reconstruction are given in Section II. A definition of the interconnected network of cameras and a methodology to solve the camera assignment problem are given in Section III. Details of a custom-made field-programmable gate array (FPGA) platform designed to put the concept of an interconnected network of cameras into practice are given in Section IV. Hardware implementation and imaging results are given in Section V. The conclusion and future work are presented in Section VI.

## II. DISTRIBUTED AND PARALLEL IMPLEMENTATION OF OMNIDIRECTIONAL VISION RECONSTRUCTION

Earlier systems such as [22] were implemented with a centralized approach, where a single unit is responsible for data acquisition and data processing. However, in the centralized approach, the number of cameras that can be connected to a single node is limited by input-output (I/O) constraints. Furthermore, for a high number of cameras and high camera resolutions, the memory bandwidth requirement increases significantly.
Therefore, centralized approaches for such systems may create bottlenecks and limit scalability. Parallel approaches aim to overcome these limitations by connecting the camera nodes to several units instead of a single unit, and by distributing the workload evenly and in parallel among these units. Moreover, parallel approaches generally yield a faster solution compared with centralized approaches, which makes it possible to create images of higher resolution than is achievable with the centralized approach. Because of the constraints posed by technology, the distributed and parallel approach can be a feasible solution for the real-time realization of such systems.

In this paper, a novel distributed and parallel implementation of the Panoptic camera is presented. Contrary to the previous systems presented for omnidirectional reconstruction, this novel algorithm and system aim to parallelize the reconstruction algorithm by distributing the tasks among several nodes. The goal is to overcome the physical limitations introduced by the centralized approach. Each individual node is responsible for creating its assigned part of the omnidirectional output frame with the help of the neighboring cameras. In previously published works, a single unit was responsible for creating the whole reconstructed image. The new method allows the number of cameras to be increased without additional burden on the reconstruction algorithm, whereas the load in the centralized approach was strictly dependent on the number of cameras. Furthermore, increasing the number of cameras reduces the amount of work per node, allowing higher resolution and higher frame rate solutions. In this new approach, each node is a single omnidirectional reconstruction system. Thus, the features and capabilities of the individual camera nodes, which were previously used only for capturing data [22], must be enhanced.
Hence, smart camera nodes should possess processing and communication capabilities. The processing capability enables the camera module to perform local processing down to the pixel level, while the communication features permit the exchange of light intensity information among the camera modules. To this aim, a method for creating a regular network topology for the camera network is also presented. Converting an irregular topology into a regular topology generalizes the problem, regardless of the source network topology and the camera arrangement on the physical hemisphere dome.

#### A. Distributed and Parallel Implementation

For a detailed analysis of the omnidirectional vision reconstruction algorithm, the reader is referred to [22]. In the distributed implementation of the omnidirectional algorithm, each *i*th camera must possess the knowledge of its covering directions and the information of the other contributing cameras for all of these directions. This information can be extracted using the internal and external calibration processes of the Panoptic system. After extracting the camera parameters, such as the camera direction vectors and coordinates on the spherical surface, the angle of view of each camera, and so on, each camera can construct its portion of the omnidirectional view independently. For instance, in the nearest neighbor technique, the best viewing camera for each $\vec{\omega}$ is selected. Hence, in this technique, each camera constructs a unique set of observation directions. In the nearest neighbor method, the set of observation directions of each camera has no intersection with those of the other cameras of the Panoptic system. Therefore, camera modules can be limited to observe solely their own set of directions and construct their portions of omnidirectional vision independently from each other.

TABLE I. Clock frequency demand for real-time omnidirectional vision reconstruction ($F_{\text{clkprc}}$), expressed in MHz, implementing the linear interpolation method with $K$ ($n = 4$ cameras per $\vec{\omega}$). The total memory bandwidth demand (MBW) is expressed in Gb/s. The bandwidth values that are supported with the state-of-the-art static random-access memory (SRAM) technology have a green color background and the ones that are not supported have a red color background.

| Configuration | $N_{\text{cam}}$ | $N_{\text{pix}}$ | $I_w$ | $I_h$ | $F_{\text{ps}}$ |
|---|---|---|---|---|---|
| 1 | 20 | 8 | 352 | 288 | 25 |
| 2 | 50 | 16 | 640 | 480 | 30 |
| 3 | 100 | 24 | 1280 | 960 | 60 |

| $N_{\theta}$ | $N_{\phi}$ | $F_{\text{clkprc}}$ (1) | MBW (1) | $F_{\text{clkprc}}$ (2) | MBW (2) | $F_{\text{clkprc}}$ (3) | MBW (3) |
|---|---|---|---|---|---|---|---|
| 64 | 64 | 2 | 0.41 | 6 | 7.38 | 25 | 176.07 |
| 128 | 128 | 8 | 0.42 | 25 | 7.40 | 98 | 177.04 |
| 256 | 256 | 33 | 0.45 | 98 | 7.50 | 393 | 177.32 |
| 512 | 512 | 131 | 0.57 | 393 | 7.88 | 1573 | 178.46 |
| 1024 | 1024 | 524 | 1.08 | 1573 | 9.39 | 6291 | 182.99 |
| 2048 | 2048 | 2097 | 3.09 | 6291 | 15.43 | 25166 | 201.11 |

Fig. 2. High-level model of an interconnected network of cameras. All cameras $C_i$ are connected via the interconnection network and some cameras have direct access to the central unit.

In the linear interpolation technique, similar to the nearest neighbor technique, each camera can still be assigned the task of vision reconstruction for its particular partition. For this purpose, each camera needs to know which other cameras contribute to a particular $\vec{\omega}$ and the intensity values obtained by the contributing cameras.
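The per-camera information described above can be sketched as a lookup table that is built once. The following Python sketch is illustrative only and rests on assumptions that are not taken from the paper: the best viewing camera is taken to be the one whose optical axis has the largest dot product with $\vec{\omega}$, a camera is taken to contribute when $\vec{\omega}$ lies inside a circular field of view of half-angle `half_fov`, and the camera layout and all function names are hypothetical. The actual selection criteria are those defined in [22].

```python
import math

def unit(theta, phi):
    """Unit vector for polar angle theta and azimuth phi (radians)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def owner(omega, axes):
    """Index of the camera whose axis is angularly closest to omega (assumed criterion)."""
    return max(range(len(axes)), key=lambda i: dot(omega, axes[i]))

def contributors(omega, axes, half_fov):
    """Cameras that see omega, assuming a circular field of view."""
    limit = math.cos(half_fov)
    return [i for i, axis in enumerate(axes) if dot(omega, axis) >= limit]

def build_tables(directions, axes, half_fov):
    """Per-camera table: owned direction index -> contributing camera indices."""
    tables = {i: {} for i in range(len(axes))}
    for k, omega in enumerate(directions):
        tables[owner(omega, axes)][k] = contributors(omega, axes, half_fov)
    return tables

# Hypothetical toy layout: one camera at the pole and four cameras at 70 degrees.
axes = [unit(0.0, 0.0)] + [unit(math.radians(70), math.radians(90 * q)) for q in range(4)]
directions = [unit(math.radians(t), math.radians(p))
              for t in (10, 50, 80) for p in range(0, 360, 45)]
tables = build_tables(directions, axes, math.radians(45))
```

Each direction appears in exactly one table, which mirrors the disjoint sets of the nearest neighbor method, while the stored lists give the cameras whose intensity values would have to be fetched for linear interpolation.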
For a constant set of $\vec{\omega}$ directions, these parameters need to be calculated only once and are stored in a local memory for real-time access. The required information can be calculated once by the central unit and uploaded to the local memory of the camera modules. Alternatively, each camera module can calculate its own required information using its own processing features.

In contrast to the previous centralized approaches pertaining to omnidirectional light-field reconstruction algorithms, a novel distributed and parallel algorithm for image reconstruction is implemented. Assuming that all cameras have signal-processing capability and a network-based communication medium that permits data exchange with other cameras and a central unit, as shown in Fig. 2, omnidirectional vision reconstruction algorithms can be realized in a distributed manner among the camera nodes.

#### B. Processing Demands

The architecture proposed in [22] performs the omnidirectional vision reconstruction in a pipeline flow for both the nearest neighbor and the linear interpolation techniques. Assuming that the memory used in the system can sustain consecutive access cycles, $F_{\rm clk}$ for the presented real-time omnidirectional vision reconstruction architecture is derived from

$$N_{\rm acs} \times F_{\rm ps} + T_{\rm lat} \le F_{\rm clk} \tag{1}$$

where $N_{\rm acs}$ is the total number of memory accesses for reconstruction and $F_{\rm ps}$ is the frame rate in frames/s. For approximations, the latency term $T_{\text{lat}}$ in (1) can be neglected. The maximum number of accesses is $N_{\rm acs} = N_{\rm cam} \times N_{\theta} \times N_{\phi}$, where $N_{\theta} \times N_{\phi}$ is the output resolution and $N_{\text{cam}}$ is the number of cameras contributing to image reconstruction in the linear interpolation method. The worst case occurs when all cameras contribute in all directions for the linear interpolation technique.
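As a minimal sketch of how (1) translates into numbers, the following Python snippet evaluates the worst-case access count and the resulting lower bound on the clock frequency with the latency term neglected. The function names are hypothetical and chosen for illustration only.

```python
def memory_accesses(n_cam, n_theta, n_phi):
    """Worst-case accesses per frame: every camera contributes to every direction."""
    return n_cam * n_theta * n_phi

def min_clock_hz(n_cam, n_theta, n_phi, fps):
    """Lower bound on F_clk from (1), with the latency term T_lat neglected."""
    return memory_accesses(n_cam, n_theta, n_phi) * fps

if __name__ == "__main__":
    # Camera counts and frame rates as listed in Table I.
    for n_cam, fps in ((20, 25), (50, 30), (100, 60)):
        for n in (64, 512, 2048):
            mhz = min_clock_hz(n_cam, n, n, fps) / 1e6
            print(f"N_cam={n_cam:3d}, {n}x{n} at {fps} frames/s: {mhz:.0f} MHz")
```

For 20 cameras, a 64 × 64 output and 25 frames/s, the bound evaluates to 2 048 000 accesses per second, i.e., about 2 MHz.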
Assuming that all data flow occurs in the data processing module with an $N_{\text{pix}}$-bit representation, the processing demands $F_{\text{clkprc}}$ for the proposed centralized system are stated in Table I for three combinations of $N_{\text{cam}}$, $N_{\text{pix}}$, camera image frame width $(I_w)$, camera image frame height $(I_h)$, and frame rate $(F_{\text{ps}})$, and for different reconstruction resolutions $N_{\theta}$, $N_{\phi}$. The memory requirements of the whole system, comprising the capture of image frames originating from the cameras and the omnidirectional vision reconstruction, are also presented. The aggregate of these two demands translates into the memory bandwidth requirement of the system using the multiplying factor $N_{\text{pix}}$. The bandwidth is calculated as

$$(N_{\text{cam}} I_w I_h + K N_\theta N_\phi) F_{\text{ps}} N_{\text{pix}} \tag{2}$$

where $K$ is the number of contributing cameras per $\vec{\omega}$ direction. The current state-of-the-art technology for SRAM memories is the Quad Data Rate II SRAM (QDRII-SRAM) with up to 900 million random transactions per second [23]. The maximum data width of a single QDRII SRAM memory is 36 b [23]. Hence, the maximum bandwidth of a state-of-the-art SRAM memory is 26 Gb/s. The bandwidth values that are supported with the current state-of-the-art SRAM technology have a green color background and the ones that are not supported have a red color background in Table I. The table shows that in the centralized approach, increasing the camera resolution and output resolution creates memory bandwidth problems. Therefore, a distributed and parallel approach is suitable for coping with increasing memory bandwidth and signal-processing frequency demands.

Fig. 3. Discretized sphere surface with $N_{\theta}=16$ latitudes and $N_{\phi}=16$ longitudes (256 pixels). (a) Equiangular and (b) equal density pixelization.

#### C. Effects of Pixelization Schemes

The pixel gridding scheme for the omnivision application affects the load imposed on each camera module of the Panoptic system when implemented distributively. The pixel directions $\vec{\omega}$ shown in Fig. 3(a) derive from an equiangular segmentation of the longitude and latitude coordinates of a unit sphere into $N_{\phi}$ and $N_{\theta}$ segments, respectively. This pixelization enables a rectangular presentation of the reconstructed image suitable for ordinary displays, but results in an unequal contribution of the Panoptic cameras. In the equiangular pixelization scheme, the density of the pixel directions close to the poles of the sphere is higher than at the equator. Hence, the cameras positioned closer to the poles of the sphere contribute to more pixels than the other cameras of the system. The equiangular pixelization derives mathematically from

$$\begin{aligned} \phi_{\omega}(i) &= \frac{2\pi}{N_{\phi}} \times i, & 0 \le i < N_{\phi} \\ \theta_{\omega}(j) &= \frac{\pi}{2N_{\theta}} \times \left(j + \frac{1}{2}\right), & 0 \le j < N_{\theta}. \end{aligned} \tag{3}$$

The equiangular pixel gridding scheme shown in Fig. 3(a) does not yield an equal number of $\vec{\omega}$ pixel directions for each camera to construct. A panoramic reconstruction of $N_{\theta} \times N_{\phi} = 256 \times 1024$ with equiangular gridding can be seen in Fig. 4(a). For the nearest neighbor interpolation, the number of $\vec{\omega}$ pixel directions for which each camera is responsible can be seen in Fig. 4(c). For the nearest neighbor interpolation of the distributed and parallel approach, the computational load is not equally distributed among the camera modules. As shown in Fig. 4(c), camera number one, which is placed at the north pole of the system, is responsible for more than 10% of the $\vec{\omega}$ pixel directions that need to be reconstructed.
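The uneven density of the equiangular grid in (3) can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than part of the presented system: $\theta$ is taken as the angle measured from the pole, each ring $j$ is assumed to cover the band between $j\pi/(2N_{\theta})$ and $(j+1)\pi/(2N_{\theta})$ so that (3) gives the ring centre, and the function names are hypothetical.

```python
import math

def equiangular_directions(n_theta, n_phi):
    """Angles of the pixel directions according to (3)."""
    thetas = [math.pi / (2 * n_theta) * (j + 0.5) for j in range(n_theta)]
    phis = [2 * math.pi / n_phi * i for i in range(n_phi)]
    return thetas, phis

def pixel_solid_angle(j, n_theta, n_phi):
    """Solid angle of one pixel in ring j (assumed band: [j, j + 1] * pi / (2 * n_theta))."""
    step = math.pi / (2 * n_theta)
    return (2 * math.pi / n_phi) * (math.cos(j * step) - math.cos((j + 1) * step))

thetas, phis = equiangular_directions(16, 16)
areas = [pixel_solid_angle(j, 16, 16) for j in range(16)]

# A pixel near the pole covers a much smaller solid angle than one near the
# equator, so the pixel directions are denser around the pole.
area_ratio = areas[-1] / areas[0]
```

Because every ring holds the same number of pixels while the rings near the pole cover less of the sphere, a camera looking towards the pole ends up with more pixel directions inside its neighborhood.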
For the linear interpolation technique, the number of $\vec{\omega}$ pixel directions to which each camera contributes can be seen in Fig. 4(d). The workload is not distributed evenly among the cameras, which is not suitable for a parallel implementation of the omnidirectional vision reconstruction algorithm.

An equal density pixelization scheme, shown in Fig. 3(b), which results in an approximately even contribution of the cameras, is devised for the Panoptic system. An example panoramic reconstruction of $N_{\theta} \times N_{\phi} = 256 \times 1024$ with equal density gridding can be seen in Fig. 4(b). The scheme is based on enforcing a constant number of pixels per area, as expressed in (4). Compared with the equiangular pixelization, the change is observed in the latitude angles

$$\frac{N_{\phi} \times j}{\int_0^{2\pi} d\phi \int_0^{\theta_{\omega}(j)} \sin\theta \, d\theta} = \frac{N_{\phi} \times N_{\theta}}{2\pi}.\tag{4}$$

Evaluating the integral as $2\pi\,(1 - \cos\theta_{\omega}(j))$ and solving (4) for the latitude gives $\theta_{\omega}(j) = \arccos(1 - j/N_{\theta})$.

For the nearest neighbor interpolation, the number of $\vec{\omega}$ pixel directions for which each camera is responsible in equal-density pixelization can be seen in Fig. 4(e). In contrast to Fig. 4(c), the number of $\vec{\omega}$ directions for which each camera is responsible is evenly distributed among the 49 cameras of the system. For the linear interpolation technique, the number of $\vec{\omega}$ directions to which each camera contributes can be seen in Fig. 4(f). The workload of the Panoptic system is distributed more evenly among the cameras than with the equiangular pixelization, which makes the equal-density pixelization method more suitable for parallel implementation.

#### D. Intercamera Data Exchange

In the centralized approach, pixel intensity values were saved in a single unit. Therefore, the contributing pixel intensity values are easy to obtain while constructing the pixel intensity value of a particular $\vec{\omega}$ direction.
However, in the parallel and distributed approach, each camera is responsible for constructing its own part of the omnidirectional panorama. Assuming that the individual cameras have local processing capability, the only missing variables for constructing the light field are the light intensity values obtained by the other cameras. This creates the need for a communication scheme among the camera modules.

#### III. INTERCONNECTED NETWORK OF CAMERAS

A general purpose message-passing interconnection network is a programmable system capable of exchanging data between terminals. The system shown in Fig. 2 comprises $N$ terminals, $C_1$ through $C_N$, connected to a network. As an example, when terminal $C_2$ wishes to exchange data with terminal $C_5$, $C_2$ sends a message containing the data to the network and the network delivers the message to $C_5$. The terminals $C_i$ represent the camera nodes with features in addition to basic imaging.

Having a distributed camera system does not imply the omission of a central unit. For example, a central unit is required for the cameras to send their processed information for the purpose of display. It is preferable that all cameras also have direct access to a central unit. However, this feature is not feasible or optimal in most cases. A central unit may not have enough ports to interface with all the cameras of the system. When all the cameras are connected to the central unit with distinct interfaces and the bandwidth of these connections is not fully used, such an arrangement may cause inefficient usage of resources. Hence, it is more efficient to provide some of the cameras with direct access to the central unit and to share these connections with the cameras that do not have a direct interface to the central unit. The availability of an interconnection network permits the utilization of this strategy. The latter concept is depicted for the Panoptic system with $N$ cameras in Fig. 2.

Fig. 4. Seating place in the Swiss Federal Institute of Technology in Lausanne (EPFL, BC01). Panoramic construction with a pixel resolution of $N_{\phi} \times N_{\theta} = 1024 \times 256$ (a) using the equiangular pixelization, (b) using equal-density pixelization. (c)–(f) Number of pixels each camera is responsible for with the nearest neighbor and linear interpolation methods, with different pixelization schemes.

In multicamera applications, information exchange mostly takes place among neighboring cameras. Thus, during the creation of an interconnection network, the neighborhood relations of the camera modules should be preserved as much as possible. To obtain the neighborhood relation graph of the Panoptic system, the surface of the Panoptic device hemisphere is partitioned into a set of cells centered on the camera locations. Each cell is defined as the set of all points on the hemisphere that are closer to the camera location contained in the cell than to any other camera position. The boundaries of the cells are formed by the points equidistant to the two nearest sites, and the cell corners (or nodes) by the points equidistant to at least three nearest sites. This particular partitioning falls into the category of a well-established geometry concept known as the Voronoi diagram (or Voronoi tessellation). The Voronoi diagram of a Panoptic system with five floors and 49 cameras can be seen in Fig. 5(a). The geometrical neighborhood relation of the 49 cameras extracted from the Voronoi diagram is shown in Fig. 5(b).

Fig. 5. (a) Top view of the Voronoi diagram of a five-floor Panoptic system containing 49 camera locations. (b) Planar graph extracted from the Voronoi diagram.

#### A. Camera Assignment Problem

The neighborhood relation graph is used for creating the interconnection medium among the cameras.
However, in most systems, this irregular graph-based topology is hard to implement and control at the hardware level. A regular graph-based topology can be used to simplify the implementation of the interconnection network. Instead of creating the irregular graph-based network shown in Fig. 5(b), a regular graph-based $7 \times 7$ mesh topology is chosen to realize the interconnected network of cameras. A regularized network topology is relatively simple to implement and control. It is scalable, and nodes are easy to add or remove. Flow control mechanisms and packet structures are easier to construct at the hardware level. Furthermore, it generalizes the problem regardless of the source network topology and the camera arrangement on the physical hemisphere dome.

This assignment task can be cast as a facility allocation problem known as the quadratic assignment problem (QAP). The QAP models the following real-life problem: in a graph-based topology, a distance is specified for each pair of locations, and a weight or flow (e.g., the amount of supplies transported between two facilities) is specified for each pair of facilities. The problem is to assign all facilities to different locations with the goal of minimizing the sum of the distances multiplied by the corresponding flows.

Fig. 6. (a) Assigned $7 \times 7$ mesh topology interconnected network. (b) $7 \times 7$ mesh topology with seven vertex *p*-centers.

A planar graph representing the neighborhood of the cameras is extracted as shown in Fig. 5(b), where the nodes of the extracted graph represent the cameras and its edges correspond to the immediate neighborhood relations between cameras. Hence, in this graph, two nodes are connected if their respective cameras are geometrical neighbors. The adjacency matrix of this graph can be used as the flow matrix of the QAP.
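The QAP objective described above can be written down compactly. The following Python sketch evaluates the cost of an assignment on a mesh, using the hop (Manhattan) distance between mesh slots as the distance matrix and a 0/1 adjacency matrix as the flow matrix. The toy instance (four cameras in a ring mapped onto a $2 \times 2$ mesh), the row-major slot numbering, and the exhaustive search are assumptions for illustration only; brute force is feasible only for tiny instances and is not the solver used for the 49-camera system.

```python
from itertools import permutations

def mesh_distance(a, b, width):
    """Hop (Manhattan) distance between mesh slots a and b, numbered row-major."""
    row_a, col_a = divmod(a, width)
    row_b, col_b = divmod(b, width)
    return abs(row_a - row_b) + abs(col_a - col_b)

def qap_cost(assignment, flow, width):
    """Sum over camera pairs of flow[i][j] times the distance between their slots.

    assignment[i] is the mesh slot given to camera i.
    """
    n = len(assignment)
    return sum(flow[i][j] * mesh_distance(assignment[i], assignment[j], width)
               for i in range(n) for j in range(i + 1, n))

# Hypothetical neighborhood graph: four cameras forming a ring 0-1-2-3-0.
flow = [[0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0]]

naive = (0, 1, 2, 3)  # camera i placed in slot i
best = min(permutations(range(4)), key=lambda p: qap_cost(p, flow, 2))
```

In this toy case the naive placement costs 6 hops in total because two neighboring pairs end up on mesh diagonals, whereas the best placement costs 4, with every pair of neighboring cameras exactly one hop apart.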
The QAP is an NP-hard problem, so there is no known algorithm for solving it in polynomial time, and even relatively small problem instances may require long computation times. Several heuristic solution methods exist for the assignment problem. Among the different options, the sparse version of the greedy randomized adaptive search procedure algorithm [24] has given the best result for the QAP in terms of minimizing the cost function. The camera numbers shown in Fig. 5(b) are represented on the mesh graph shown in Fig. 6(a). The assignment allocates the cameras such that geometrically neighboring cameras are never more than three hops away from each other in the new topology.

#### B. Central Unit Access

To decide which cameras will have direct access to the central unit, the problem to solve is which $p$ candidate cameras shall be selected so that the rest of the cameras can access the central unit with a minimum number of hops. This feature is desired for reducing the access time between the central unit and any camera of the interconnected network, assuming sufficient channel bandwidth is available. This problem can also be mapped onto a facility location problem known as the vertex *p*-center problem. The basic *p*-center problem consists of locating $p$ facilities and assigning clients to them so as to minimize the maximum distance between a client and the facility it is assigned to. This problem is also known to be NP-hard [25]. To distribute the load of the 49 cameras equally, the value of $p$ is chosen as seven. As an example, a vertex seven-center problem has been solved for the mesh graph topology shown in Fig. 6(b), assuming that each facility can support up to seven clients. The problem is solved using an exact algorithm for the capacitated vertex *p*-center problem [26]. The solution is shown in Fig. 6(b). All the cameras acting as a facility (i.e., with access to the central unit) are shown with a bold edge. The cameras belonging to the same facility are indicated with the same color. All cameras are at most two hops away from their supporting facility camera. This strategy aims to minimize the network load caused by the transmission of central unit access packets.

Fig. 7. Average packet latency ($T_c$) versus average throughput ($\lambda$) (i.e., packet injection rate) graphs. (a) Latency versus throughput for routers with different numbers of vertex *p*-centers. (b) Latency versus throughput for routers of a $7 \times 7$ mesh network with QAP-assigned camera locations, comparing the number of virtual channels. (c) Comparison between the QAP camera assignment and random camera assignments. Several different random assignments have been evaluated and the average latency values obtained through BookSim simulations.

#### C. Verification

The designed interconnection network is simulated under real or close-to-real conditions. The BookSim simulator [27] is used for the performance analysis of the interconnection network of cameras. BookSim is a C++ based cycle-accurate interconnection network simulator. The simulator is extended to support custom-defined traffic patterns, which are configured by a custom text file. This extension was made to support any traffic pattern for the target networks under test. A MATLAB-based routine is developed to simulate different injection rates with several different test patterns. Optimal parameters for the router unit, such as the number of virtual channels, the buffer size, and so on, are extracted in terms of latency $(T_c)$ versus throughput $(\lambda)$ with a custom-created Panoptic traffic pattern. The injection rate indicates how frequently a new packet is injected into the network, while the latency indicates how many clock cycles it takes for a network packet to reach the destination node. All injection rates are normalized to the channel bandwidth and the latency is expressed in number of cycles.

Fig. 7(a) shows the latency versus injection rate for different numbers of vertex *p*-centers selected for direct access to the central unit. It is observed that for the nearest neighbor traffic pattern, the demands on the interconnection network tend to decrease as the number of vertex *p*-centers grows. As the number of vertex *p*-centers grows, the traffic becomes more balanced and localized.

The $7 \times 7$ mesh network is also simulated under the linear interpolation traffic pattern. The number of vertex *p*-centers is chosen as seven. The assignment provided by the QAP approach and shown in Fig. 6(b) is used. Fig. 7(b) shows the latency versus throughput for different numbers of virtual channels. The results are given for throughput values of $\lambda < 0.4$, as the injection rate is not expected to exceed 0.4.

For the purpose of comparison, a set of average packet latency versus average throughput graphs under the linear interpolation traffic pattern is presented for a $7 \times 7$ mesh network with random and QAP-assigned camera locations. Fig. 7(c) compares the latency versus throughput of randomly assigned camera locations with the QAP assignment results. A knee point is observed in all the illustrated latency versus throughput graphs, beyond which the packet latency increases faster with respect to the packet injection rate. This knee point is observed at $\lambda = 0.2$ for randomly assigned camera locations with one virtual channel, and at $\lambda \in [0.4, 0.5]$ for routers with more than one virtual channel. By contrast, the corresponding knee points are observed at $\lambda = 0.4$ and $\lambda \in [0.6, 0.7]$, respectively, for the QAP-assigned camera locations. Hence, the QAP-based assignment of the camera modules to the nodes in the same network topology provides a 50%-100% improvement in bandwidth utilization.
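The quoted range can be checked with a short calculation. As an assumption made for illustration, the lower ends of the reported knee-point intervals are compared; other points within the intervals would give somewhat different ratios.

$$\frac{0.4 - 0.2}{0.2} = 100\% \ \text{(one virtual channel)}, \qquad \frac{0.6 - 0.4}{0.4} = 50\% \ \text{(more than one virtual channel)}.$$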
The QAP assignment induces a localized traffic pattern and avoids longer packet hops throughout the network. Unlike the random assignment, the QAP assignment also provides a more balanced load distribution over the channels of the interconnection network. The obtained results are useful for the real system implementation when resource utilization and cost are of concern. The simulations show that the Panoptic traffic pattern can be supported at the expected injection rate and latency. The extracted parameters are utilized during the implementation of the router mechanism on an FPGA platform. For the FPGA implementation, an open-source register-transfer-level Network-on-Chip router provided by [27] is utilized.

#### IV. PANOPTIC MEDIA PLATFORM

A custom-made FPGA platform is designed for the implementation of the concept of an interconnected network of cameras. The developed platform is referred to as the Panoptic Media. A Panoptic system comprising 49 cameras is interfaced to this platform. The design and implementation of the parallel and distributed approach of the omnidirectional vision reconstruction algorithm of the Panoptic camera is elaborated for the Panoptic Media platform.

The Panoptic Media Board (PMB) is an FPGA-based development platform built on a 14-layer printed circuit board (PCB), designed and realized at EPFL. The PMB includes eight Xilinx XC5VLX110 Virtex-5 FPGAs. One FPGA is targeted for the implementation of the central unit, and the other seven are used for emulating an interconnected network of cameras. The FPGA hosting the central unit is referred to as the central/master FPGA, and the FPGAs hosting cameras are referred to as the slave FPGAs. A top view of the designed platform is shown in Fig. 1(b).

#### A. Central FPGA

The central FPGA is interfaced with a 16 MB Flash memory, which is intended for offline data storage and online data read access. Furthermore, it is interfaced with a 32 MB SDRAM memory, which is used as video memory.
To support the processing in the central FPGA, 22 MB ZBT-SRAM memories are also interfaced with the central FPGA. A USB-2.0 device is used as the USB device interface of the Panoptic Media.

#### B. Slave FPGAs

Each slave FPGA hosts seven parallel digital interfaces. Each parallel digital interface comprises a Samtec high-speed flat flexible wire connector, which attaches to a camera module through a flat flexible wire connection. Each slave FPGA hosts seven asynchronous SRAM (ASRAM) units, each of which has a capacity of 2 MB and a minimum read/write access time of 100 ns. The ASRAM memories are targeted for frame buffering of the video streams originating from the camera modules, and for static and dynamic data storage. A Xilinx 32 MB Flash memory is attached to each FPGA for automatic programming of that FPGA after the board is powered up.

#### C. Inter-FPGA Communication

Each FPGA has 12 sets of 24-b bus connections. Every two sets of 24-b bus connections are bundled to form one physical channel port, so that each FPGA contains six physical channel ports. Within a port, the direction of one bus connection is chosen as outward while the other one is selected as inward. The physical channel ports of the FPGAs can, however, contain multiple logical channels. For the presented partitioning scheme of a $7 \times 7$ mesh interconnected network among the slave FPGAs of the PMB, it is sufficient to have a maximum of four logical channels within a physical channel port. Logical channels are realizable through time multiplexing while operating at higher frequency rates within a single physical channel.

#### V. HARDWARE IMPLEMENTATION AND RESULTS

#### A. Central FPGA Hardware Implementation

The central FPGA hosts the central unit of the system. It is in charge of initialization, synchronization among the FPGAs and camera nodes, configuration and control of the camera router nodes, display, and external host communication of the system.

Fig. 8. SoC architecture of the slave FPGAs.
The central unit can communicate with all camera router nodes of the interconnection network through packet transmission and reception. Two types of packets exist in the system: control packets and data packets. Control packets are used for configuring camera router modules and for monitoring and status checks. Data packets contain image information that is used for display or for transfer to an external host. Each packet type and subtype is identified by a specific packet ID.

Each data packet contains pixel information of an image frame. Data packets can be sent by all cameras simultaneously; therefore, the pixels of an image may be received by the central unit in a shuffled order. Hence, all data packets pertaining to an image frame are first stored temporarily in the ZBT-SRAM by the Router Control (RCTRL) IP. The shuffled arrival order implies random write accesses to memory, which is why the ZBT-SRAM is chosen for the temporary storage of the pixel information carried by the data packets. When a full frame has been received, the RCTRL IP transfers it to the SDRAM. The SDRAM is used as the video memory for external display interfaces such as monitors or projectors.

#### B. Slave FPGA Hardware Implementation

The role of a slave FPGA is to emulate a portion of a $7 \times 7$ mesh interconnected network of cameras. The system-on-a-chip (SoC) architecture of a slave FPGA is shown in Fig. 8. Each slave FPGA is responsible for seven imagers and hosts seven camera modules and seven ASRAM memories. The external physical channel port synchronization is conducted with the aid of the channel synchronization (CHSYNC) IP, which is similar to the one used in the central FPGA. Each imager is interfaced to a custom-designed smart camera (SCAM) IP. The SCAM IP emulates a camera module with router connectivity, memory, and application-processing units. The internal blocks of the custom-designed SCAM IP are shown in Fig. 9. Each SCAM IP interfaces with a custom external memory controller (CEMC) and accesses its ASRAM through this CEMC IP.

Fig. 9. Internal blocks of the smart camera IP used in the slave FPGAs.

The SCAM IP comprises five sub-blocks. The Imager Interface sub-block is responsible for image acquisition and transfers the video stream generated by the imager to the ASRAM memory. The Application sub-block is designed to perform image processing applications. It is responsible for creating network demand packets for pixel values obtained by the other contributing cameras and for performing the first and second levels of interpolation of the reconstruction algorithm. The Application sub-block communicates with the central unit and the other SCAM IPs in the Panoptic system through the Router sub-block. The Router sub-block comprises five ports (north, south, east, west, and an I/O port for entering or leaving the network). The main aim of the Router sub-blocks is to create the communication medium among the SCAMs. The Request Acknowledge sub-block responds to the incoming demand packets from other SCAM IPs. It creates response packets that contain the intensity values and coefficients needed for the second level of interpolation. The Register Bank sub-block is used for the IP's mode configuration, monitoring, and status checks. It can be reached by the central unit via the interconnection network to perform overall control of the system.

Forty-nine SCAMs distributed over seven FPGAs work in parallel for omnidirectional vision reconstruction. Pixel intensity values are exchanged among the modules through the interconnection network, and each camera constructs the portion of the omnidirectional view for which it is responsible. The central unit is responsible for obtaining all reconstructed pixels and displaying them.

Fig. 10. Omnidirectional snapshot of resolution $1024 \times 768$ for the linear interpolation technique.

Fig. 11. Omnidirectional snapshot of resolution $1024 \times 256$ for the nearest neighbor interpolation technique plus VGA ($640 \times 480$) display of a single selected camera.

TABLE II. CENTRAL FPGA DEVICE UTILIZATION SUMMARY

| Resources | Used | Available | Utilization |
|-----------------|-------|-----------|-------------|
| Occupied Slices | 9463 | 17280 | 54% |
| Slice Registers | 20629 | 69120 | 29% |
| BlockRAM/FIFO | 93 | 128 | 72% |
| DSP48Es | 3 | 64 | 4% |

TABLE III. SLAVE FPGAs DEVICE UTILIZATION SUMMARY FOR SLAVE FPGA 1 FOR THE NEAREST NEIGHBOR IMPLEMENTATION

| Resources | Used | Available | Utilization |
|--------------------|-------|-----------|-------------|
| S1 Occupied Slices | 13590 | 17280 | 78% |
| S1 Slice Registers | 32326 | 69120 | 47% |
| S1 BlockRAM/FIFO | 89 | 128 | 69% |
| S1 DSP48Es | 26 | 64 | 40% |

TABLE IV. SLAVE FPGAs DEVICE UTILIZATION SUMMARY FOR SLAVE FPGA 1 FOR THE LINEAR INTERPOLATION IMPLEMENTATION

| Resources | Used | Available | Utilization |
|--------------------|-------|-----------|-------------|
| S1 Occupied Slices | 14793 | 17280 | 85% |
| S1 Slice Registers | 34916 | 69120 | 50% |
| S1 BlockRAM/FIFO | 89 | 128 | 72% |
| S1 DSP48Es | 61 | 64 | 95% |

#### C. Implementation Results

A Panoptic multicamera hemisphere of diameter 30 cm is built by stacking circular PCB rings on top of each other, as shown in Fig. 1(a). Each circular PCB ring corresponds to one floor of the five-floor Panoptic system. Forty-nine cameras are placed on the circular PCBs. If the integrity check is passed, the central FPGA's processor triggers an interrupt event for all slave FPGAs. Upon receiving the interrupt, the MicroBlaze processors of the slave FPGAs start programming their interfaced camera modules simultaneously. After programming the camera modules, they initialize the ASRAM memories. Once the ASRAM memories are initialized, the system starts to display the omnidirectional view in real time.

The PMB system was found to support the real-time operation of a $7 \times 7$ interconnected network of cameras at a frame rate of 25 frames/s, providing an omnidirectional application with a resolution of $N_{\theta} \times N_{\phi} = 1024 \times 768$ and a linear interpolation method with $K(n=4;\vec{\omega})$ enforcement. An example screenshot from a video can be seen in Fig. 10. As a second operation mode, in addition to the extended graphics array (XGA) output resolution, the Panoptic Media is also shown to support a $256 \times 1024$ pixel output resolution using the nearest neighbor method. During the display of the $256 \times 1024$ resolution omnidirectional video, the output of a chosen camera in VGA resolution can also be displayed below the $360^{\circ}$ omnidirectional output.
A screenshot from a video of the latter mode can be seen in Fig. 11, and an example video can be found in [28].

The slave FPGAs are chosen to operate at a clock frequency of 108 MHz, whereas the central FPGA operates at 125 MHz. The 108 MHz clock was chosen to support the maximum clock frequency of the camera module (i.e., 27 MHz) through a simple divide-by-four operation at the slave FPGAs. The 125 MHz clock frequency of the central FPGA was chosen to support the nominal maximum bandwidth of the 1 Gb/s Ethernet link available on the PMB. Utilization summaries of the central FPGA and of slave FPGA 1 are given in Tables II–IV.

To satisfy the real-time requirements of the system, the PMB platform should construct an omnidirectional output in less than 40 ms, which corresponds to 25 frames/s. The maximum output resolution equals 1 Mpixel, which is dictated by the reordering memory capacity. The next-generation omnidirectional imaging platform that is currently under design will be capable of displaying 4K video at 30 frames/s. The average power consumption of the PMB, when the FPGAs are in the omnidirectional vision reconstruction mode with XGA resolution, was measured as 67.2 W.

#### VI. CONCLUSION

In this paper, a novel parallel and distributed technique for the omnidirectional vision reconstruction of the Panoptic camera is presented, on the basis of an interconnected network of multiple image sensors. A robust methodology is shown for the assignment of individual cameras (imagers) onto regular network nodes, and for the selection of the cameras for central unit communication. A unique custom-made FPGA-based platform called the PMB was designed and realized for the emulation of a $7 \times 7$ mesh interconnected network of smart cameras. Moreover, the system-level design of the PMB and the SoC architecture of its FPGAs were presented.
It is shown that the PMB prototype provides a real-time 25 frames/s omnidirectional video output in XGA resolution. The distributed and parallel implementation of the multiband blending technique [29] is considered for the next real-time application deployment of the Panoptic device. The application area of the PMB is not limited to omnidirectional reconstruction; the prototype is also applicable to depth-map estimation, super-resolution, and multiview research topics.

#### REFERENCES

- [1] A. Kubota, A. Smolic, M. Magnor, M. Tanimoto, T. Chen, and C. Zhang, "Multiview imaging and 3DTV," *IEEE Signal Process. Mag.*, vol. 24, no. 6, pp. 10–21, Nov. 2007.
- [2] R. Szeliski, "Image mosaicing for tele-reality applications," in *Proc. 2nd IEEE Workshop Appl. Comput. Vis.*, Dec. 1994, pp. 44–53.
- [3] S. Mann and R. W. Picard, "Being 'undigital' with digital cameras: Extending dynamic range by combining differently exposed pictures," in *Proc. IS&T*, 1995, pp. 442–448.
- [4] P. E. Debevec and J. Malik, "Recovering high dynamic range radiance maps from photographs," in *Proc. 24th Annu. Conf. Comput. Graph. Interactive Techn.*, New York, NY, USA, 1997, pp. 369–378.
- [5] M. Tanimoto, M. P. Tehrani, T. Fujii, and T. Yendo, "Free-viewpoint TV," *IEEE Signal Process. Mag.*, vol. 28, no. 1, pp. 67–76, Jan. 2011.
- [6] M. Levoy and P. Hanrahan, "Light field rendering," in *Proc. SIGGRAPH 23rd Annu. Conf. Comput. Graph. Interactive Techn.*, 1996, pp. 31–42.
- [7] Y. Y. Schechner and S. K. Nayar, "Generalized mosaicing," in *Proc. 8th IEEE Int. Conf. Comput. Vis. (ICCV)*, vol. 1. Jul. 2001, pp. 17–24.
- [8] D. Taylor, "Virtual camera movement: The way of the future?" *Amer. Cinematographer*, vol. 77, no. 9, pp. 93–100, 1996.
- [9] P. Rander, P. J. Narayanan, and T. Kanade, "Virtualized reality: Constructing time-varying virtual worlds from real world events," in *Proc. IEEE Vis.*, Oct. 1997, pp. 277–284.
- [10] C. Zhang and T. Chen, "A self-reconfigurable camera array," in *Proc. Eurograph. Symp. Rendering*, 2004, pp. 243–254.
- [11] J. C. Yang, M. Everett, C. Buehler, and L. McMillan, "A real-time distributed light field camera," in *Proc. 13th Eurograph. Workshop Rendering*, Aire-la-Ville, Switzerland, 2002, pp. 77–86.
- [12] B. Wilburn *et al.*, "High performance imaging using large camera arrays," *ACM Trans. Graph.*, vol. 24, pp. 765–776, Jul. 2005.
- [13] Y. Yagi, "Omnidirectional sensing and its applications," *IEICE Trans. Inf. Syst.*, vol. E82-D, no. 3, pp. 568–579, Mar. 1999.
- [14] Y. Xu, Q. Zhou, L. Gong, M. Zhu, X. Ding, and R. K. F. Teng, "High-speed simultaneous image distortion correction transformations for a multicamera cylindrical panorama real-time video system using FPGA," *IEEE Trans. Circuits Syst. Video Technol.*, vol. 24, no. 6, pp. 1061–1069, Jun. 2013.
- [15] Point Grey Research Inc. Ladybug. [Online]. Available: http://ww2.ptgrey.comn, accessed Dec. 15, 2013.
- [16] O. Schreer, I. Feldmann, C. Weissig, P. Kauff, and R. Schafer, "Ultrahigh-resolution panoramic imaging for format-agnostic video production," *Proc. IEEE*, vol. 101, no. 1, pp. 99–114, Jan. 2013.
- [17] D. Anguelov *et al.*, "Google street view: Capturing the world at street level," *Computer*, vol. 43, no. 6, pp. 32–38, Jun. 2010.
- [18] O. Cogal *et al.*, "A new omni-directional multi-camera system for high resolution surveillance," *Proc. SPIE Sens. Technol. Appl.*, vol. 9120, 2014.
- [19] D. J. Brady *et al.*, "Multiscale gigapixel photography," *Nature*, vol. 486, no. 7403, pp. 386–389, 2012.
- [20] Y. M. Song *et al.*, "Digital cameras with designs inspired by the arthropod eye," *Nature*, vol. 497, no. 7447, pp. 95–99, 2013.
- [21] H. Afshari, L. Jacques, L. Bagnato, A. Schmid, P. Vandergheynst, and Y. Leblebici, "The PANOPTIC camera: A plenoptic sensor with real-time omnidirectional capability," *J. Signal Process. Syst. Signal, Image, Video Technol.*, vol. 70, no. 3, pp. 305–328, 2013.
- [22] H. Afshari, V. Popovic, T. Tasci, A. Schmid, and Y. Leblebici, "A spherical multi-camera system with real-time omnidirectional video acquisition capability," *IEEE Trans. Consum. Electron.*, vol. 58, no. 4, pp. 1110–1118, Nov. 2012.
- [23] QDR Consortium. (Sep. 2013). *Quad Data Rate SRAM*. [Online]. Available: http://www.qdrconsortium.org/
- [24] Y. Li, P. M. Pardalos, and M. G. C. Resende, "A greedy randomized adaptive search procedure for the quadratic assignment problem," in *Quadratic Assignment and Related Problems* (DIMACS Series on Discrete Mathematics and Theoretical Computer Science), vol. 16. Providence, RI, USA: AMS, 1994, pp. 237–261.
- [25] O. Kariv and S. L. Hakimi, "An algorithmic approach to network location problems. I: The p-centers," *SIAM J. Appl. Math.*, vol. 37, no. 3, pp. 513–538, 1979.
- [26] M. Ç. Pinar and F. A. Özsoy, "An exact algorithm for the capacitated vertex p-center problem," *Comput. Operat. Res.*, vol. 33, no. 5, pp. 1420–1436, 2006. [Online]. Available: http://dx.doi.org/10.1016/j.cor.2004.09.035
- [27] N. Jiang *et al.*, "A detailed and flexible cycle-accurate network-on-chip simulator," in *Proc. IEEE Int. Symp. Perform. Anal. Syst. Softw.*, Apr. 2013, pp. 86–96.
- [28] LSM. (May 2013). *Real-Time Panoptic Video by EPFL-LSM*. [Online]. Available: http://www.youtube.com/user/LSMPanoptic/videos
- [29] M. Brown and D. G. Lowe, "Automatic panoramic image stitching using invariant features," *Int. J. Comput. Vis.*, vol. 74, no. 1, pp. 59–73, Aug. 2007.

**Kerem Seyid** (S'13) received the B.S. degree in electronics engineering from Sabanci University, Istanbul, Turkey, in 2010 and the M.S. degree in electrical engineering from Swiss Federal Institute of Technology in Lausanne, Lausanne, Switzerland, in 2012, where he is currently working toward the Ph.D. degree. His research interests include real-time implementation of digital video and image processing systems and digital hardware design.

**Vladan Popovic** (S'11) received the B.Sc.
degree in electrical engineering from the School of Electrical Engineering, University of Belgrade, Belgrade, Serbia, in 2009 and the M.Sc. degree in electrical and electronic engineering from Swiss Federal Institute of Technology in Lausanne (EPFL), Lausanne, Switzerland, in 2011, where he is currently working toward the Ph.D. degree. He joined the Panoptic Camera Project with Microelectronic Systems Laboratory, EPFL, in 2010. Since 2011, he has been a Research Assistant with EPFL. His research interests include embedded systems design, real-time implementation of image processing algorithms, computational photography, and multiresolution image processing. Mr. Popovic received the Bob Owens Best Student Paper Award at the IEEE Workshop on Signal Processing Systems in 2012 and the Logitech Prize for the master's thesis for outstanding creativity, innovation, pragmatism and economic feasibility in 2011.

**Omer Cogal** (S'03) received the B.S. degree in electronics engineering from Istanbul Technical University, Istanbul, Turkey, in 2005 and the M.S. degree from Boğaziçi University, Istanbul, in 2009. He is currently working toward the Ph.D. degree with Swiss Federal Institute of Technology in Lausanne, Lausanne, Switzerland. His research interests include embedded system design and digital hardware design.

**Abdulkadir Akin** received the B.S. and M.S. degrees in electronics engineering from Sabanci University, Istanbul, Turkey, in 2008 and 2010, respectively. He is currently working toward the Ph.D. degree with Swiss Federal Institute of Technology in Lausanne, Lausanne, Switzerland. His research interests include digital hardware design for video processing and coding.

**Hossein Afshari** (S'03) received the B.Sc. and M.Sc. degrees in electrical engineering with a major in communications and electronics from University of Tehran, Tehran, Iran, and the Ph.D. degree in electrical engineering from Swiss Federal Institute of Technology, Lausanne, Switzerland, in 2013.
He was involved in the implementation of several industrial projects, such as a DRM transmitter, a baseband DVB-T transmitter, a DVB-H application-specific integrated circuit receiver, and an MPEG-TS multiplexer. His research interests include implementation of digital transceivers, signal processing systems, image processing and vision reconstruction algorithms, and embedded systems.

**Alexandre Schmid** (S'98–M'04) received the M.Sc. degree in microengineering and the Ph.D. degree in electrical engineering from Swiss Federal Institute of Technology in Lausanne (EPFL), Lausanne, Switzerland, in 1994 and 2000, respectively. He has been with EPFL since 1994, where he has been a Maître d'Enseignement et de Recherche Faculty Member since 2011, conducting research in bioelectronic interfaces, nonconventional signal processing and neuromorphic hardware, and reliability of nanoelectronic devices. He is a co-author and co-editor of two books and over 100 articles published in journals and conferences. Dr. Schmid was the General Chair of the Fourth International Conference on Nano-Networks in 2009 and has been an Associate Editor of *IEICE Electronics Express* since 2009.

**Yusuf Leblebici** (M'90–SM'98–F'09) received the B.Sc. and M.Sc. degrees in electrical engineering from Istanbul Technical University, Istanbul, Turkey, in 1984 and 1986, respectively, and the Ph.D. degree in electrical and computer engineering from University of Illinois at Urbana-Champaign, Champaign, IL, USA, in 1990. He has been the Chair Professor with Swiss Federal Institute of Technology in Lausanne (EPFL), Lausanne, Switzerland, and the Director of Microelectronic Systems Laboratory since 2002. He is the co-author of six textbooks and has authored over 300 articles in various journals and conferences.
His research interests include the design of high-speed CMOS digital and mixed-signal integrated circuits, computer-aided design of very-large-scale integration (VLSI) systems, intelligent sensor interfaces, modeling and simulation of semiconductor devices, and VLSI reliability analysis. Dr. Leblebici was an Associate Editor of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II and IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS. He has also served as the General Co-Chair of the 2006 European Solid-State Circuits Conference and the 2006 European Solid-State Device Research Conference. He was elected as a Distinguished Lecturer of the IEEE Circuits and Systems Society from 2010 to 2011.