A new treecode for long-range force calculation

https://doi.org/10.1016/j.cpc.2007.08.006

Abstract

We propose a new hierarchical tree algorithm with high adaptivity to various particle distributions for long-range force calculations. This algorithm divides parent cells into k daughter cells using the k-means algorithm. The tree structure provided by this algorithm is independent of the coordinate system used. This method also includes a unique procedure for determining cell sizes adjusted to particle distributions.

We investigated the characteristics of the tree structure and the effect of various branching ratios k on the performance of the long-range force calculation. Numerical experiments using various particle distributions showed that the number of interactions between particles and cells grew with k, whereas the number of distance evaluations between particles and cells decreased when k was around 5. We can therefore select an optimized value of k according to the characteristics of the problem to be analyzed. Comparing the algorithm with the Barnes–Hut treecode in gravitational calculations at the same error level, we found that the calculation cost could be decreased remarkably.

Introduction

For long-range force calculation, all treecodes rely on the idea that the effect of a cluster of particles at a distant point can be approximated by the first few terms of a power series. The method builds a tree data structure by dividing the particles according to their positions and makes a linked list from each cell's information, so that the calculation can be simplified by treating a cluster of particles far from a test particle as a single particle. It is a very powerful tool that can reduce the calculation cost from O(N^2) to O(N log N). Since the octtree was introduced by Barnes and Hut [1], similar methods have also been developed. The octtree algorithm is simple yet fast and efficient [2], [3].
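The approximation described above can be illustrated with a minimal sketch that keeps only the lowest-order (monopole) term: a distant cluster is replaced by a single particle at its center of mass. The function names, the gravitational constant G=1, the cluster size, and the test point are all assumptions chosen for illustration and do not come from the paper.

```python
import math
import random

def direct_acceleration(target, positions, masses, G=1.0):
    """Sum the gravitational acceleration at `target` over every particle."""
    ax = ay = az = 0.0
    for (x, y, z), m in zip(positions, masses):
        dx, dy, dz = x - target[0], y - target[1], z - target[2]
        r2 = dx * dx + dy * dy + dz * dz
        inv_r3 = 1.0 / (r2 * math.sqrt(r2))
        ax += G * m * dx * inv_r3
        ay += G * m * dy * inv_r3
        az += G * m * dz * inv_r3
    return (ax, ay, az)

def monopole_acceleration(target, positions, masses, G=1.0):
    """Treat the whole cluster as one particle at its center of mass."""
    total = sum(masses)
    com = tuple(
        sum(m * p[i] for p, m in zip(positions, masses)) / total
        for i in range(3)
    )
    return direct_acceleration(target, [com], [total], G)

# Hypothetical test case: 200 equal-mass particles in a unit cube,
# evaluated at a point 20 cube-lengths away.
rng = random.Random(0)
cluster = [
    (rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5))
    for _ in range(200)
]
masses = [1.0 / 200] * 200
target = (20.0, 0.0, 0.0)

exact = direct_acceleration(target, cluster, masses)
approx = monopole_acceleration(target, cluster, masses)
rel_err = math.dist(exact, approx) / math.hypot(*exact)
print(f"relative error of the monopole approximation: {rel_err:.2e}")
```

The direct sum costs one interaction per particle, while the approximation costs one interaction per cluster; applying this replacement hierarchically is what brings the cost down from O(N^2) to O(N log N).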

The binary tree is more adaptable to various particle distributions than the octtree, but it incurs a higher computing cost because it requires a greater number of nodes and a greater tree depth. Using the nearest-neighbor algorithm, Jernigan et al. [4] and Benz et al. [5] proposed a binary tree that creates a node by binding a particle to its nearest neighbor in a bottom-up manner. This technique is independent of the coordinate system and is highly adaptive to particle distributions. Makino compared this algorithm with the octtree, but the effect of the adaptivity could not be determined [2].

Waltz et al. proposed another treecode, a ‘binary tree with a level skipping method’ [3], which adapts well to particle distributions and also has a branching ratio similar to that of the octtree. In spite of these advantages, they found that its performance in the long-range force calculation was similar to that of the octtree, and that its performance in the short-range force calculation was only slightly better than that of the octtree.

N-body simulations typically include two classes of physical phenomena [3]: those governed by long-range forces and those governed by short-range forces. The objective of a long-range force calculation using a treecode is to minimize the number of interactions with a controllable amount of error. For short-range force calculations, the objective is to identify the particles within a given distance of a test particle as quickly as possible. Both can be simulated efficiently with the tree algorithm.

To transform a given particle system into a hierarchical tree data structure, the octtree divides a cube that contains the entire particle system into eight sub-cells of equal volume. This procedure is repeated recursively until each sub-cell contains either one particle or none [1]. The octtree is coordinate-dependent, because the faces of the cubic sub-cells are aligned with the axes of the coordinate system.
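The recursive subdivision described above can be sketched as follows. This is an illustrative outline rather than the Barnes–Hut reference implementation: the `OctNode` class, the function names, and the `max_depth` guard against coincident particles are assumptions, and empty sub-cells are simply not stored.

```python
import random

class OctNode:
    def __init__(self, center, half, indices):
        self.center = center      # geometric center of the cube
        self.half = half          # half of the cube's edge length
        self.indices = indices    # indices of the particles inside the cube
        self.children = []        # non-empty sub-cubes (at most eight)

def build_octtree(points, center, half, indices=None, max_depth=32):
    """Split a cube into eight equal sub-cubes until each holds <= 1 particle."""
    if indices is None:
        indices = list(range(len(points)))
    node = OctNode(center, half, indices)
    if len(indices) <= 1 or max_depth == 0:
        return node
    # Sort particles into octants by comparing with the cube center on each
    # axis; this axis-aligned test is why the tree is coordinate-dependent.
    buckets = {}
    for i in indices:
        octant = tuple(points[i][d] >= center[d] for d in range(3))
        buckets.setdefault(octant, []).append(i)
    for octant, members in buckets.items():
        child_center = tuple(
            center[d] + (half / 2 if octant[d] else -half / 2)
            for d in range(3)
        )
        node.children.append(
            build_octtree(points, child_center, half / 2, members, max_depth - 1)
        )
    return node

def leaves(node):
    """Collect all leaf cells below `node`."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

# Hypothetical usage: 500 uniformly distributed particles in [-1, 1]^3.
rng = random.Random(3)
pts = [
    (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
    for _ in range(500)
]
root = build_octtree(pts, (0.0, 0.0, 0.0), 1.0)
print(len(leaves(root)), "leaf cells")
```

Rotating the particle set changes which particles share an octant, so the resulting tree differs even though the physical system is the same.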

We propose a new treecode that can adapt its division to various particle distributions. Working top-down, this method subdivides each parent cell into k sub-cells. The tree structure created by this method is coordinate-independent and adaptive, because the centers of the subdivided cells are placed where the particles are concentrated. Since a constant value of k is used in the subdivision process, the octtree can be regarded as similar to the case k=8 and the binary tree to the case k=2. We can thus investigate how the branching ratio affects the performance of the treecode. The tree structure proposed here will be called the ‘k-tree’ hereafter.
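A top-down subdivision by k-means can be sketched roughly as below. This is not the authors' code: it uses a plain Lloyd iteration with random initial centers, and the cell `radius` (largest distance from the centroid to a member particle) is a placeholder assumption, since the excerpt does not describe the paper's own procedure for determining cell sizes. The names `kmeans`, `build_ktree`, `leaf_counts`, and `leaf_size` are hypothetical.

```python
import math
import random

def kmeans(points, k, rng, iterations=20):
    """Plain Lloyd iteration; returns the non-empty clusters as lists of points."""
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        new_centers = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return [cl for cl in clusters if cl]

def build_ktree(points, k, rng, leaf_size=1):
    """Recursively split a cell into at most k daughter cells by k-means."""
    centroid = tuple(sum(c) / len(points) for c in zip(*points))
    node = {
        "center": centroid,
        # Assumed cell size: distance to the farthest member particle.
        "radius": max(math.dist(centroid, p) for p in points),
        "count": len(points),
        "children": [],
    }
    if len(points) <= leaf_size:
        return node
    parts = kmeans(points, min(k, len(points)), rng)
    if len(parts) > 1:  # stop if the split failed to separate the particles
        node["children"] = [build_ktree(p, k, rng, leaf_size) for p in parts]
    return node

def leaf_counts(node):
    """Particle counts of all leaf cells below `node`."""
    if not node["children"]:
        return [node["count"]]
    return [c for child in node["children"] for c in leaf_counts(child)]

# Hypothetical usage: 300 Gaussian-distributed particles, branching ratio k=5.
rng = random.Random(1)
pts = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
tree = build_ktree(pts, 5, random.Random(2))
print(len(leaf_counts(tree)), "leaf cells")
```

Because the clustering depends only on inter-particle distances, rotating or translating the particle set leaves the partition unchanged for the same initial centers, which is the sense in which such a tree is coordinate-independent.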

In this paper, we will discuss only the long-range force calculation, concentrating on gravitational problems. The remainder of this paper is organized as follows. Section 2 explains the construction algorithms of the two tree data structures: the octtree and the k-tree algorithm proposed here. In Section 3, we describe performance metrics, the calculation environment, and parallelization. Section 4 reports the performance comparison according to k and the result of the direct comparison with the octtree.

Section snippets

Tree algorithm

The octtree developed by Barnes and Hut has achieved good performance for long-range force calculations. Considering a cube that encloses all the particles as a root, this algorithm recursively subdivides the cube into eight sub-cubes of equal volume. After repeating this partitioning procedure until each cell contains zero or one particle, a well-arranged tree data structure is created. This simple idea allows the octtree to construct the tree data structure at surprising speed, as well as to

Methods

There are some difficulties in measuring and comparing the performance of tree algorithms for the N-body problem. One is that the measured performance is highly problem-dependent; another is that the performance must be compared at the same error level. As in the previous work by Waltz et al., we counted two kinds of intrinsic operations during tree traversal at a given error level. One is the number of node evaluations (NE), distance calculations from a particle to a cell and

Results

We selected the Plummer model for the particle distribution. In this model, which is used in astrophysical applications [2], the particles are highly concentrated at the origin. The treecodes were tested with problem sizes N ranging from 320,000 to 2,560,000. However, the problem size had an insignificant effect on the relative differences in performance.
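For readers who want to reproduce a comparable test distribution, the sketch below draws particle positions from a Plummer sphere by inverting its cumulative mass profile, M(r)/M = r^3 / (r^2 + a^2)^(3/2). This is the standard textbook sampling approach and is not taken from the paper; the scale length `a`, the seed, and the function name are assumptions, and velocities are omitted.

```python
import math
import random

def sample_plummer(n, a=1.0, seed=0):
    """Draw n positions from a Plummer sphere with scale length a."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        u = rng.random()               # enclosed mass fraction in [0, 1)
        if u == 0.0:
            continue                   # avoid division by zero at the center
        x = u ** (-2.0 / 3.0) - 1.0
        if x <= 0.0:
            continue                   # guard against round-off as u -> 1
        r = a / math.sqrt(x)           # inverse of the cumulative mass profile
        # Isotropic direction: uniform in cos(theta) and in phi.
        cos_theta = rng.uniform(-1.0, 1.0)
        sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        points.append((
            r * sin_theta * math.cos(phi),
            r * sin_theta * math.sin(phi),
            r * cos_theta,
        ))
    return points

# Hypothetical usage: about half of the particles should lie inside the
# half-mass radius, a / sqrt(2^(2/3) - 1), roughly 1.305 a.
pts = sample_plummer(20000)
r_half = 1.0 / math.sqrt(2.0 ** (2.0 / 3.0) - 1.0)
inside = sum(1 for p in pts if math.hypot(*p) < r_half) / len(pts)
print(f"fraction inside the half-mass radius: {inside:.3f}")
```

The strong central concentration produces cells with very different particle densities, which is what makes this distribution a demanding test of a tree's adaptivity.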

Fig. 4 shows the number of nodes of the k-tree and the time spent on tree construction as k ranged from 2 to 8. Because the k-tree algorithm should converge

Conclusion

We proposed a new tree algorithm with branching ratio k that uses a k-means algorithm at every subdivision. In this algorithm, which adapts well to various particle distributions, the subdivided cells are coordinate-independent. The method also includes a procedure for determining cell sizes that differs from those of other tree algorithms. Independently of the particle distribution, as k increased the number of nodes decreased slightly, but the time to construct the tree structure increased.

The number of

References (10)

  • J. Waltz et al., J. Comput. Phys. (2002)
  • J.K. Salmon et al., J. Comput. Phys. (1994)
  • G.S. Winckelmans et al., J. Comput. Phys. (1993)
  • J. Barnes et al., Nature (1986)
  • J. Makino, J. Comput. Phys. (1990)
There are more references available in the full text version of this article.

Cited by (6)

  • Hierarchical boundary element method based on the Barnes-Hut tree applied to exterior creeping flow

    2014, Revista Internacional de Metodos Numericos para Calculo y Diseno en Ingenieria
  • Towards a petascale tree code: Scaling and efficiency of the PEPC library

    2011, Journal of Computational Science
    Citation Excerpt :

    With this approach the current version of PEPC is capable of rapid simulations of 4 × 10^8 particles. Further tree code approaches can be found e.g. in [2,4,11,13]: their analyses vary from 1 to 128 processors, simulating up to 10^7 particles with reasonably good scaling. In [20] an implementation of a Fast Multipole Method (FMM) – including node-node interaction in contrast to the Barnes–Hut approach of the present work – is shown, with very promising scaling results on 65,536 cores.

  • Simulation of the Interaction of Oppositely Directed Particle Flows

    2020, Computational Mathematics and Mathematical Physics
  • Complex systems in cosmology: "the Antennae" case study

    2009, Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering