A new treecode for long-range force calculation
Introduction
For long-range force calculation, all treecodes rely on the idea that the effect of a cluster of particles at a distant point can be approximated by the first few terms of a power series. The method builds a tree data structure by dividing the particles according to their positions and makes a linked list from each cell's information, so that the calculation can be simplified by treating a cluster of particles far from a test particle as a single particle. It is a very powerful tool that can reduce the calculation cost from O(N^2) to O(N log N). Since Barnes and Hut introduced the octtree [1], similar methods have been developed. The octtree algorithm is simple, yet fast and efficient [2], [3].
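The single-particle approximation described above can be illustrated with a minimal sketch. It keeps only the lowest-order term, replacing a cluster by its total mass placed at its center of mass, and compares the result with the direct sum. The function names, the unit gravitational constant, the equal masses, and the random test cluster are all assumptions made for illustration; they are not taken from the paper.

```python
import math
import random

def direct_acceleration(target, positions, masses, G=1.0):
    """Sum the pairwise gravitational acceleration at `target` over all sources."""
    acc = [0.0, 0.0, 0.0]
    for p, m in zip(positions, masses):
        d = [p[a] - target[a] for a in range(3)]
        r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2
        inv_r3 = 1.0 / (r2 * math.sqrt(r2))
        for a in range(3):
            acc[a] += G * m * d[a] * inv_r3
    return acc

def monopole_acceleration(target, positions, masses, G=1.0):
    """Replace the whole cluster by one particle at its center of mass."""
    total = sum(masses)
    com = [sum(m * p[a] for p, m in zip(positions, masses)) / total for a in range(3)]
    return direct_acceleration(target, [com], [total], G)

def relative_error(approx, exact):
    """Norm of the difference divided by the norm of the exact vector."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(approx, exact)))
    return diff / math.sqrt(sum(y ** 2 for y in exact))

# Hypothetical test case: 50 equal-mass particles in the unit cube,
# evaluated at a point about 20 cube lengths away.
rng = random.Random(0)
cluster = [(rng.random(), rng.random(), rng.random()) for _ in range(50)]
masses = [1.0] * len(cluster)
target = (20.0, 0.0, 0.0)
error = relative_error(
    monopole_acceleration(target, cluster, masses),
    direct_acceleration(target, cluster, masses),
)
print(error)
```

The approximation costs one interaction instead of 50. Its error shrinks as the target moves farther away relative to the cluster size, which is what allows a treecode to trade accuracy for speed.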
The binary tree is more adaptable to various particle distributions than the octtree is, but it incurs a higher computing cost because it requires more nodes and a greater tree depth. Using the nearest-neighbor algorithm, Jernigan et al. [4] and Benz et al. [5] proposed a binary tree that creates a node by binding a particle to its nearest neighbor in a bottom-up fashion. This technique is independent of the coordinate system and is highly adaptive to particle distributions. Makino compared this algorithm to the octtree, but the effect of the adaptivity could not be determined [2].
Waltz et al. proposed another treecode, a ‘binary tree with a level skipping method’ [3], which adapts well to particle distributions and also has a branching ratio similar to that of the octtree. In spite of these advantages, they found that its performance in the long-range force calculation was similar to that of the octtree, and that its performance in the short-range force calculation was only slightly better.
N-body simulations typically include two classes of physical phenomena [3]: those governed by long-range forces and those governed by short-range forces. The objective of a long-range force calculation using a treecode is to minimize the number of interactions while keeping the error at a controllable level. For short-range force calculations, the objective is to identify the particles within a given distance of a test particle as quickly as possible. Both can be simulated efficiently with the tree algorithm.
To transform a given particle system into a hierarchical tree data structure, the octtree divides a cube that contains the entire particle system into eight sub-cells of equal volume. This procedure is repeated recursively until each sub-cell contains either one particle or none [1]. The octtree is coordinate-dependent, because the faces of the cubic sub-cells are parallel to the axes of the coordinate system.
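The recursive subdivision just described can be sketched as follows. This is a minimal illustration, not the implementation used in the paper. The class and function names are hypothetical, and the sketch assumes that all particle positions are distinct (coincident particles would make the recursion endless).

```python
import random

class OctNode:
    def __init__(self, center, half):
        self.center = center      # center of the cubic cell
        self.half = half          # half of the cube edge length
        self.particle = None      # particle index if this is an occupied leaf
        self.children = []        # eight sub-cells for an internal node

def build_octtree(points, indices, center, half):
    """Split a cube into eight equal sub-cubes until each holds 0 or 1 particle."""
    node = OctNode(center, half)
    if len(indices) <= 1:
        node.particle = indices[0] if indices else None
        return node
    buckets = [[] for _ in range(8)]
    for i in indices:
        # Bit a of the octant number is set when the particle lies on the
        # upper side of the cell center along axis a.
        octant = sum(1 << a for a in range(3) if points[i][a] >= center[a])
        buckets[octant].append(i)
    quarter = half / 2.0
    for octant, bucket in enumerate(buckets):
        child_center = tuple(
            center[a] + (quarter if (octant >> a) & 1 else -quarter) for a in range(3)
        )
        node.children.append(build_octtree(points, bucket, child_center, quarter))
    return node

def root_cube(points):
    """Axis-aligned cube that encloses every particle."""
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]
    center = tuple((lo[a] + hi[a]) / 2.0 for a in range(3))
    half = max(hi[a] - lo[a] for a in range(3)) / 2.0
    return center, half

def leaf_particles(node):
    """Collect the particle indices stored in the leaves below `node`."""
    if not node.children:
        return [] if node.particle is None else [node.particle]
    return [i for child in node.children for i in leaf_particles(child)]

# Hypothetical test data: 100 uniformly distributed particles.
rng = random.Random(0)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(100)]
center, half = root_cube(points)
root = build_octtree(points, list(range(len(points))), center, half)
```

Because every split is made at the geometric center of an axis-aligned cube, the resulting cells depend on the orientation of the coordinate axes rather than on where the particles cluster.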
We propose a new treecode that can adapt its division to various particle distributions. Working top-down, this method subdivides each parent cell into k sub-cells. The tree structure created by this method is coordinate-independent and adaptive, because the centers of the subdivided cells are placed where the particles are concentrated. Since a constant k value is used in the subdivision process, we can regard the octtree as similar to the case of k = 8 and the binary tree as similar to k = 2. We can thus investigate how the branching ratio affects the performance of the treecode. The tree structure proposed here will be called the ‘k-tree’ hereafter.
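A top-down subdivision into k sub-cells can be sketched as below. The paper's conclusion states that a k-means algorithm is used at every subdivision. Everything else here is an assumption made for illustration: the Lloyd-style iteration with a fixed iteration count, the random choice of initial centers, the equal particle masses, the fallback for degenerate splits, and all function names.

```python
import random

def centroid(points, indices):
    """Mean position of the listed particles (equal masses assumed)."""
    n = len(indices)
    return tuple(sum(points[i][a] for i in indices) / n for a in range(3))

def kmeans_split(points, indices, k, iters=10, seed=0):
    """Partition `indices` into at most k groups with Lloyd-style k-means."""
    rng = random.Random(seed)
    centers = [points[i] for i in rng.sample(indices, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for i in indices:
            d2 = [sum((points[i][a] - c[a]) ** 2 for a in range(3)) for c in centers]
            groups[d2.index(min(d2))].append(i)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [centroid(points, g) if g else c for g, c in zip(groups, centers)]
    groups = [g for g in groups if g]
    if len(groups) < 2:
        # Assumed fallback so that the recursion always makes progress.
        mid = len(indices) // 2
        groups = [indices[:mid], indices[mid:]]
    return groups

def build_ktree(points, indices, k):
    """Subdivide top-down until every leaf holds exactly one particle."""
    node = {"center": centroid(points, indices), "indices": indices, "children": []}
    if len(indices) == 1:
        return node
    if len(indices) <= k:
        groups = [[i] for i in indices]
    else:
        groups = kmeans_split(points, indices, k)
    node["children"] = [build_ktree(points, g, k) for g in groups]
    return node

def leaf_indices(node):
    """Particle indices stored in the leaves below `node`."""
    if not node["children"]:
        return list(node["indices"])
    return [i for child in node["children"] for i in leaf_indices(child)]

def max_branching(node):
    """Largest number of children found at any node of the tree."""
    return max([len(node["children"])] + [max_branching(c) for c in node["children"]])

# Hypothetical test data: 60 uniformly distributed particles, k = 3.
rng = random.Random(1)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(60)]
root = build_ktree(points, list(range(len(points))), 3)
```

Cell centers in this sketch follow the particle distribution rather than a fixed grid, so rotating the coordinate system rotates the cells with it. Setting k to 2 or 8 gives branching ratios comparable to a binary tree or an octtree.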
In this paper, we will discuss only the long-range force calculation, concentrating on gravitational problems. The remainder of this paper is organized as follows. Section 2 explains the construction algorithms of the two tree data structures: the octtree and the k-tree algorithm proposed here. In Section 3, we describe performance metrics, the calculation environment, and parallelization. Section 4 reports the performance comparison according to k and the result of the direct comparison with the octtree.
Tree algorithm
The octtree developed by Barnes and Hut has achieved good performance for long-range force calculations. Considering a cube that encloses all the particles as a root, this algorithm recursively subdivides the cube into eight sub-cubes of equal volume. After repeating this partitioning procedure until each cell contains zero or one particle, a well-arranged tree data structure is created. This simple idea allows the octtree to construct the tree data structure at surprising speed, as well as to
Methods
There are some difficulties in measuring and comparing the performance of tree algorithms for the N-body problem. One is that the measured performance is highly problem-dependent, and another is that performance must be compared at the same error level. As in the previous work by Waltz et al., at a given error level we counted two kinds of intrinsic operations during tree traversal. One is the number of node evaluations, distance calculations from a particle to a cell and
Results
We selected the Plummer model for the particle distribution. In this model, which is used for astrophysical applications [2], the particles are highly concentrated at the origin. The treecodes were tested with problem sizes N ranging from 320,000 to 2,560,000. However, the problem size had an insignificant effect on the relative differences in performance.
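For readers who want to reproduce a comparable test distribution, particle positions for a Plummer sphere can be drawn by inverting its cumulative mass profile, M(r)/M = r^3 / (r^2 + a^2)^(3/2). The sketch below is a standard sampling recipe and not necessarily the generator used in the paper. The scale radius `a`, the seed, the sample size, and the function name are assumptions, and velocities are omitted.

```python
import math
import random

def plummer_positions(n, a=1.0, seed=0):
    """Draw n positions from a Plummer sphere with scale radius a."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        u = rng.random()                      # enclosed mass fraction
        if u == 0.0:
            continue                          # avoid raising 0 to a negative power
        denom = u ** (-2.0 / 3.0) - 1.0
        if denom <= 0.0:
            continue                          # guard against round-off near u = 1
        r = a / math.sqrt(denom)              # inverse of the cumulative mass profile
        cos_t = rng.uniform(-1.0, 1.0)        # isotropic direction
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        points.append((r * sin_t * math.cos(phi), r * sin_t * math.sin(phi), r * cos_t))
    return points

points = plummer_positions(20000)
radii = sorted(math.sqrt(x * x + y * y + z * z) for x, y, z in points)
median_radius = radii[len(radii) // 2]
# Analytic half-mass radius, obtained by setting the mass fraction to 1/2.
half_mass_radius = 1.0 / math.sqrt(2.0 ** (2.0 / 3.0) - 1.0)
```

Half of the mass lies within about 1.3 scale radii while the tail extends far beyond, so the particle density spans many orders of magnitude. This makes the model a demanding test of how well a tree adapts to the distribution.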
Fig. 4 shows the number of nodes of the k-tree and the time spent on tree construction as k ranged from 2 to 8. Because the k-tree algorithm should converge
Conclusion
We proposed a new tree algorithm with branching ratio k that uses a k-means algorithm at every subdivision. The algorithm adapts well to various particle distributions, and its subdivided cells are coordinate-independent. The method also determines cell sizes by a procedure that differs from those of other tree algorithms. Independently of the particle distribution, as k increased the number of nodes decreased slightly, but the time to construct the tree structure increased.
The number of
References (10)
- et al., J. Comput. Phys. (2002)
- et al., J. Comput. Phys. (1994)
- et al., J. Comput. Phys. (1993)
- et al., Nature (1986)
- J. Comput. Phys. (1990)