Parallel approximation for partial set cover
Introduction
In the minimum set cover problem (MinSC), we are given a set $E$ of elements, a collection $\mathcal{S} \subseteq 2^E$ of sets with $|\mathcal{S}| = m$, and a cost function $c: \mathcal{S} \to \mathbb{R}^+$; the goal is to find a minimum-cost sub-collection $\mathcal{F} \subseteq \mathcal{S}$ such that all elements are covered, i.e., $\bigcup_{S \in \mathcal{F}} S = E$. MinSC is a classic combinatorial optimization problem with many real-world applications, and it plays an important role in computational complexity theory and approximation algorithms [27].
In real-world applications, it is often too costly to satisfy all covering requirements. This consideration leads to the minimum partial set cover problem (MinPSC): besides $(E, \mathcal{S}, c)$, an integer $k$ is given, and the goal is to find a minimum-cost sub-collection of $\mathcal{S}$ that covers at least $k$ elements of $E$. Partial cover has attracted many studies in data science and facility location, where the uncovered elements are treated as "outliers" [8], [21].
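To make the difference between full and partial cover concrete, here is a minimal brute-force sketch. It is not the algorithm of this paper; it simply enumerates all sub-collections of a tiny instance. The function names and the toy instance are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def covered(sets, chosen):
    """Return the union of the sets whose indices are in `chosen`."""
    result = set()
    for i in chosen:
        result |= sets[i]
    return result


def brute_force_partial_cover(sets, costs, k):
    """Exhaustively find a cheapest sub-collection covering at least k elements.

    Exponential in the number of sets, so only meant for tiny instances.
    Returns (cost, tuple_of_indices), or None if no sub-collection is feasible.
    """
    best = None
    for size in range(len(sets) + 1):
        for chosen in combinations(range(len(sets)), size):
            if len(covered(sets, chosen)) >= k:
                cost = sum(costs[i] for i in chosen)
                if best is None or cost < best[0]:
                    best = (cost, chosen)
    return best


# Hypothetical toy instance: 5 elements, 3 sets, the last set is expensive.
toy_sets = [{1, 2, 3}, {3, 4}, {5}]
toy_costs = [3, 2, 10]
full = brute_force_partial_cover(toy_sets, toy_costs, k=5)     # must cover everything
partial = brute_force_partial_cover(toy_sets, toy_costs, k=4)  # may skip one element
```

On this toy instance, covering all five elements forces the expensive third set into the solution, while requiring only four covered elements lets the solution leave element 5 out as an outlier.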
Most existing algorithms for partial cover are centralized and sequential, but in the era of big data, parallel algorithms are preferable. Although there are many studies on parallel algorithms for MinSC, parallel studies on its partial version MinPSC are rare. Gandhi et al. designed a parallel algorithm for MinPSC in [9] which achieves approximation ratio in rounds, where $f$ is the maximum frequency of elements, i.e., the maximum number of sets containing a common element. Note that $f$ could be as large as $m$, the number of sets, so the running time of their algorithm might not be satisfactory. This observation motivates us to study an NC parallel algorithm for MinPSC whose running time is independent of $f$.
Although the approximation ratio $f$ is tight under the unique games conjecture [14], it can be large in a general setting. Is it possible to obtain a better approximation in some special setting, for example, for coverage problems with geometric properties? This question motivates us to study parallel algorithms for the minimum power partial cover problem (MinPPC). Suppose is a set of points and is a set of sensors on the plane. Each sensor can adjust its power, and the covering range of a sensor $s$ with power $p(s)$ is a disk centered at $s$ whose radius $r(s)$ satisfies $p(s) = c \cdot r(s)^{\alpha}$, where $c$ and $\alpha$ are constants, and $\alpha$ is called the attenuation factor of power. We use and to denote the center and the radius of disk , respectively. So, represents the disk with center and radius . Given an integer $k$, the MinPPC problem is to determine the power assignment on each sensor such that at least $k$ points are covered and the total power consumption is minimized. This problem is motivated by the goal of extending the lifetime of wireless sensor networks under limited energy supply [18].
The MinPPC problem can be viewed as a special case of the MinPSC problem. Note that in a MinPPC instance, an optimal power assignment must be such that for each sensor with positive power, there is a point on the boundary of its covering range. For a sensor and a point , we call disk a canonical disk, and set its cost to be , where is the Euclidean distance between and . Denote the collection of canonical disks as , and view each disk as a subset of consisting of those points covered by . Then, a MinPPC instance can be transformed into a MinPSC instance . Since each sensor can be assigned only one power, there is an extra constraint for the transformed instance: among the disks corresponding to the same sensor, at most one can be chosen. An optimal solution to the transformed MinPSC instance trivially satisfies this extra constraint: if two disks corresponding to the same sensor coexist in a solution, then removing the one with the smaller radius preserves feasibility and reduces the power. So the MinPPC problem is a special case of the MinPSC problem.
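The reduction above can be sketched in a few lines. This is only an illustration under stated assumptions: the power model is taken to be power = c * radius ** alpha, points and sensors are given as coordinate pairs, and the function name `canonical_disks` and the dictionary layout are hypothetical.

```python
import math


def canonical_disks(sensors, points, c=1.0, alpha=2.0):
    """Build one canonical disk per (sensor, point) pair.

    Each disk is centered at the sensor and has the point on its boundary.
    Assumed power model: cost = c * radius ** alpha.
    Returns a list of dicts holding the sensor index, the radius, the cost,
    and the indices of the points covered by the disk.
    """
    disks = []
    for si, s in enumerate(sensors):
        for p in points:
            r = math.dist(s, p)
            # A small tolerance keeps the boundary point inside the disk
            # despite floating-point rounding.
            members = frozenset(
                j for j, q in enumerate(points) if math.dist(s, q) <= r + 1e-12
            )
            disks.append(
                {"sensor": si, "radius": r, "cost": c * r ** alpha, "covers": members}
            )
    return disks


# Hypothetical example: one sensor at the origin and three points.
disks = canonical_disks([(0.0, 0.0)], [(1.0, 0.0), (0.0, 2.0), (3.0, 0.0)])
```

Each resulting disk plays the role of a set in the MinPSC instance. The extra constraint (at most one disk per sensor) is not enforced here; as argued above, it can be restored afterwards by keeping only the largest chosen disk of each sensor.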
For the MinPSC instance obtained by the above reduction, the maximum frequency is equal to the number of sensors, which is too large to be a good approximation factor. In [18], a -approximation algorithm was obtained for MinPPC using the local ratio technique. That algorithm is centralized, and parallelizing it requires new insight.
Kearns [13] was the first to study the MinPSC problem and presented a greedy algorithm with approximation ratio , where is the th Harmonic number and . Later, Slavik [26] gave an improved greedy algorithm with approximation ratio . Bar-Yehuda [1] obtained an $f$-approximation using the local ratio method. Using a primal-dual method, Gandhi et al. [9] also obtained an $f$-approximation. Könemann et al. [16] obtained a -approximation for a generalized version of MinPSC, in which each element has a profit, and the goal is to select a minimum-cost sub-collection of sets such that the total profit of covered elements exceeds a given threshold. Inamdar and Varadarajan [11] showed that a feasible solution to the standard linear program for set cover can be rounded to a -approximation for MinPSC, where is the integrality gap for the set cover LP. These are all centralized algorithms. To the best of our knowledge, only one paper studies parallel algorithms for MinPSC [9], in which Gandhi et al. presented a parallel algorithm with approximation ratio in rounds.
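For intuition about the greedy approach, the following is a minimal sketch of a generic cost-effectiveness greedy for partial cover. It is not claimed to be the exact algorithm of [13] or [26]; the selection rule (cost divided by the number of newly covered elements, capped at the number still required) and the function name are assumptions made for illustration.

```python
def greedy_partial_cover(sets, costs, k):
    """Greedy heuristic for partial cover.

    Repeatedly picks the set with the smallest cost per newly covered
    element, counting at most the number of elements still required.
    Returns (chosen_indices, total_cost); raises ValueError if fewer
    than k elements can be covered at all.
    """
    covered = set()
    chosen = []
    while len(covered) < k:
        need = k - len(covered)
        best_i, best_ratio = None, None
        for i, s in enumerate(sets):
            gain = min(len(s - covered), need)
            if i in chosen or gain == 0:
                continue
            ratio = costs[i] / gain
            if best_ratio is None or ratio < best_ratio:
                best_i, best_ratio = i, ratio
        if best_i is None:
            raise ValueError("fewer than k elements can be covered")
        chosen.append(best_i)
        covered |= sets[best_i]
    return chosen, sum(costs[i] for i in chosen)
```

Note that this loop is inherently sequential: each choice depends on what the previous choices covered, which is part of why parallel algorithms need a different design.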
In contrast to the few studies on parallel algorithms for MinPSC, there is a large body of work on parallel and distributed algorithms for MinSC. Berger et al. [3] provided the first parallel algorithm using a bucketing approach, obtaining a -approximation in rounds, where is the sum of sizes of the sets. Rajagopalan and Vazirani [25] improved the number of rounds to at the cost of a larger approximation ratio . Blelloch et al. [7] further improved this result to a -approximation in rounds. Khuller et al. [15] presented a parallel -approximation algorithm in rounds, where the maximum frequency is assumed to be a constant. Koufogiannakis and Young [17] and Harvey et al. [10] independently designed distributed -approximation algorithms in polylogarithmic communication rounds. Bar-Yehuda et al. [2] used a distributed local ratio method to design a -approximation algorithm for the minimum-weight vertex cover problem in communication rounds, where is the maximum degree of the graph and is a constant.
The MinPSC problem is a special case of the minimum submodular cover problem (MinSMC). Given a monotone nondecreasing submodular function , an integer , the goal of MinSMC is to find a subset with the minimum cost such that . MinPSC is a special case of MinSMC since for any , the function is monotone nondecreasing and submodular. For MinSMC, a greedy algorithm [28] can achieve approximation ratio with . Distributed algorithms for the cardinality version of the MinSMC problem have emerged recently. Mirzasoleiman et al. [23] proposed a distributed algorithm which yields a solution of size in communication rounds, where denotes the number of machines, is a constant, and is an optimal solution. Afterwards, Mirzasoleiman et al. [24] proposed a faster distributed algorithm with size at most in at most rounds. Note that these algorithms are distributed, and their approximation ratios are measured in terms of . In this paper, by contrast, we aim at a parallel algorithm that can be implemented in an NC model and whose approximation ratio is measured in terms of .
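The claim that partial cover fits the submodular framework can be checked by brute force on a small instance. The sketch below assumes that the function in question is the truncated coverage function (the number of covered elements, capped at k); this is an assumption, since the formula itself is not shown above, and the function names are hypothetical.

```python
from itertools import combinations


def coverage_value(sets, chosen, k):
    """Truncated coverage: number of covered elements, capped at k."""
    union = set()
    for i in chosen:
        union |= sets[i]
    return min(len(union), k)


def is_monotone_submodular(sets, k):
    """Brute-force check over all pairs A subset of B and all x outside B.

    Monotone: value(A) <= value(B).
    Submodular: the marginal gain of x on A is at least its gain on B.
    """
    idx = range(len(sets))
    subsets = [
        frozenset(c) for r in range(len(sets) + 1) for c in combinations(idx, r)
    ]
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue
            if coverage_value(sets, a, k) > coverage_value(sets, b, k):
                return False
            for x in idx:
                if x in b:
                    continue
                gain_a = coverage_value(sets, a | {x}, k) - coverage_value(sets, a, k)
                gain_b = coverage_value(sets, b | {x}, k) - coverage_value(sets, b, k)
                if gain_a < gain_b:
                    return False
    return True
```

The check is exponential in the number of sets and is only meant as a sanity test on toy inputs, not as a proof.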
For the minimum power (full) cover problem (MinPC), Biló et al. [6] presented a PTAS. There are many studies on the minimum power multi-cover problem (MinPMC), in which points are required to be covered multiple times; constant approximation ratios were obtained for it in [4], [5]. These works address the full cover version. For the partial version of the minimum power cover problem, MinPPC, studies are rare. Li et al. presented -approximation algorithms for MinPPC using the local ratio technique [18] and the primal-dual method [19], respectively. Liang et al. [20] studied the minimum power partial multi-cover problem on a line (MinPPMC-Line) and presented a polynomial-time exact algorithm when the maximum covering requirement is a constant. As far as we know, there is no study of parallel algorithms for the minimum power partial cover problem.
In this paper, we first design a parallel algorithm for MinPSC, which achieves approximation ratio at most in rounds, where is a constant.
The parallel algorithm in [9] has approximation ratio , and its number of rounds is . Since can be as large as , the running time of [9] might not be logarithmic in the input size. Our algorithm has the advantage that its running time is logarithmic in the input size and independent of .
The method used in our algorithm is inspired by the sequential local-ratio algorithm in [1]. To parallelize this sequential algorithm, we decompose the cost function into a series of cost functions depending on a varying parameter . The key trick is to let increase as a geometric progression, so that the cost decreases quickly and the number of rounds is bounded by a logarithm. In the sequential local ratio algorithm of [1], a set is added to the solution when its cost has decreased to zero. With the above trick, however, a cost might never reach zero. Our strategy is to add to the solution as soon as its cost is less than . This is where enters the approximation ratio and the running time.
Then we design a parallel algorithm for MinPPC with approximation ratio in rounds, where is the attenuation factor of power. Geometric properties play a crucial role in the analysis. This is the first parallel algorithm for MinPPC.
Parallel algorithm for MinPSC
Denote by a MinPSC instance. For any sub-collection of sets , denote by the set of elements covered by . The algorithm will guess the largest cost of a set in , where is an optimal solution. Suppose is the guessed set. The residual instance with respect to is , where , , for and . The algorithm is executed in parallel on machines, where . Each machine
Parallel algorithm for MinPPC
In this section, we present a parallel algorithm for the minimum power partial cover problem (MinPPC). In [18], Li et al. proposed a -approximation algorithm for MinPPC. They first guess the largest cost in . Then, for the residual instance, a feasible solution is obtained using the local ratio method. A maximal independent set is found in by a specific rule, and expanding the disks in this maximal independent set yields a feasible solution. The main purpose of this section is to parallelize
Conclusion
In this paper, we design a parallel algorithm for MinPSC that obtains a solution with approximation ratio at most in rounds, where is an arbitrary constant. For the minimum power partial cover problem (MinPPC), we design a parallel algorithm with approximation ratio in rounds, where is the attenuation factor of power. How to obtain a parallel algorithm with approximation ratio exactly might be an interesting topic.
Note that our method
Acknowledgments
This research work is supported in part by NSFC (11901533, U20A2068, 11771013), and ZJNSFC (LD19A010001).
References (28)
Using homogeneous weights for approximating the partial cover problem. J. Algorithms (2001).
et al. Efficient NC algorithms for set cover with applications to learning and geometry. J. Comput. Syst. Sci. (1994).
et al. Approximation algorithms for partial covering problems. J. Algorithms (2004).
et al. Vertex cover might be hard to approximate to within $2-\varepsilon$. J. Comput. Syst. Sci. (2008).
et al. A primal-dual parallel approximation technique applied to weighted set and vertex covers. J. Algorithms (1994).
et al. Minimum power partial multi-cover on a line. Theor. Comput. Sci. (2021).
Improved performance of the greedy algorithm for partial cover. Inf. Process. Lett. (1997).
et al. A distributed $(2+\epsilon)$-approximation for vertex cover in $O(\log \Delta / \epsilon \log\log \Delta)$ rounds. CoRR (2016).
et al. A constant-factor approximation for multi-covering with disks. Symposium on Computational Geometry (2013).
et al. On metric multi-covering problems. Comput. Geom. (2017).
Geometric clustering to minimize the sum of cluster sizes. ESA.
Linear-work greedy parallel approximate set cover and variants. SPAA.
Algorithms for facility location with outliers. SODA.
Greedy and local ratio algorithms in the MapReduce model. CoRR.