Elsevier

Information Sciences

Volume 238, 20 July 2013, Pages 205-211
Finding the minimum number of elements with sum above a threshold

https://doi.org/10.1016/j.ins.2013.02.035

Abstract

Motivated by wavelet compression techniques and their applications, we consider the following problem: given an unsorted array of numerical values and a threshold, what is the minimum number of elements that must be chosen from the array so that their sum is not less than the threshold? In this article, we first provide two linear-time algorithms for the problem. We then demonstrate the efficacy of these algorithms through experiments. Lastly, as an application of this research, we show that the construction of wavelet synopses under a prescribed error bound (in the L2 metric) can be solved in linear time.

Introduction

Given an unsorted list of numerical values A with n elements and a threshold ρ > 0, the error-bound sum selection (SelectSumρ) problem, as termed in this article, is to find S ⊆ A of minimum cardinality such that ∑_{s∈S} s ≥ ρ. That is, the task is to take the largest-valued elements of the array until their sum is equal to or greater than the threshold value. One obvious solution is to sort the sequence into decreasing numerical order and then output the largest remaining elements one by one until their sum is equal to or greater than the threshold value. Clearly, this takes O(n log n) time in the worst case. As the array A becomes very large, this naive approach can be inefficient in time and expensive for online stream data processing.
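The sort-and-accumulate baseline described above can be sketched in a few lines of Python. This is only an illustration: the function name `select_sum_naive` and the convention of returning `None` when the threshold cannot be reached are assumptions, not taken from the article.

```python
def select_sum_naive(values, rho):
    """Return the fewest elements whose sum is >= rho, or None if unreachable.

    Hypothetical helper: sorts in decreasing order, then accumulates.
    """
    chosen, running = [], 0
    for v in sorted(values, reverse=True):  # the O(n log n) sort dominates
        if running >= rho:
            break
        chosen.append(v)
        running += v
    return chosen if running >= rho else None
```

For example, with the values 3, 1, 4, 1, 5, 9, 2, 6 and ρ = 15, the sketch takes 9 and then 6 and stops, because 9 + 6 already reaches the threshold.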

The SelectSumρ problem is related to the Selection problem. The Selection problem, solved by Blum et al. [2] in 1972, is to find the kth largest element in an unsorted array. Using the divide-and-conquer technique, Blum et al. derived a linear-time algorithm by carefully selecting the pivot elements used for partitioning. This was an important complexity result because, until then, the Selection problem was assumed to be as difficult as sorting. A lower bound of 2n comparisons (in the worst case) for the median selection problem was given by Bent and John [1] in 1985. Refer to [3], [4] for detailed research on the Selection problem.
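To make the Selection problem concrete, here is a minimal quickselect sketch. It is not the algorithm of Blum et al.: it swaps their deterministic pivot rule for a random pivot, so it runs in linear time only in expectation. The name `kth_largest` is hypothetical.

```python
import random


def kth_largest(values, k):
    """Return the k-th largest element (k = 1 gives the maximum).

    Assumption: random pivots, so the running time is expected O(n),
    not the worst-case O(n) of the median-of-medians pivot rule.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k out of range")
    items = list(values)
    while True:
        pivot = random.choice(items)
        larger = [x for x in items if x > pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k <= len(larger):
            items = larger                      # answer lies above the pivot
        elif k <= len(larger) + equal_count:
            return pivot                        # answer is the pivot itself
        else:
            k -= len(larger) + equal_count      # skip everything >= pivot
            items = [x for x in items if x < pivot]
```

Each round discards the side of the partition that cannot contain the answer, which is what avoids the full sort.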

The SelectSumρ problem is fundamental to many practical applications. In Section 4, we use an example to illustrate its application to streaming data compression.

In this article, we explore the SelectSumρ problem and provide linear-time algorithms for it. We first show that the SelectSumρ problem is solvable in linear time by using the Selection algorithm in a straightforward way. We then provide a more involved algorithm for the problem. From this result, we conclude that error-bound wavelet compression under the squared error (L2) can be computed in linear time.

The rest of the paper is organized as follows. Section 2 presents the relevant concepts and the two algorithms for the SelectSumρ problem. Section 3 reports the experimental results for the proposed algorithms. Section 4 presents an application of the SelectSumρ problem to streaming data compression, and Section 5 concludes this article.

Algorithms on SelectSumρ

In this section, we introduce two new algorithms, SelectSum1 and SelectSum2, for solving the SelectSumρ problem. Both algorithms are based on the divide-and-conquer technique and recursively use the Selection algorithm. To simplify the study of this problem, we adopt the following notation and assumptions.

Let A = [a1, a2, …, an] be an array with cardinality |A| = n, where the ith element ai is denoted as A[i]. Clearly, the total sum of A, t = ∑_{i=1}^{n} A[i], can be obtained by one linear scan of A.
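The snippet above does not reproduce SelectSum1 or SelectSum2, so the following is only a sketch of how a partition-based approach could work, under stated assumptions. It picks a random pivot instead of calling a worst-case linear Selection routine, so its running time is linear in expectation only. The name `select_sum` and the `None` return for an unreachable threshold are hypothetical.

```python
import random


def select_sum(values, rho):
    """Fewest elements with sum >= rho (rho > 0), or None if unreachable.

    Assumption: a random-pivot partition stands in for the Selection
    algorithm, giving expected O(n) time rather than worst-case O(n).
    """
    # Only positive elements can help reach a positive threshold.
    items = [v for v in values if v > 0]
    if sum(items) < rho:          # total sum t from one linear scan
        return None
    chosen, need = [], rho
    while items:
        pivot = random.choice(items)
        larger = [x for x in items if x > pivot]
        sum_larger = sum(larger)
        if sum_larger >= need:
            items = larger        # the answer lies entirely above the pivot
            continue
        # Every element above the pivot is required.
        chosen.extend(larger)
        need -= sum_larger
        # Take copies of the pivot value one at a time.
        for x in items:
            if x == pivot:
                chosen.append(x)
                need -= x
                if need <= 0:
                    return chosen
        items = [x for x in items if x < pivot]
    return chosen
```

The design choice is that each round either discards everything at or below the pivot, or commits everything at or above it, so the working list shrinks geometrically in expectation and no full sort is needed.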

Experiments

In this section, we examine the time efficiency of the proposed algorithms. We compare the results with those achieved by the SelectSumNaive(A, ρ) algorithm, which first sorts the sequence into decreasing order and then outputs the largest remaining elements one by one until their sum is equal to or above the given threshold ρ.

All the algorithms are implemented in Matlab, and all the experiments are performed on an Intel i7 2.0 GHz machine running Mac OS 10.7 with 8 GB of memory. The

Motivating example

In this section, we briefly illustrate an application of the SelectSumρ problem to data compression.

Widely used in signal and image processing [6], the (Haar) wavelet technique has been extensively investigated [12] for data indexing, query optimization and approximation. More recently, this technique has been used to construct error-guaranteed data synopses for streaming data processing [5], [7], [8], [9], [13], most of which aim to minimize the maximum error (L∞).
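One way to see the connection between an L2 error bound and sum selection is sketched below. It is an assumption-laden illustration, not the article's construction: it uses an orthonormal Haar transform, for which the squared L2 reconstruction error equals the sum of the squared coefficients that are dropped. Keeping the fewest coefficients with error at most ε then amounts to choosing the fewest squared coefficients whose sum is at least (total energy − ε²). The names `haar_transform` and `l2_synopsis` are hypothetical, the input length is assumed to be a power of two, and a sort is used for brevity where a linear-time selection could be substituted.

```python
import math


def haar_transform(signal):
    """Orthonormal Haar transform; len(signal) is assumed a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2)
               for i in range(half)]
        det = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2)
               for i in range(half)]
        coeffs[:n] = avg + det    # averages first, then detail coefficients
        n = half
    return coeffs


def l2_synopsis(signal, eps):
    """Keep the fewest Haar coefficients so that the L2 error is <= eps."""
    coeffs = haar_transform(signal)
    # Threshold on retained energy: total energy minus the allowed error.
    rho = sum(c * c for c in coeffs) - eps * eps
    kept, energy = {}, 0.0
    # Sorting is used here for brevity only.
    for i in sorted(range(len(coeffs)), key=lambda j: -(coeffs[j] ** 2)):
        if energy >= rho:
            break
        kept[i] = coeffs[i]
        energy += coeffs[i] ** 2
    return kept                   # maps coefficient index -> retained value
```

For a constant signal such as 4, 4, 4, 4, all the energy sits in the single overall-average coefficient, so one retained coefficient already gives zero error.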

The basic idea of

Conclusions

In this article, we proposed two linear-time algorithms for the SelectSumρ problem by using ideas from [2] on the kth Selection problem. As a fundamental problem, the SelectSumρ problem has many real applications. Besides the application to data representation and compression mentioned in this article, the SelectSumρ problem can be used to select the least number of dominating points that satisfy certain conditions [10]. Our future work will consider how the SelectSum2 algorithm can be

Acknowledgments

The authors thank the reviewers for their comments on a draft of this article. The work reported in this article is partially supported by an ARC research grant (DP130103051) and the Ningbo Natural Science Foundation (Nos. 2012A610025, 2012A610060).

