Abstract
The Matroid Secretary Conjecture is a notorious open problem in online optimization. It claims the existence of an O(1)-competitive algorithm for the Matroid Secretary Problem (MSP). Here, the elements of a weighted matroid appear one-by-one, revealing their weight at appearance, and the task is to select elements online with the goal of obtaining an independent set of largest possible weight. O(1)-competitive MSP algorithms have so far only been obtained for restricted matroid classes and for MSP variations, including Random-Assignment MSP (RA-MSP), where an adversary fixes a number of weights equal to the ground set size of the matroid, which then get assigned randomly to the elements of the ground set. Unfortunately, these approaches heavily rely on knowing the full matroid upfront. This is an arguably undesirable requirement, and there are good reasons to believe that an approach towards resolving the MSP Conjecture should not rely on it. Thus, both Soto (SIAM Journal on Computing 42(1):178–211, 2013) and Oveis Gharan and Vondrák (Algorithmica 67(4):472–497, 2013) raised as an open question whether RA-MSP admits an O(1)-competitive algorithm even without knowing the matroid upfront. In this work, we answer this question affirmatively. Our result makes RA-MSP the first well-known MSP variant admitting an O(1)-competitive algorithm that neither needs to know the underlying matroid upfront nor imposes any restriction on it. Our approach is based on first approximately learning the rank-density curve of the matroid, which we then exploit algorithmically.
1 Introduction
The Matroid Secretary Problem (MSP), introduced by Babaioff et al. [1], is a natural and well-known generalization of the classical Secretary Problem [2], motivated by strong connections and applications in mechanism design. Formally, MSP is an online selection problem where we are given a matroid \(\mathcal {M}= (N, \mathcal {I})\) with elements of unknown weights \(w :N \rightarrow \mathbb {R}_{\ge 0}\) that appear one-by-one in uniformly random order. Whenever an element appears, it reveals its weight and one has to immediately and irrevocably decide whether to select it. The goal is to select a set of elements \(I \subseteq N\) that (i) is independent, i.e., \(I \in \mathcal {I}\), and (ii) has weight \(w(I) = \sum _{e \in I} w(e)\) as large as possible. The key challenge in the area is to settle the notorious Matroid Secretary Problem (MSP) Conjecture:
Conjecture 1.1
([1]) There is an O(1)-competitive algorithm for MSP.
The best-known procedures for MSP are \(O(\log \log ({{\,\textrm{rank}\,}}(\mathcal {M})))\)-competitive [3, 4], where \({{\,\textrm{rank}\,}}(\mathcal {M})\) is the rank of the matroid \(\mathcal {M}\), i.e., the cardinality of a largest independent set.
Whereas the MSP Conjecture remains open, extensive work in the field has led to constant-competitive algorithms for variants of the problem and restricted settings. This includes constant-competitive algorithms for specific classes of matroids [5,6,7,8,9,10,11,12,13]. Moreover, in terms of natural variations of the problem, Soto [8] showed that constant-competitiveness is achievable in the so-called Random-Assignment MSP, or RA-MSP for short. Here, an adversary chooses \(|N|\) weights, which are then assigned uniformly at random to the ground set elements N of the matroid. (Soto’s result was later extended by Oveis Gharan and Vondrák [14] to the setting where the arrival order of the elements is adversarial instead of uniformly random.) Constant-competitive algorithms also exist for the Free Order Model, where the algorithm can choose the order in which elements appear [9].
Intriguingly, a key aspect of prior advances on constant-competitive algorithms for special cases and variants of MSP is that they heavily rely on either knowing the full matroid \(\mathcal {M}\) upfront or on arriving elements revealing additional information about the matroid beyond their weights. This is also crucially exploited in Soto’s work on RA-MSP. In fact, if the matroid is not known upfront in full, there is no natural general variant of MSP (such as the Random Assignment and Free Order models mentioned above) for which a constant-competitive algorithm is known.
A high reliance on knowing the matroid \(\mathcal {M}= (N, \mathcal {I})\) upfront (except for its size \(|N|\)) is undesirable when trying to approach the MSP Conjecture, because it is easy to obstruct an MSP instance by adding zero-weight elements. Not surprisingly, all prior advances on the general MSP conjecture, like the above-mentioned \(O(\log \log ({{\,\textrm{rank}\,}}(\mathcal {M})))\)-competitive algorithms [3, 4] and also earlier procedures [1, 15], only need to know \(|N|\) upfront and make calls to an independence oracle on elements revealed so far. Thus, for RA-MSP, it was raised as an open question, both in [8] and [14], whether a constant-competitive algorithm exists without knowing the matroid upfront. The key contribution of this work is to affirmatively answer this question, making the random assignment setting the first MSP variant for which a constant-competitive algorithm is known without knowing the matroid and without any restriction on the underlying matroid.
Theorem 1.2
There is a constant-competitive algorithm for RA-MSP with only the cardinality of the matroid known upfront.
Moreover, our result holds in the more general adversarial order with a sample setting, where we are allowed to sample a random constant fraction of the elements and all remaining (non-sampled) elements arrive in adversarial order. This is also referred to as an order-oblivious algorithm in the MSP literature (see, e.g., [4]).
As mentioned, when the matroid is fully known upfront, an O(1)-competitive algorithm was known for RA-MSP even when the arrival order of all elements is adversarial [14]. Interestingly, for this setting it is known that, without knowing the matroid upfront, no constant-competitive algorithm exists. More precisely, a lower bound on the competitiveness of \(\Omega ({\log |N|}/{\log \log |N|})\) was shown in [14].
1.1 Organization of the paper
We start in Sect. 2 with a brief discussion on the role of (matroid) densities in the context of random assignment models, as our algorithm heavily relies on densities. Decomposing the matroid into parts of different densities has been central in prior advances on RA-MSP. However, this crucially relies on knowing the matroid upfront. We work with a rank-density curve, introduced in Sect. 3.1, which is also unknown upfront; however, we show that it can be learned approximately (in a well-defined sense) by observing a random constant fraction of the elements. Section 3 provides an outline of our approach based on rank-density curves and presents the main ingredients allowing us to derive Theorem 1.2. Section 4 takes a closer look at rank-density curves and shows some of their useful properties. Section 5 showcases the main technical tool that allows us to approximate the rank-density curve from a sample set. Finally, Sect. 6 combines the ingredients to present our final algorithm and its analysis.
We emphasize that we predominantly focus on providing a simple algorithm and analysis, refraining from optimizing the competitive ratio of our procedure at the cost of complicating the presentation.
We assume that all matroids are loopless, i.e., every element is independent by itself. This is without loss of generality, as loops can simply be ignored in matroid secretary problems.
2 Random-assignment MSP and densities
A main challenge in the design and analysis of MSP algorithms is how to protect heavier elements (or elements of an offline optimum) from being spanned by lighter ones that are selected earlier during the execution of the algorithm. In the random assignment setting, however, weights are assigned to elements uniformly at random, which allows for shifting the focus from protecting elements based on their weights to protecting elements based on their role in the matroid structure. Intuitively speaking, an element arriving in the future is at a higher risk of being spanned by the algorithm’s prior selection if it belongs to an area of the matroid of large cardinality and small rank (a “dense” area) than an area of small cardinality and large rank (a “sparse” area).
This is formally captured by the notion of density: the density of a set \(U \subseteq N\) in a matroid \(\mathcal {M}= (N, \mathcal {I})\) is \({|U|}/{r(U)}\), where \(r :2^{N} \rightarrow \mathbb {Z}_{\ge 0}\) is the rank function of \(\mathcal {M}\).
Densities play a crucial role in RA-MSP [8, 14]. Indeed, prior approaches decomposed \(\mathcal {M}\) into its principal sequence, which is the chain \(\emptyset \subsetneq S_{1} \subsetneq \ldots \subsetneq S_{k} = N\) of sets of decreasing densities obtained as follows. \(S_{1} \subseteq N\) is the densest set of \(\mathcal {M}\) (in case of ties it is the unique maximal densest set), \(S_{2}\) is the union of \(S_{1}\) and the densest set in the matroid obtained from \(\mathcal {M}\) after contracting \(S_{1}\), and so on until a set \(S_{k}\) is obtained with \(S_{k} = N\). Figure 1a shows an example of the principal sequence of a graphic matroid.
Fig. 1a shows a graph representing a graphic matroid together with its principal sequence \(\emptyset \subsetneq S_{1} \subsetneq \dots \subsetneq S_{7} = N\), where N are all edges of the graph. Figure 1b shows its rank-density curve. Each step in the rank-density curve (highlighted by a circle) corresponds to one \(S_{i}\) and has y-coordinate equal to the density of \(\mathcal {M}_{i} = \left. \left( \mathcal {M} / S_{i - 1}\right) \right| _{S_{i} {\setminus } S_{i - 1}}\) and x-coordinate equal to \(r(S_{i})\)
Previous approaches then considered, independently for each \(i \in [k] {:=}\{1, \dots , k\}\), the matroid \(\mathcal {M}_{i} {:=}\left. \left( \mathcal {M} / S_{i - 1}\right) \right| _{S_i{\setminus } S_{i - 1}}\), i.e., the matroid obtained from \(\mathcal {M}\) by first contracting \(S_{i - 1}\) and then restricting to \(S_{i} \setminus S_{i - 1}\). (By convention, we set \(S_{0} {:=}\emptyset \).) These matroids are also known as the principal minors of \(\mathcal {M}\). Given an independent set in each principal minor, their union is guaranteed to be independent in the original matroid \(\mathcal {M}\). Prior approaches (see, in particular, [8] for details) then exploited the following two key properties of the principal minors \(\mathcal {M}_{i}\):
-
(i)
\(\sum _{i = 1}^{k} \mathbb {E}[w({\textrm{OPT}}(\mathcal {M}_{i}))] = \Omega (\mathbb {E}[w({\textrm{OPT}}(\mathcal {M}))])\), where \({\textrm{OPT}}(\mathcal {M})\) (and analogously \({\textrm{OPT}}(\mathcal {M}_{i})\)) is an (offline) maximum weight independent set in \(\mathcal {M}\) and the expectation is over all random weight assignments.
-
(ii)
Each matroid \(\mathcal {M}_{i}\) is uniformly dense, which means that the (unique maximal) densest set in \(\mathcal {M}_{i}\) is the whole ground set of \(\mathcal {M}_{i}\).
Property (i) guarantees that, to obtain an O(1)-competitive procedure, it suffices to compare against the (offline) optima of the matroids \(\mathcal {M}_i\). Combining this with property (ii) implies that it suffices to design a constant-competitive algorithm for uniformly dense matroids. Since uniformly dense matroids behave in many ways very similarly to uniform matroids, which are a special case of uniformly dense matroids, it turns out that MSP on uniformly dense matroids admits a simple yet elegant O(1)-competitive algorithm. (See [8] for details.)
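For intuition, consider the special case of a uniform matroid \(U_{r,n}\), where any r elements form a basis. The following sketch shows one simple threshold rule in this spirit; it is an illustration under our own choice of parameters, not the algorithm of [8], which handles general uniformly dense matroids.

```python
import math

def uniform_matroid_secretary(arrivals, r):
    """Threshold rule for U_{r,n} (illustrative): observe the first half of
    the randomly ordered stream of weights, set the threshold to the
    ceil(r/2)-th largest sampled weight, then greedily accept weights above
    it, up to capacity r."""
    half = len(arrivals) // 2
    sample = arrivals[:half]
    k = min(math.ceil(r / 2), len(sample))
    threshold = sorted(sample, reverse=True)[k - 1] if sample else float("-inf")
    picked = []
    for w in arrivals[half:]:
        if len(picked) < r and w > threshold:
            picked.append(w)
    return picked
```

Rules of this flavor are known to be constant-competitive on uniform matroids; the algorithm of [8] achieves the analogous guarantee for every uniformly dense matroid.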
3 Outline of our approach
As discussed, prior approaches [8, 14] for RA-MSP heavily rely on knowing the matroid upfront, as they need to construct its principal sequence upfront. A natural approach would be to observe a sample set \(S \subseteq N\) containing a constant fraction of all elements and then try to mimic the existing approaches using the principal sequence of \(\left. \mathcal {M}\right| _{S}\), the matroid \(\mathcal {M}\) restricted to the elements in S. A main hurdle lies in how to analyze such a procedure as the principal sequence of \(\left. \mathcal {M}\right| _{S}\) can differ significantly from the one of \(\mathcal {M}\). In particular, one can construct matroids where it is likely that there are parts whose density is underestimated by a super-constant factor. Moreover, \(\left. \mathcal {M}\right| _{S}\) may have many different densities not present in \(\mathcal {M}\) (e.g., when \(\mathcal {M}\) is uniformly dense).
We overcome these issues by not dealing with principal sequences directly, but rather using what we call the rank-density curve of a matroid, which captures certain key parameters of the principal sequence. As we show, rank-density curves have three useful properties:
-
(i)
They provide a natural way to derive a quantity that both relates to the offline optimum and can be easily compared against to bound the competitiveness of our procedure.
-
(ii)
They can be learned approximately by observing a random \(\Omega (1)\)-fraction of N.
-
(iii)
Approximate rank-density curves can be used algorithmically to protect denser areas from sparser ones without having to know the matroid upfront.
Section 3.1 introduces rank-density curves and shows how they conveniently allow for deriving a quantity that compares against the offline optimum. Section 3.2 then discusses our results on approximately learning rank-density curves and how this can be exploited algorithmically.
3.1 Rank-density curves
Given a matroid \(\mathcal {M}= (N, \mathcal {I})\), one natural way to define its rank-density curve \(\rho _{\mathcal {M}} :\mathbb {R}_{> 0} \rightarrow \mathbb {R}_{\ge 0}\) is through its principal minors \(\mathcal {M}_{1}, \dots , \mathcal {M}_{k}\), which are defined through the principal sequence \(\emptyset \subsetneq S_{1} \subsetneq \dots \subsetneq S_{k} = N\) as explained in Sect. 2. For a value \(t \in (0, {{\,\textrm{rank}\,}}(\mathcal {M})]\), let \(i_{t} \in [k]\) be the smallest index such that \(r(S_{i_{t}}) \ge t\). The value \(\rho _{\mathcal {M}}(t)\) is then given by the density of \(\mathcal {M}_{i_{t}}\). (See Fig. 1b for an example.) In addition, we set \(\rho _{\mathcal {M}}(t) = 0\) for any \(t > {{\,\textrm{rank}\,}}(\mathcal {M})\).
A formally equivalent way to define \(\rho _{\mathcal {M}}\), which is more convenient for what we do later, is as follows. For any \(S \subseteq N\) and \(\lambda \in \mathbb {R}_{\ge 0}\), we define
$$D_{\mathcal {M}}(S, \lambda ) \subseteq S$$
to be the unique maximal maximizer of \(\max _{U \subseteq S}\{|U| - \lambda r(U)\}\). (A unique maximal maximizer exists because the function \(U \mapsto |U| - \lambda r(U)\) is supermodular, so the union of any two maximizers is again a maximizer.) It is well-known that each set in the principal sequence \(S_{1}, \dots , S_{k}\) is nonempty and of the form \(D_{\mathcal {M}}(N, \lambda )\) for \(\lambda \in \mathbb {R}_{\ge 0}\). This leads to the following way to define the rank-density curve, which is the one we use in what follows.
Definition 3.1
(rank-density curve) Let \(\mathcal {M}= (N, \mathcal {I})\) be a matroid. Its rank-density curve \(\rho _{\mathcal {M}} :\mathbb {R}_{> 0} \rightarrow \mathbb {R}_{\ge 0}\) is defined by
$$\rho _{\mathcal {M}}(t) {:=}{\left\{ \begin{array}{ll} \max \left\{ \lambda \in \mathbb {R}_{\ge 0} :r(D_{\mathcal {M}}(N, \lambda )) \ge t\right\} &{} \text {if } t \in (0, {{\,\textrm{rank}\,}}(\mathcal {M})],\\ 0 &{} \text {if } t > {{\,\textrm{rank}\,}}(\mathcal {M}). \end{array}\right. }$$
When the matroid \(\mathcal {M}\) is clear from context, we also simply write \(\rho \) instead of \(\rho _{\mathcal {M}}\) for its rank-density curve and \(D(N, \lambda )\) instead of \(D_{\mathcal {M}}(N, \lambda )\). Note that \(\rho \) is piecewise constant, left-continuous, and non-increasing. (See Fig. 1b for an example.) If \(\mathcal {M}\) is a uniformly dense matroid with density \(\lambda \), we have \(\rho (t) = \lambda \) for \(t\in (0,{{\,\textrm{rank}\,}}(\mathcal {M})]\) and \(\rho (t) = 0\) for \(t\in ({{\,\textrm{rank}\,}}(\mathcal {M}),\infty )\).
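To make these definitions concrete, the following brute-force sketch computes \(D_{\mathcal {M}}(N, \lambda )\) and the steps of \(\rho _{\mathcal {M}}\) for a small graphic matroid. This is illustrative code under our own naming, not part of the paper's algorithms; it enumerates all subsets and is only meant for tiny ground sets.

```python
from itertools import combinations

def graphic_rank(edges):
    """Rank of an edge set in a graphic matroid: each union-find merge
    corresponds to one independent edge."""
    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    rank = 0
    for u, v in edges:
        parent.setdefault(u, u); parent.setdefault(v, v)
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            rank += 1
    return rank

def dense_set(N, r, lam):
    """Brute-force D(N, lam): the unique maximal maximizer of |U| - lam*r(U)."""
    best_val, best = 0.0, ()
    for k in range(len(N) + 1):
        for U in combinations(N, k):
            val = len(U) - lam * r(U)
            if val > best_val + 1e-9 or (abs(val - best_val) <= 1e-9 and len(U) > len(best)):
                best_val, best = val, U
    return best

def rank_density_curve(N, r):
    """Steps [(rank threshold, density)] of rho_M, via
    rho(t) = max{lam : r(D(N, lam)) >= t}; candidate densities are |U|/r(U)."""
    cands = sorted({len(U) / r(U)
                    for k in range(1, len(N) + 1)
                    for U in combinations(N, k) if r(U) > 0}, reverse=True)
    steps, covered = [], 0
    for lam in cands:
        rk = r(dense_set(N, r, lam))
        if rk > covered:
            steps.append((rk, lam))  # rho(t) = lam for t in (covered, rk]
            covered = rk
    return steps

# Triangle plus a pendant edge: the triangle is the densest part.
edges = (("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"))
print(rank_density_curve(edges, graphic_rank))  # [(2, 1.5), (3, 1.0)]
```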
We now expand on how \(\rho _{\mathcal {M}}\) is related to the expected offline optimum value \(\mathbb {E}[w({\textrm{OPT}}(\mathcal {M}))]\) of an RA-MSP instance. To this end, we use the function \(\eta _w :[0, |N|] \rightarrow \mathbb {R}_{\ge 0}\) defined by
$$\eta _w(a) {:=}\mathbb {E}_{R \sim {{\,\mathrm{{\textrm{Unif}}}\,}}(N, \lfloor a \rfloor )}\left[ \max _{e \in R} w(e)\right] , \qquad \text {(2)}$$
where \({{\,\mathrm{{\textrm{Unif}}}\,}}(N, \lfloor a \rfloor )\) is a uniformly random set of \(\lfloor a \rfloor \) many elements out of N (without repetitions); and we set \(\eta _w(a)=0\) for \(a \in [0,1)\) (i.e., when the set R above is empty) by convention. In words, \(\eta _w(a)\) is the expected maximum weight out of \(\lfloor a \rfloor \) weights chosen uniformly at random from all the weights \(\{w_{e}\}_{e \in N}\). Based on this notion, we assign the following value \(F_w(\rho )\) to a rank-density curve \(\rho \):
$$F_w(\rho ) {:=}\int _{0}^{\infty } \eta _w(\rho (t)) \, dt = \int _{0}^{{{\,\textrm{rank}\,}}(\mathcal {M})} \eta _w(\rho (t)) \, dt, \qquad \text {(3)}$$
where the second equality holds because \(\rho (t) = 0\) for \(t> {{\,\textrm{rank}\,}}(\mathcal {M})\). Note that as the graph of \(\rho \) is a staircase, the above integral is just a finite sum. When the weights w are clear from the context, we usually write \(\eta \) instead of \(\eta _w\), and F instead of \(F_w\).
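For instance, \(\eta _w\) and \(F_w\) can be estimated as follows (Monte-Carlo code for illustration only; the staircase format matches rank_density_curve above):

```python
import random

def eta(weights, a, trials=2000):
    """Monte-Carlo estimate of eta_w(a): expected maximum of floor(a) weights
    drawn uniformly without replacement from the full weight vector."""
    k = min(int(a), len(weights))
    if k < 1:
        return 0.0
    return sum(max(random.sample(weights, k)) for _ in range(trials)) / trials

def F(weights, steps):
    """F_w(rho) for a staircase rho given as [(rank threshold, density)] steps:
    the integral of eta_w(rho(t)) dt collapses to a finite sum over the steps."""
    total, prev = 0.0, 0
    for rk, lam in steps:
        total += (rk - prev) * eta(weights, lam)
        prev = rk
    return total
```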
One key property of the function F is that it can be used as a proxy for the expected value of the offline optimum. More precisely, the statement below shows that \(F(\rho )\) is at most a constant factor smaller than the offline optimum—see Sect. 4 for proof details. This is the direction we need to be able to compare the output of our algorithm against the offline optimum. Moreover, \(F(\rho )\) is also no more than a constant factor larger, which we do not need for our derivations; however, this is in particular also a consequence of the fact that our algorithm returns an independent set of expected weight \(\Omega (F(\rho ))\). Lemma 3.2 is phrased in a slightly more general form that allows for applying it not just to the original matroid, but also to any minors thereof, which we need later.
Lemma 3.2
Let \(w_1,\dots , w_n\in \mathbb {R}_{\ge 0}\) be n weights, and let \(\mathcal {M}\) be a matroid with a ground set of size \(k\le n\). Assume we first choose a uniformly random subset of k weights among \(w_1,\dots , w_n\), and then assign these weights uniformly at random to the elements of \(\mathcal {M}\). Then \(\mathbb {E}[w({\textrm{OPT}}(\mathcal {M}))] \le \frac{3e}{e - 1} \cdot F(\rho _{\mathcal {M}})\).
Thus, to be constant-competitive, it suffices to provide an algorithm returning an independent set of expected weight \(\Omega (F(\rho ))\).
3.1.1 RA-MSP subinstances
We will often work with minors of the matroid that is originally given in our RA-MSP instance, and apply certain results to such minors instead of the original matroid. To avoid confusion, we fix throughout the paper one RA-MSP instance with matroid \(\mathcal {M}_{\textrm{orig}} = (N_{\textrm{orig}}, \mathcal {I}_{\textrm{orig}})\), whose ground set size we denote by \(n{:=}|N_{\textrm{orig}}|\), and whose elements have unknown but (adversarially) fixed weights \(w :N_{\textrm{orig}} \rightarrow \mathbb {R}_{\ge 0}\), and our goal is to design an O(1)-competitive algorithm for this one instance. The weights w of the original instance are the only weights we consider, even when working with RA-MSP subinstances on minors of \(\mathcal {M}_{\textrm{orig}}\), as their elements also obtain their weights uniformly at random from w. In particular, the function F as defined in (3) is always defined with respect to the original vector of n weights w.
To formally describe the type of matroids we get as subinstances, we introduce the notion of a matroid with w-sampled weights. More precisely, if \(\mathcal {M}\) is a matroid with a ground set size of \(k\le n\), then \(\mathcal {M}\) with w-sampled weights is a randomly weighted version of the matroid, where we pick a uniform subset of k among the n entries of w and assign them uniformly at random to the ground set of \(\mathcal {M}\). Clearly, any minor of \(\mathcal {M}_{\textrm{orig}}\), with the weights it inherits from the original random assignment, is of this type.
Even though we may have \(k<n\), a matroid \(\mathcal {M}\) with w-sampled weights can be interpreted as an RA-MSP instance, as it corresponds to the adversary first choosing uniformly at random a subset of k weights among the weights in w, which then get assigned uniformly at random to the elements.
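The sampling process itself is straightforward; the following short sketch (illustrative, with our own naming) makes it explicit:

```python
import random

def w_sampled_weights(w, elements):
    """Matroid with w-sampled weights: draw |elements| of the n weights in w
    uniformly without replacement and assign them to the elements uniformly
    at random (random.sample already returns them in random order)."""
    return dict(zip(elements, random.sample(w, len(elements))))
```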
3.2 Proof plan for Theorem 1.2 via rank-density curves
We now expand on how one can learn an approximation \(\tilde{\rho }\) of the rank-density curve \(\rho _{\mathcal {M}_{\textrm{orig}}}\) and how this can be exploited algorithmically to return an independent set of expected weight \(\Omega (F(\rho _{\mathcal {M}_{\textrm{orig}}}))\), which by Lemma 3.2 implies O(1)-competitiveness of the procedure. To this end, we start by formalizing the notion of an approximate rank-density curve, which relies on the notion of downshift.
Definition 3.3
Let \(\rho :\mathbb {R}_{> 0} \rightarrow \mathbb {R}_{\ge 0}\) be a non-increasing function and let \(\alpha , \beta \in \mathbb {R}_{\ge 1}\). The \((\alpha , \beta )\)-downshift \(\rho ':\mathbb {R}_{> 0} \rightarrow \mathbb {R}_{\ge 0}\) of \(\rho \) is defined via an auxiliary function \(\phi :\mathbb {R}_{> 0} \rightarrow \mathbb {R}_{\ge 0}\) as follows:
$$\phi (t) {:=}\frac{\rho (\alpha \cdot \max \{t, 1\})}{\beta } \qquad \text {and} \qquad \rho '(t) {:=}{\left\{ \begin{array}{ll} 1 &{} \text {if } \phi (t) \in (0, 1),\\ \phi (t) &{} \text {otherwise.} \end{array}\right. }$$
Moreover, a function \(\tilde{\rho }:\mathbb {R}_{> 0} \rightarrow \mathbb {R}_{\ge 0}\) is called an \((\alpha , \beta )\)-approximation of \(\rho \) if it is non-increasing and \(\rho '\le \tilde{\rho }\le \rho \), where \(\rho '\) is the \((\alpha , \beta )\)-downshift of \(\rho \).
One helpful way to think about an \((\alpha ,\beta )\)-downshift is as a slightly modified version of \({\rho (\alpha \cdot t)}/{\beta }\). This is also where the name stems from, as \({\rho (\alpha \cdot t)}/{\beta }\) corresponds to shifting, when thinking in doubly logarithmic scale, the function \(\rho \) to the left and down, by factors corresponding to \(\alpha \) and \(\beta \), respectively. This function is then modified in two ways. First, for \(t\in (0,1]\), we lower its value to \({\rho (\alpha )}/{\beta }\). This is done because we are not able to accurately estimate densities for low ranks. Fortunately, this turns out not to be an issue for obtaining a constant-competitive algorithm, because we can offset the loss of this modification by running, with some probability, the classical single secretary algorithm, which returns the heaviest element with constant probability. (This is in particular implied by Lemma 3.4 below and discussed right after.) The second modification is that we round up values in (0, 1) to 1. This reflects the fact that density values are always at least one.
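Evaluating a downshift of a staircase curve is simple once the curve is stored as steps; the following sketch (illustrative code matching the reconstruction of Definition 3.3 above) does exactly this:

```python
def downshift_value(steps, alpha, beta, t):
    """Evaluate the (alpha, beta)-downshift at t > 0 of a staircase curve
    given as [(rank threshold, density)] steps: phi(t) = rho(alpha*max(t,1))/beta,
    with values in (0, 1) rounded up to 1 (densities are always >= 1)."""
    def rho(s):
        return next((lam for rk, lam in steps if s <= rk), 0.0)
    phi = rho(alpha * max(t, 1.0)) / beta
    return 1.0 if 0.0 < phi < 1.0 else phi
```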
One issue when working with an (O(1), O(1))-approximation \(\tilde{\rho }\) of \(\rho \) is that \(F(\tilde{\rho })\) may be more than a constant factor smaller than \(F(\rho )\), and we thus cannot compare against \(F(\tilde{\rho })\) to obtain an O(1)-competitive procedure. This happens due to the way \((\alpha ,\beta )\)-downshifts are defined; more precisely, because the values for \(t\in (0,1]\) are rounded down to \({\rho (\alpha )}/{\beta }\). However, as the following lemma shows, also in this case we can obtain a simple lower bound for the value \(F(\tilde{\rho })\) in terms of \(F(\rho )\) and the largest weight \(w_{\max }\) in w — a proof of the statement can be found at the end of Sect. 4.
Lemma 3.4
Let \(\mathcal {M}\) be a matroid with w-sampled weights, let \(\alpha ,\beta \in \mathbb {R}_{\ge 1}\), and let \(\tilde{\rho }\) be an \((\alpha , \beta )\)-approximation of \(\rho =\rho _{\mathcal {M}}\). Then \(F(\rho ) \le 2 \alpha \beta F(\tilde{\rho }) + \alpha w_{\max }\).
A key implication of Lemma 3.4 is that it suffices to obtain an algorithm that returns an independent set of expected weight \(\Omega (F(\tilde{\rho }))\) for some (O(1), O(1))-approximation \(\tilde{\rho }\) of \(\rho _{\mathcal {M}_{\textrm{orig}}}\). Indeed, Lemma 3.4 then implies \(F(\tilde{\rho }) = \Omega (F(\rho _{\mathcal {M}_{\textrm{orig}}})) - O(w_{\max })\). By running this algorithm with some probability (say 0.5) and otherwise Dynkin’s [2] classical secretary algorithm, which picks the heaviest element with constant probability, an overall algorithm is obtained that returns an independent set of expected weight \(\Omega (F(\rho _{\mathcal {M}_{\textrm{orig}}}))\). Hence, Lemma 3.4 helps to provide bounds on the competitiveness of algorithms that are competitive with the F-value of an approximate rank-density curve. This technique is also used in the following key statement, which shows that an algorithm with strong guarantees can be obtained if we are given an (O(1), O(1))-approximation of the rank-density curve of the matroid on which we work—see Sect. 6 for the proof.
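For completeness, here is a compact sketch of Dynkin's rule (the standard textbook algorithm; list-based for simplicity):

```python
import math

def dynkin_single_secretary(arrivals):
    """Dynkin's classical secretary rule: observe the first floor(n/e)
    arrivals, then select the first weight beating all of them. For large n
    this returns the maximum weight with probability roughly 1/e."""
    n = len(arrivals)
    cutoff = math.floor(n / math.e)
    best_seen = max(arrivals[:cutoff], default=float("-inf"))
    for w in arrivals[cutoff:]:
        if w > best_seen:
            return w  # first record after the observation phase
    return None  # may happen when the maximum arrives during observation
```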
Theorem 3.5
Let \(\mathcal {M}\) be a matroid with w-sampled weights, and let \(\rho _{\mathcal {M}}\) denote the rank-density curve of \(\mathcal {M}\). Assume we are given an \((\alpha , \beta )\)-approximation \(\tilde{\rho }\) of \(\rho _{\mathcal {M}}\) for integers \(\alpha \ge 24\) and \(\beta \ge 3\). Then there is an efficient procedure \({\textrm{ALG}}(\tilde{\rho }, \alpha , \beta )\) that, when run on the RA-MSP subinstance given by \(\mathcal {M}\), returns an independent set I of \(\mathcal {M}\) of expected weight at least \(\left( \tfrac{1}{1440 e \alpha ^{2} \beta ^{2}}\right) \left( F(\rho _{\mathcal {M}})- \alpha ^{2} w_{\max } \right) \).
The last main ingredient of our approach is to show that such an accurate proxy \(\tilde{\rho }\) can be computed with constant probability. More precisely, we show that, after observing a sample set S containing every element of \(N_{\textrm{orig}}\) independently with probability \({1}/{2}\), the rank-density curve of (the observed) \(\left. \mathcal {M}_{\textrm{orig}}\right| _{S}\)
-
Is close to the rank-density curve of \(\left. \mathcal {M}_{\textrm{orig}}\right| _{N_{\textrm{orig}} \setminus S}\), allowing us to use \(\rho _{\left. \mathcal {M}_{\textrm{orig}}\right| _{S}}\) as desired proxy for the RA-MSP subinstance given by \(\left. \mathcal {M}_{\textrm{orig}}\right| _{N_{\textrm{orig}} \setminus S}\), and
-
Is close to the rank-density curve of \(\mathcal {M}_{\textrm{orig}}\), which allows for relating the offline optimum of the RA-MSP subinstance given by \(\left. \mathcal {M}_{\textrm{orig}}\right| _{N_{\textrm{orig}} \setminus S}\) to the one of \(\mathcal {M}_{\textrm{orig}}\).
We highlight that the next result is purely structural and hence independent of weights or the MSP setting. See Sect. 5 for details.
Theorem 3.6
Let \(\mathcal {M}=(N, \mathcal {I})\) be a matroid and \(S \subseteq N\) be a random set containing every element of N independently with probability \({1}/{2}\). Then, with probability at least \({1}/{100}\), \(\rho _{\left. \mathcal {M}\right| _{S}}\) and \(\rho _{\left. \mathcal {M}\right| _{N {\setminus } S}}\) are both (288, 9)-approximations of \(\rho _{\mathcal {M}}\).
Combining the above results, we get the desired O(1)-competitive algorithm.
Proof of Theorem 1.2
For brevity, let \(\mathcal {M}{:=}\mathcal {M}_{\textrm{orig}}\) and \(N {:=}N_{\textrm{orig}}\) throughout this proof. Recall that by Lemma 3.2, it suffices to provide an algorithm returning an independent set of expected weight \(\Omega (F(\rho _{\mathcal {M}}))\). Consider the following procedure: First observe (without picking any element) a set \(S \subseteq N\) containing every element of N independently with probability \({1}/{2}\) and let \(\tilde{\rho }\) denote the (288, 9)-downshift of \(\rho _{\left. \mathcal {M}\right| _{S}}\). Then run the algorithm described in Theorem 3.5 on \(\left. \mathcal {M}\right| _{N {\setminus } S}\) with \(\tilde{\rho }\) as the approximate rank-density curve. Let I denote the output of the above procedure and let \(\mathcal {A}\) be the event defined in Theorem 3.6, that is,
$$\mathcal {A} {:=}\left\{ S \subseteq N :\rho _{\left. \mathcal {M}\right| _{S}} \text { and } \rho _{\left. \mathcal {M}\right| _{N \setminus S}} \text { are both } (288, 9)\text {-approximations of } \rho _{\mathcal {M}}\right\} .$$
A key property we exploit is that for any \(S\in \mathcal {A}\), the curve \(\tilde{\rho }\) is a \((288^2, 9^2)\)-approximation of \(\rho _{\left. \mathcal {M}\right| _{N\setminus S}}\), due to the following. First, because \(\tilde{\rho }\) is the (288, 9)-downshift of \(\rho _{\left. \mathcal {M}\right| _{S}} \le \rho _{\mathcal {M}}\), and \(\rho _{\left. \mathcal {M}\right| _{N \setminus S}}\) is a (288, 9)-approximation of \(\rho _{\mathcal {M}}\), we have \(\rho _{\left. \mathcal {M}\right| _{N \setminus S}} \ge \tilde{\rho }\). Moreover, the approximation parameters \((288^{2}, 9^{2})\) follow by using that the \((\alpha _{2}, \beta _{2})\)-downshift of the \((\alpha _{1}, \beta _{1})\)-downshift of some rank-density function is an \((\alpha _{1} \alpha _{2}, \beta _{1} \beta _{2})\)-approximation of that rank-density function — see Lemma 4.3 for a proof of this property. This property can be applied as follows. Let \(\bar{\rho }\) be the (288, 9)-downshift of \(\rho _{\left. \mathcal {M}\right| _{N\setminus S}}\le \rho _{\mathcal {M}}\). Because \(\rho _{\left. \mathcal {M}\right| _{S}}\) is a (288, 9)-approximation of \(\rho _{\mathcal {M}}\), we have \(\bar{\rho }\le \rho _{\left. \mathcal {M}\right| _{S}}\). Hence, because \(\tilde{\rho }\) is the (288, 9)-downshift of \(\rho _{\left. \mathcal {M}\right| _{S}}\), it lies above the (288, 9)-downshift of \(\bar{\rho }\), where \(\bar{\rho }\) is itself the (288, 9)-downshift of \(\rho _{\left. \mathcal {M}\right| _{N\setminus S}}\). Thus, by the above property, we obtain that \(\tilde{\rho }\) is a \((288^2,9^2)\)-approximation of \(\rho _{\left. \mathcal {M}\right| _{N{\setminus } S}}\) as claimed.
Using this fact, we obtain for any fixed \(S\in \mathcal {A}\)
$$\mathbb {E}\left[ w(I) \,\middle |\, S\right] \ge \frac{1}{1440 e \cdot 288^{4} \cdot 9^{4}} \left( F\big (\rho _{\left. \mathcal {M}\right| _{N \setminus S}}\big ) - 288^{4} \, w_{\max } \right) \ge \frac{1}{1440 e \cdot 288^{4} \cdot 9^{4}} \left( \frac{F(\rho _{\mathcal {M}}) - 288 \, w_{\max }}{2 \cdot 288 \cdot 9} - 288^{4} \, w_{\max } \right) ,$$
where the first inequality follows from Theorem 3.5 (applied with \(\alpha = 288^{2}\) and \(\beta = 9^{2}\)) and the fact that \(\tilde{\rho }\) is a \((288^2,9^2)\)-approximation of \(\rho _{\left. \mathcal {M}\right| _{N{\setminus } S}}\) as discussed above, while the second inequality follows from Lemma 3.4 and the fact that, for every \(S \in \mathcal {A}\), the curve \(\rho _{\left. \mathcal {M}\right| _{N \setminus S}}\) is a (288, 9)-approximation of \(\rho _{\mathcal {M}}\). Moreover, the first inequality uses that conditioning on any fixed \(S \in \mathcal {A}\) does not have any impact on the uniform assignment of the weights w to the elements. This holds because the event \(\mathcal {A}\) only depends on the sampled elements S but not the weights of its elements. Hence, the RA-MSP subinstance given by \(\left. \mathcal {M}\right| _{N\setminus S}\) on which we use the algorithm described in Theorem 3.5 indeed assigns weights of w uniformly at random to elements, as required. It then follows that the output of the above procedure satisfies
$$\mathbb {E}\left[ w(I)\right] \ge \Pr [\mathcal {A}] \cdot \min _{S \in \mathcal {A}} \mathbb {E}\left[ w(I) \,\middle |\, S\right] \ge \frac{1}{100} \cdot \frac{1}{1440 e \cdot 288^{4} \cdot 9^{4}} \left( \frac{F(\rho _{\mathcal {M}}) - 288 \, w_{\max }}{2 \cdot 288 \cdot 9} - 288^{4} \, w_{\max } \right) = \Omega \left( F(\rho _{\mathcal {M}})\right) - O(w_{\max }),$$
where the second inequality uses that \(\Pr [\mathcal {A}] \ge {1}/{100}\) by Theorem 3.6.
Since running the classical secretary algorithm on \(\mathcal {M}_{\textrm{orig}}\) returns an independent set of expected weight at least \({w_{\max }}/{e}\), by running the procedure described above with probability \({1}/{2}\), and running the classical secretary algorithm otherwise, we return an independent set of expected weight at least
$$\frac{1}{2} \left( \Omega \left( F(\rho _{\mathcal {M}})\right) - O(w_{\max }) \right) + \frac{1}{2} \cdot \frac{w_{\max }}{e} = \Omega \left( F(\rho _{\mathcal {M}})\right) = \Omega \left( \mathbb {E}\left[ w({\textrm{OPT}}(\mathcal {M}))\right] \right) ,$$
where the first equality uses that the \({w_{\max }}/{(2e)}\) term dominates the \(O(w_{\max })\) loss whenever \(F(\rho _{\mathcal {M}}) = O(w_{\max })\), and the last equality follows from Lemma 3.2. \(\square \)
4 Rank-density curves and their properties
In this section we take a closer look at rank-density curves, and prove Lemmas 3.2 and 3.4. We start by stating some useful properties of the function \(\eta \).
Lemma 4.1
Let \(\eta \) be as defined in (2). Then
-
(i)
\(\eta \) is non-decreasing.
-
(ii)
\(\eta (ah) \le 2a \eta (h)\) for all \(a \ge 1\) and \(h \in [1, \frac{n}{a}]\).
-
(iii)
Let \(X \sim B(m, p)\) with \(1 \le mp \le m \le n\). Then \(\mathbb {E}[\eta (X)] \le 3 \eta (mp)\).
Proof
-
(i)
This follows immediately from the definition of \(\eta \).
-
(ii)
Consider a consecutive numbering of the n entries of the weight vector \(w\in \mathbb {R}^n\) in non-decreasing order, i.e., \(w_1 \le w_2 \le \cdots \le w_n\). For each \(k, i \in [n]\) let p(k, i) denote the probability that among k samples, element i is the heaviest and has the smallest index among the heaviest elements (in case of ties). In other words, p(k, i) is the probability that \(w_{i}\) is the heaviest weight out of k sampled ones and none of \(w_{j}\) with \(j < i\) are in the sample. Thus for \(i < k\) we have \(p(k, i) = 0\) and for \(i \ge k\) we have
$$ p(k, i) = \frac{\left( {\begin{array}{c}i - 1\\ k - 1\end{array}}\right) }{\left( {\begin{array}{c}n\\ k\end{array}}\right) } = k \frac{(i - 1)! (n - k)!}{n! (i - k)!} = \frac{k}{n} \prod _{j = 1}^{k - 1} \frac{i - j}{n - j}. $$Next, note that for all \(h \in [1, \frac{n}{a}]\) we have
$$\begin{aligned} \eta (h)&= \sum _{i = 1}^{n} p(\lfloor h \rfloor , i) w_{i}, {\text { and}}\\ \eta (ah)&= \sum _{i = 1}^{n} p(\lfloor ah \rfloor , i) w_{i}. \end{aligned}$$Our goal is to show that \(p(\lfloor ah \rfloor , i) \le 2 a p(\lfloor h \rfloor , i)\) for all \(i \in [n]\), which is sufficient to prove the desired inequality. To this end, note that for \(i < \lfloor ah \rfloor \) we have \(p(\lfloor ah \rfloor , i) = 0 \le a p(\lfloor h \rfloor , i)\), and for \(i \ge \lfloor ah \rfloor \) we have
$$\begin{aligned} p(\lfloor ah \rfloor , i)&= \frac{\lfloor ah \rfloor }{n} \prod _{j = 1}^{\lfloor ah \rfloor - 1} \frac{i - j}{n - j} \le \frac{\lfloor ah \rfloor }{n} \prod _{j = 1}^{\lfloor h \rfloor - 1} \frac{i - j}{n - j} \\&\le 2a \cdot \frac{\lfloor h \rfloor }{n} \prod _{j = 1}^{\lfloor h \rfloor - 1} \frac{i - j}{n - j} = 2a p(\lfloor h \rfloor , i). \end{aligned}$$Here the first inequality follows from \(\lfloor h \rfloor \le \lfloor ah \rfloor \) and the fact that \(\frac{i - j}{n - j} \le 1\) for any \(j < n\), and the second inequality uses that \(\lfloor ah \rfloor \le ah \le 2a \lfloor h \rfloor \) because \(h \le 2 \lfloor h \rfloor \) for \(h \in \mathbb {R}_{\ge 1}\).
-
(iii)
Note that \(\eta (h) \le {2\,h}/{mp} \cdot \eta (mp)\) for all \(h \in [mp, n]\) by property (ii). Therefore,
$$\begin{aligned} \mathbb {E}\left[ \eta (X)\right]&= \sum _{h = 0}^{\lfloor mp \rfloor } \eta (h) \cdot \Pr \left[ X = h\right] + \sum _{h = \lfloor mp \rfloor + 1}^{m} \eta (h) \cdot \Pr \left[ X = h\right] \\&\le \eta (mp) \cdot \Pr \left[ X \le \lfloor mp \rfloor \right] + \sum _{h = \lfloor mp \rfloor + 1}^{m} 2 \frac{h}{mp} \cdot \eta (mp) \cdot \Pr \left[ X = h\right] \\&= \eta (mp) \cdot \Pr \left[ X \le \lfloor mp \rfloor \right] + 2 \frac{\eta (mp)}{mp} \cdot \sum _{h = \lfloor mp \rfloor + 1}^{m} h \cdot \Pr \left[ X = h\right] \\&\le \eta (mp) + 2 \frac{\eta (mp)}{mp} \cdot \mathbb {E}\left[ X\right] = 3\eta (mp). \end{aligned}$$
\(\square \)
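As a quick numerical sanity check of property (ii), one can compare Monte-Carlo estimates using the eta sketch from Sect. 3.1 (illustrative only; estimates carry sampling noise):

```python
import random

random.seed(0)
weights = [random.random() ** 3 for _ in range(50)]  # a skewed weight vector, n = 50
h, a = 4, 3
print(eta(weights, a * h), "<=", 2 * a * eta(weights, h))  # property (ii)
```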
In order to prove Lemma 3.2, we use the following result from [8], which relies on the notion of the random partition matroid \({\mathcal {P}} = (N, \mathcal {I}')\) associated to a given matroid \(\mathcal {M}= (N, \mathcal {I})\). The former is constructed as follows. First, every element \(e \in N\) is assigned to one of \({{\,\textrm{rank}\,}}(\mathcal {M})\) many classes \(P_{1}, \dots , P_{{{\,\textrm{rank}\,}}(\mathcal {M})}\) uniformly at random and independently of each other. A set \(S \subseteq N\) is then independent in \({\mathcal {P}}\) if it contains at most one element from each class, i.e., \(S \in \mathcal {I}'\) if \(|S \cap P_{i}| \le 1\) for every \(i \in [{{\,\textrm{rank}\,}}(\mathcal {M})]\). The next result relates the value of the offline optimum of \(\mathcal {M}\) with that of the random partition matroids associated to the principal minors of \(\mathcal {M}\).
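Since a partition matroid's optimum simply takes the heaviest element of every nonempty class, its value is easy to compute; the following sketch (illustrative, with our own naming) mirrors the construction:

```python
import random

def random_partition_opt(elem_weights, rank):
    """OPT of the random partition matroid associated to a matroid of the
    given rank: throw each element into one of `rank` classes uniformly at
    random; an optimal independent set picks the heaviest element of every
    nonempty class."""
    classes = {}
    for e, w in elem_weights.items():
        classes.setdefault(random.randrange(rank), []).append(w)
    return sum(max(ws) for ws in classes.values())
```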
Lemma 4.2
([8, Lemma 4.2]) Let \((\mathcal {M}, w)\) be a random-assignment MSP instance. Let \(\{\mathcal {M}_{i}\}_{i = 1}^{k}\) be the principal minors of \(\mathcal {M}\) and let \({\mathcal {P}}_{i}\) denote the random partition matroid associated to each \(\mathcal {M}_{i}\), respectively. Then
$$\mathbb {E}\left[ w({\textrm{OPT}}(\mathcal {M}))\right] \le \frac{e}{e - 1} \sum _{i = 1}^{k} \mathbb {E}\left[ w({\textrm{OPT}}({\mathcal {P}}_{i}))\right] ,$$
where the expectations are taken with respect to the random weight assignment and the random partitioning (the latter only applies to the right-hand side expectation).
Now we are ready to prove Lemma 3.2.
Proof of Lemma 3.2
Let \(\mathcal {M}' {:=}\mathcal {M}\) denote the matroid of the statement with its w-sampled weights, let \((\mathcal {M}'_{i})_{i = 1}^{k}\) be the principal minors of \(\mathcal {M}'\), and let \(n'_{i}\) and \(r'_{i}\) denote the cardinality and rank of each \(\mathcal {M}'_{i}\), respectively. Additionally, for every \(i \in [k]\), let \({\mathcal {P}}'_{i}\) be the random partition matroid associated to \(\mathcal {M}'_{i}\) and let \((P'_{i, j})_{j = 1}^{r'_{i}}\) denote the (random) partitions of \({\mathcal {P}}'_{i}\). By Lemma 4.2 we have
$$\mathbb {E}\left[ w({\textrm{OPT}}(\mathcal {M}'))\right] \le \frac{e}{e - 1} \sum _{i = 1}^{k} \mathbb {E}\left[ w({\textrm{OPT}}({\mathcal {P}}'_{i}))\right] .$$
Next, note that for every \(i \in [k]\) we have
$$\mathbb {E}\left[ w({\textrm{OPT}}({\mathcal {P}}'_{i}))\right] = \sum _{j = 1}^{r'_{i}} \mathbb {E}\left[ \max _{e \in P'_{i, j}} w(e)\right] = \sum _{j = 1}^{r'_{i}} \mathbb {E}\left[ \eta _{w}(|P'_{i, j}|)\right] = r'_{i} \cdot \mathbb {E}\left[ \eta _{w}(X)\right] \quad \text {with } X \sim B(n'_{i}, 1/r'_{i}),$$
where the first equality follows from the definition of \({\mathcal {P}}'_{i}\) and linearity of expectation, the second one uses the definition of \(\eta _{w}\), and the third one holds since \(|P'_{i, j}| \sim B(n'_{i}, 1 / r'_{i})\) for every \(j \in [r'_{i}]\). Thus, by applying Lemma 4.1 (iii) and using that by construction every \(\mathcal {M}'_{i}\) is uniformly dense with density \(\lambda '_{i} = n'_{i} / r'_{i}\), we get:
$$\mathbb {E}\left[ w({\textrm{OPT}}({\mathcal {P}}'_{i}))\right] \le 3 r'_{i} \cdot \eta _{w}\left( \tfrac{n'_{i}}{r'_{i}}\right) = 3 r'_{i} \cdot \eta _{w}(\lambda '_{i}).$$
Hence
$$\mathbb {E}\left[ w({\textrm{OPT}}(\mathcal {M}'))\right] \le \frac{e}{e - 1} \sum _{i = 1}^{k} 3 r'_{i} \cdot \eta _{w}(\lambda '_{i}) = \frac{3e}{e - 1} \cdot F_{w}(\rho _{\mathcal {M}'}),$$
where the equality holds by definition of \(F_{w}\). \(\square \)
The following result shows that, simply put, in terms of Definition 3.3, an approximation of an approximation of a function is also an approximation of the original function, where the approximation parameters \(\alpha \) and \(\beta \) are multiplied.
Lemma 4.3
Let \(\rho _{1}, \rho _{2}, \rho _{3} :\mathbb {R}_{> 0} \rightarrow \mathbb {R}_{\ge 0}\) be non-increasing functions such that \(\rho _{2}\) is an \((\alpha _{1}, \beta _{1})\)-approximation of \(\rho _{1}\) and \(\rho _{3}\) is an \((\alpha _{2}, \beta _{2})\)-approximation of \(\rho _{2}\) for some parameters \(\alpha _{1}, \beta _{1}, \alpha _{2}, \beta _{2} \in \mathbb {R}_{\ge 1}\). Then \(\rho _{3}\) is an \((\alpha _{1} \alpha _{2}, \beta _{1} \beta _{2})\)-approximation of \(\rho _{1}\).
Proof
Let \(\rho _{1}'\) be the \((\alpha _{1}, \beta _{1})\)-downshift of \(\rho _{1}\) and let \(\phi _{1}\) be the auxiliary function used to construct \(\rho _{1}'\) in Definition 3.3. Similarly, let \(\rho _{2}'\) and \(\phi _{2}\) be the \((\alpha _{2}, \beta _{2})\)-downshift of \(\rho _{2}\) and the corresponding auxiliary function, respectively, and let \(\rho _{1}''\) and \(\phi _{1}'\) be the \((\alpha _{1} \alpha _{2}, \beta _{1} \beta _{2})\)-downshift of \(\rho _{1}\) and the corresponding auxiliary function, respectively.
Since \(\rho _{3} \le \rho _{2} \le \rho _{1}\), it only remains to show that \(\rho _{3} \ge \rho _{1}''\). Observe that to obtain this bound it suffices to prove the following two properties (for a function g taking real values, we denote by \({{\,\textrm{supp}\,}}(g)\) the non-zero support of g, that is all points t in its domain for which \(g(t) \ne 0\)):
-
(i)
\(\rho _{3}(t) \ge \phi _{1}'(t)\) for every \(t > 0\), and
-
(ii)
\(\rho _{3}(t) \ge 1\) for every \(t \in {{\,\textrm{supp}\,}}(\rho _{1}'')\).
Indeed, on the one hand, for every \(t > 0\) such that \(\phi _{1}'(t) = \rho _{1}''(t)\) the first property implies \(\rho _{3}(t) \ge \rho _{1}''(t)\). On the other hand, for every \(t > 0\) such that \(\phi _{1}'(t) \ne \rho _{1}''(t)\) we have \(\rho _{1}''(t) = 1\), so \(\rho _{3}(t) \ge \rho _{1}''(t)\) follows from the second property.
First, we prove property (i). To this end, observe that for every \(t > 0\) we have \(\rho _{3}(t) \ge \rho _{2}'(t) \ge \phi _{2}(t)\) and \(\rho _{2}(t) \ge \rho _{1}'(t) \ge \phi _{1}(t)\). Therefore,
$$\rho _{3}(t) \ge \phi _{2}(t) = \frac{\rho _{2}(\alpha _{2} \max \{t, 1\})}{\beta _{2}} \ge \frac{\phi _{1}(\alpha _{2} \max \{t, 1\})}{\beta _{2}} = \frac{\rho _{1}(\alpha _{1} \alpha _{2} \max \{t, 1\})}{\beta _{1} \beta _{2}} = \phi _{1}'(t).$$
Here the first and second inequalities hold by the above observation, the first and the last equalities hold by construction of \(\phi _{2}\) and \(\phi _{1}'\), respectively, and the second equality holds by construction of \(\phi _{1}\) and the fact that \(\alpha _{2} \ge 1\).
Now, let us show property (ii). First, note that for every \(t \in {{\,\textrm{supp}\,}}(\rho _{2}')\) by construction we have \(\rho _{3}(t) \ge \rho _{2}'(t) \ge 1\). Next, note that by construction
$${{\,\textrm{supp}\,}}(\rho _{2}') = \left\{ t> 0 :\alpha _{2} \max \{t, 1\} \in {{\,\textrm{supp}\,}}(\rho _{2})\right\} .$$
Since \({{\,\textrm{supp}\,}}(\rho _{2}) \supseteq {{\,\textrm{supp}\,}}(\rho _{1}')\), this implies
$${{\,\textrm{supp}\,}}(\rho _{2}') \supseteq \left\{ t> 0 :\alpha _{2} \max \{t, 1\} \in {{\,\textrm{supp}\,}}(\rho _{1}')\right\} .$$
Similarly, note that
$${{\,\textrm{supp}\,}}(\rho _{1}') = \left\{ t > 0 :\alpha _{1} \max \{t, 1\} \in {{\,\textrm{supp}\,}}(\rho _{1})\right\} .$$
Thus, the condition \(\alpha _{2} \max \{t, 1\} \in {{\,\textrm{supp}\,}}(\rho _{1}')\) is equivalent to \(\alpha _{1} \alpha _{2} \max \{t, 1\} \in {{\,\textrm{supp}\,}}(\rho _{1})\), where we use that \(\alpha _{2} \max \{t, 1\} \ge 1\) as \(\alpha _{2} \ge 1\). Therefore, we get:
$${{\,\textrm{supp}\,}}(\rho _{2}') \supseteq \left\{ t > 0 :\alpha _{1} \alpha _{2} \max \{t, 1\} \in {{\,\textrm{supp}\,}}(\rho _{1})\right\} = {{\,\textrm{supp}\,}}(\rho _{1}''),$$
where the equality follows by construction of \(\rho _{1}''\). Thus, \(\rho _{3}(t) \ge 1\) holds for every \(t \in {{\,\textrm{supp}\,}}(\rho _{2}') \supseteq {{\,\textrm{supp}\,}}(\rho _{1}'')\), so property (ii) holds. \(\square \)
We conclude this section by providing a proof of Lemma 3.4.
Proof of Lemma 3.4
First observe that the statement trivially holds if we have \(\alpha \ge {{\,\textrm{rank}\,}}(\mathcal {M})\), because \(F(\rho ) \le \alpha w_{\max }\). Hence, in what follows we assume \(\alpha < {{\,\textrm{rank}\,}}(\mathcal {M})\).
Let \(\rho ':\mathbb {R}_{>0} \rightarrow \mathbb {R}_{\ge 0}\) be the \((\alpha ,\beta )\)-downshift of \(\rho \). By definition of the function F and because \(\tilde{\rho } \ge \rho '\), we get
$$F(\tilde{\rho }) \ge F(\rho ') = \int _{0}^{\infty } \eta (\rho '(t)) \, dt = \int _{0}^{{{\,\textrm{rank}\,}}(\mathcal {M})/\alpha } \eta (\rho '(t)) \, dt, \qquad \text {(4)}$$
where the equality at the end follows from the fact that \(\rho '(t)=0\) for \(t > {{{\,\textrm{rank}\,}}(\mathcal {M})}/{\alpha }\). We now distinguish between \(\beta \ge n\) and \(\beta < n\).
Consider first the case \(\beta \ge n\). In this case we can expand as follows:
$$\int _{0}^{{{\,\textrm{rank}\,}}(\mathcal {M})/\alpha } \eta (\rho '(t)) \, dt \ge \frac{{{\,\textrm{rank}\,}}(\mathcal {M})}{\alpha } \cdot \eta (1) \ge \frac{{{\,\textrm{rank}\,}}(\mathcal {M})}{\alpha \beta } \cdot n \eta (1) \ge \frac{{{\,\textrm{rank}\,}}(\mathcal {M})}{2 \alpha \beta } \cdot \eta (n) = \frac{{{\,\textrm{rank}\,}}(\mathcal {M}) \cdot w_{\max }}{2 \alpha \beta } \ge \frac{F(\rho )}{2 \alpha \beta },$$
where the first inequality holds because \(\rho '\) is at least 1 when it is non-zero and \(\rho '\) is non-zero within \((0,{{{\,\textrm{rank}\,}}(\mathcal {M})}/{\alpha }]\) because \(\alpha \le {{\,\textrm{rank}\,}}(\mathcal {M})\), the second inequality uses \(\beta \ge n\), the third one is due to \(2n\eta (1) \ge \eta (n)\)—which is a consequence of Lemma 4.1 (ii)—the equality thereafter uses \(\eta (n)=w_{\max }\), and the final inequality follows from \({{\,\textrm{rank}\,}}(\mathcal {M})\cdot w_{\max } \ge F(\rho )\). The above relation together with Eq. (4) implies the desired result when \(\beta \ge n\).
Now assume \(\beta < n\). In this case we continue as follows:
$$\begin{aligned} \int _{0}^{{{\,\textrm{rank}\,}}(\mathcal {M})/\alpha } \eta (\rho '(t)) \, dt&= \int _{0}^{{{\,\textrm{rank}\,}}(\mathcal {M})/\alpha } \eta \left( \max \left\{ \frac{\rho (\alpha \cdot \max \{t, 1\})}{\beta }, 1 \right\} \right) dt = \frac{1}{\alpha } \int _{0}^{{{\,\textrm{rank}\,}}(\mathcal {M})} \eta \left( \max \left\{ \frac{\rho (\max \{t, \alpha \})}{\beta }, 1 \right\} \right) dt \\&\ge \frac{1}{2 \alpha \beta } \int _{0}^{{{\,\textrm{rank}\,}}(\mathcal {M})} \eta \left( \max \left\{ \rho (\max \{t, \alpha \}), \beta \right\} \right) dt \ge \frac{1}{2 \alpha \beta } \left( F(\rho ) - \alpha \, w_{\max } \right) , \end{aligned}$$
where the first equality is a consequence of \(\rho '\) being the \((\alpha ,\beta )\)-downshift of \(\rho \), the second equality follows from the variable substitution \(t \mapsto {t}/{\alpha }\), the first inequality uses \(2\beta \eta (\max \{{\rho (t)}/{\beta },1\}) \ge \eta (\max \{\rho (t),\beta \})\), which holds due to Lemma 4.1 (ii) (here we use \(\beta \le n\) to fulfill the conditions of this statement), and in the last inequality we use the definition of F, the monotonicity of \(\eta \), and the fact that the function \(\eta \) never takes values larger than \(w_{\max }\) (dropping the contribution of \(t \in (0, \alpha ]\) costs at most \(\alpha w_{\max }\)). The above relation together with Eq. (4) implies the desired result for \(\beta \le n\), which finishes the proof. \(\square \)
5 Learning rank-density curves from a sample
One of the main challenges when designing and analyzing algorithms for MSP is understanding what kind of (and how much) information can be learned about the underlying instance after observing a random sample of it.
In this section, we show that, with constant probability, after observing a sample set S containing each element with probability 0.5, one can learn a good approximation of the rank-density curve of both \(\mathcal {M}\) and \(\left. \mathcal {M}\right| _{N \setminus S}\), thus proving Theorem 3.6. However, even if one knew the exact (instead of an approximate) rank-density curve of \(\left. \mathcal {M}\right| _{N \setminus S}\), given that the matroid is not known upfront (and hence neither which elements are associated to each of the different density areas of the curve), it is a priori not clear how to proceed. A second main contribution of this section is to show that the set of elements in \(N \setminus S\) that are spanned by a subset of S of a given density is well-structured. In particular, this will allow us to build a (chain) decomposition \(\bigoplus _{i = 1}^{k} \mathcal {M}_{i}\) of \(\left. \mathcal {M}\right| _{N {\setminus } S}\) where all the \(\mathcal {M}_{i}\)’s satisfy some desired properties with constant probability — see Sect. 6.1 for details.
The main technical contribution in this section is the following result.
Theorem 5.1
Let \(\mathcal {M}= (N, \mathcal {I})\) be a matroid containing 3h disjoint bases for some \(h \in \mathbb {Z}_{\ge 1}\). Let \(S \sim B(N, {1}/{2})\), i.e., S contains every element of N independently with probability \({1}/{2}\). Then
$$\Pr \left[ \left| {\textrm{span}}(D(S, h)) \setminus S \right| \le \tfrac{|N|}{12} \right] \le \exp \left( -\tfrac{|N|}{144} \right) , \qquad \text {(5)}$$
and
$$\Pr \left[ r(D(S, h)) \le \tfrac{r(N)}{8} \right] \le \exp \left( -\tfrac{r(N)}{48} \right) . \qquad \text {(6)}$$
Proof
We prove the concentration result (5) first. Let \(\mathcal {M}_{h} = (N, \mathcal {I}_{h})\) denote the h-fold union of \(\mathcal {M}\) and let \(r_{h}\) denote its rank function. Consider the procedure described in Algorithm 1, which is loosely inspired by [16].
[Algorithm 1: a single pass over N that splits the elements into sets W, G, and C with the properties listed below; the original display is not reproduced here.]
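The following sketch shows one way to realize such a pass. It is a reconstruction based on the invariants and properties (i)–(ii) below, not necessarily the paper's exact Algorithm 1; `in_S[e]` encodes the coin flip \(e \in S\) and `indep_h(X)` tests \(X \in \mathcal {I}_{h}\).

```python
def split_sample(N, in_S, indep_h):
    """W greedily collects sampled elements while staying independent in the
    h-fold union M_h; every other sampled element goes to C; a non-sampled
    element goes to G exactly when adding it to W would create a dependency
    (so it is spanned). Hence S = W + C, W is in I_h, and G is disjoint
    from S; C and G have identical distributions since, given the history,
    each element on the dependent branch lands in either with probability 1/2."""
    W, G, C = [], [], []
    for e in N:  # any fixed order works
        if indep_h(W + [e]):
            if in_S[e]:
                W.append(e)
            # a non-sampled element that keeps W independent is discarded
        else:
            (C if in_S[e] else G).append(e)
    return W, G, C

# Toy usage with M = U_{2,6} and h = 1, so that I_h = {X : |X| <= 2}:
import random
N = list(range(6))
in_S = {e: random.random() < 0.5 for e in N}
W, G, C = split_sample(N, in_S, lambda X: len(X) <= 2)
```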
Note that the following three properties hold at all times: W, G, and C are pairwise disjoint; \(W \subseteq S\) and \(C \subseteq S\), while \(G \cap S = \emptyset \); and \(W \in \mathcal {I}_{h}\). In addition, by construction, at the end of the procedure we have:
-
(i)
\(S = C \uplus W\). Moreover, the random sets G and C have identical distributions, because each element belongs to S with probability \({1}/{2}\) independently of the other elements.
-
(ii)
\(G \subseteq {\textrm{span}}(D(S, h)) \setminus S\). Because \(G \cap S = \emptyset \), it is enough to show \(G \subseteq {\textrm{span}}(D(S, h))\). Given an arbitrary \(e \in G\), by construction we have \(W \cup \{e\} \notin \mathcal {I}_{h}\), i.e., \(r_{h}(W \cup \{e\}) = r_{h}(W)\). As \(W \subseteq S\), this yields \(r_{h}(S \cup \{e\}) = r_{h}(S)\), which then implies \(e \in {\textrm{span}}(D(S, h))\). The latter implication follows by a standard matroid argument; we provide a proof in Lemma A.1 for completeness.
As \(G\subseteq {\textrm{span}}(D(S,h))\setminus S\), and G and C have the same distribution, we get
$$\Pr \left[ \left| {\textrm{span}}(D(S, h)) \setminus S \right| \le \tfrac{|N|}{12} \right] \le \Pr \left[ |G| \le \tfrac{|N|}{12} \right] = \Pr \left[ |C| \le \tfrac{|N|}{12} \right] . \qquad \text {(7)}$$
Moreover,
$$|C| = |S| - |W| \ge |S| - h \, r(N) \ge |S| - \tfrac{|N|}{3}, \qquad \text {(8)}$$
where the equality follows from \(S = C \uplus W\), the first inequality from \(W \in \mathcal {I}_{h}\) (which implies \(|W| = r_{h}(W) \le h r(N)\)), and the last one from the fact that \(\mathcal {M}\) contains 3h many disjoint bases (and hence \(|N| \ge 3 h r(N)\)).
Combining (7) and (8) we obtain
$$\Pr \left[ |C| \le \tfrac{|N|}{12} \right] \le \Pr \left[ |S| \le \tfrac{|N|}{12} + \tfrac{|N|}{3} \right] = \Pr \left[ |S| \le \left( 1 - \tfrac{1}{6} \right) \mathbb {E}[|S|] \right] , \qquad \text {(9)}$$
where the equality follows from \(\mathbb {E}[|S|]={|N|}/{2}\). Relation (5) now follows by applying a Chernoff bound \(\Pr [X \le (1 - \delta )\mathbb {E}[X]] \le \exp \left( {-\delta ^{2} \mathbb {E}[X]}/{2} \right) \) with \(X = |S|\) and \(\delta = {1}/{6}\) to the right-hand side expression in (9) and using \(\mathbb {E}[|S|] = {|N|}/{2}\), which yields the bound \(\exp (-{|N|}/{144})\).
We next prove the concentration result (6). Let B be a union of 3h disjoint bases contained in \(\mathcal {M}\). We first show that for any set \(A \subseteq B\) the following inequality holds:
$$r(D(A, h)) \ge \frac{|A| - h \, r(N)}{2h}. \qquad \text {(10)}$$
To this end, we start by observing that, because \(\left. \mathcal {M}\right| _{B}\) is a uniformly dense matroid with density 3h, we have
$$|U| \le 3h \cdot r(U) \qquad \forall \, U \subseteq B. \qquad \text {(11)}$$
The above property also immediately follows by observing that, if we write \(B=B_1 \cup \cdots \cup B_{3h}\) as the union of 3h disjoint bases, then we have
$$|U| = \sum _{i = 1}^{3h} |U \cap B_{i}| = \sum _{i = 1}^{3h} r(U \cap B_{i}) \le 3h \cdot r(U) \qquad \forall \, U \subseteq B.$$
Relation (10) now holds due to the following:
$$2h \cdot r(D(A, h)) \ge |D(A, h)| - h \cdot r(D(A, h)) \ge |A| - h \cdot r(A) \ge |A| - h \cdot r(N),$$
where the first inequality is due to (11), the second one follows from the definition of D(A, h), and the last one from the monotonicity of the rank function.
Now, in order to show (6), suppose that the event \(r(D(S, h)) \le {r(N)}/{8}\) occurs. Then
$$\frac{r(N)}{8} \ge r(D(S, h)) \ge r(D(S \cap B, h)) \ge \frac{|S \cap B| - h \, r(N)}{2h},$$
where the second inequality holds since \(S \cap B \subseteq S\), and the third inequality holds by (10). Since rearranging the terms in the above expression gives \(|S \cap B| \le {5\,h r(N)}/{4}\), we have \(\Pr \left[ r(D(S, h)) \le {r(N)}/{8}\right] \le \Pr \left[ |S \cap B| \le {5\,h r(N)}/{4}\right] \). To upper bound the latter probability, observe that \(S \cap B\) contains each element of B with probability \({1}/{2}\) independently by construction of S. Therefore we can apply a Chernoff bound \(\Pr \left[ X \le (1 - \delta ) \mathbb {E}[X]\right] \le \exp \left( -{\delta ^{2}\mathbb {E}[X]}/{2}\right) \) with \(X = |S \cap B|\), \(\mathbb {E}[X] = {|B|}/{2} = {3\,h r(N)}/{2}\), and \(\delta = {1}/{6}\) (chosen so that \((1 - \delta ) \mathbb {E}[X] = {5\,h r(N)}/{4}\)), resulting in
$$\Pr \left[ |S \cap B| \le \tfrac{5 h \, r(N)}{4} \right] \le \exp \left( -\tfrac{h \, r(N)}{48} \right) \le \exp \left( -\tfrac{r(N)}{48} \right) ,$$
where the last inequality holds since \(h \ge 1\). \(\square \)
The proof of Theorem 3.6 is based on the concentration result (6). In summary, rather than directly showing that \(\rho _{\left. \mathcal {M}\right| _{S}}\) approximates \(\rho _{\mathcal {M}}\) well everywhere, we consider a discrete set of points on \(\rho _{\mathcal {M}}\) associated to minors of \(\mathcal {M}\) of geometrically increasing ranks. We then apply (6) to these minors and employ a union bound to show that we get a good approximation for these grid points. The union bound works out because the ranks are geometrically increasing and appear in the exponent of the right-hand side of (6). The complete proof is presented below.
Proof of Theorem 3.6
Let \(\{\lambda _i\}_{i \in [m]}\) denote the densities (i.e., values of at least one) in the image of \(\rho _{\mathcal {M}}\), and let \(\tau {:=}\max \{t>0 :\rho _{\mathcal {M}}(t) \ge 1\}\). Let \(\tilde{\rho }:\mathbb {R}_{>0} \rightarrow \mathbb {R}_{\ge 0}\) be the curve obtained from \(\rho _{\mathcal {M}}\) by rounding down every density \(\lambda _i\) to the closest power of 3. That is,
$$\tilde{\rho }(t) {:=}{\left\{ \begin{array}{ll} 3^{\lfloor \log _{3} \rho _{\mathcal {M}}(t) \rfloor } &{} \text {if } t \in (0, \tau ],\\ 0 &{} \text {otherwise.} \end{array}\right. }$$
Let \(\tilde{\lambda }_1> \cdots >\tilde{\lambda }_{\ell }\) denote the densities in the image of \(\tilde{\rho }\), and note that by construction all the \(\tilde{\lambda }_i\) are powers of 3. In particular, \(\tilde{\rho }\) is a (1, 3)-approximation of \(\rho _{\mathcal {M}}\).
Next, let \(r_{\max } :\mathbb {R}_{\ge 0} \rightarrow \mathbb {R}_{\ge 0}\) be the function given by
$$r_{\max }(\lambda ) {:=}\max \left( \{0\} \cup \left\{ t > 0 :\tilde{\rho }(t) \ge \lambda \right\} \right) ,$$
i.e., \(r_{\max }(\lambda )\) is the largest rank up to which \(\tilde{\rho }\) has density at least \(\lambda \).
We define a subset of densities of \(\{\tilde{\lambda }_i\}_{i \in [\ell ]}\) as follows. First, set \(\mu _1=\tilde{\rho }(36)\) and \(i=1\). Then, while \(\tilde{\rho }(36 r_{\max }(\mu _i)) \ge 1\) (i.e., \(\tilde{\rho }(36 r_{\max }(\mu _i))\) is in the non-zero support of \(\tilde{\rho }\)), set \(\mu _{i+1} = \tilde{\rho }(36 r_{\max }(\mu _i))\) and update \(i=i+1\). Let \(\Lambda \) denote the subset of densities selected by the above procedure and let \(q{:=}|\Lambda |\).
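In code, the selection of \(\Lambda \) is a short loop (illustrative sketch; rho_tilde and r_max are callables realizing \(\tilde{\rho }\) and \(r_{\max }\), with r_max returning the largest t at which the curve is still at least the given density):

```python
def density_grid(rho_tilde, r_max):
    """Geometric grid of densities from the proof of Theorem 3.6:
    mu_1 = rho~(36) and mu_{i+1} = rho~(36 * r_max(mu_i)), stopping once the
    value leaves the non-zero support; each mu strictly decreases, so the
    loop terminates."""
    grid, mu = [], rho_tilde(36)
    while mu >= 1:
        grid.append(mu)
        mu = rho_tilde(36 * r_max(mu))
    return grid
```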
Now, for each \(i \in [q]\), define \(r_i {:=}r(D(N,\mu _i))\). Note that \(r_i = r_{\max }(\mu _i)\). We then call a subset \(S \subseteq N\) good if it satisfies
$$r\left( D\left( S, \tfrac{\mu _i}{3}\right) \right) > \tfrac{r_i}{8} \qquad \forall \, i \in [q]. \qquad \text {(12)}$$
The motivation for the above definition is that any good set S satisfies that \(\rho _{\left. \mathcal {M}\right| _{S}}\) is a (288, 9)-approximation of \(\rho _{\mathcal {M}}\). To see this, first note that if S is good then \(\rho _{\left. \mathcal {M}\right| _{S}} (t) \ge {\mu _i}/{3}\) for all \(t\in (0,{r_i}/{8}]\). Next, let \(\mu _{i-1},\mu _i \in \Lambda \) and \(t\in ({r_{i-1}}/{8},{r_{i}}/{8}]\). Then \(288t > 36r_{i-1}\), and thus \(\tilde{\rho }(288t) \le \tilde{\rho }(36r_{i-1})=\mu _i\), where the last equality follows by construction of the \(\mu _i\)’s. Using that \(\tilde{\rho }\) is a (1, 3)-approximation of \(\rho _{\mathcal {M}}\), it follows that \(\rho _{\mathcal {M}}(288t)\le 3 \tilde{\rho }(288t) \le 3 \mu _i\). Combining all the above, it follows that
$$\rho _{\left. \mathcal {M}\right| _{S}}(t) \ge \frac{\mu _i}{3} \ge \frac{\rho _{\mathcal {M}}(288t)}{9} \qquad \forall \, t \in \left( \tfrac{r_{i-1}}{8}, \tfrac{r_{i}}{8} \right] .$$
Moreover, notice that for \(t\in (1,{r_1}/{8}]\) we have \(\tilde{\rho }(288t) \le \tilde{\rho }(36) = \mu _1\). Hence, using the same reasoning as above, it follows that \(\rho _{\left. \mathcal {M}\right| _{S}} (t) \ge {\mu _1}/{3} \ge {\rho _{\mathcal {M}}(288t)}/{9}\). Finally, consider the case where \(t > {r_q}/{8}\). By construction of the \(\mu _i\) we have \(\tilde{\rho }(288t) \le \tilde{\rho }(36r_q) = 0\). Hence \(\rho _{\mathcal {M}}(288t)=\tilde{\rho }(288t)=0\), since \(\tilde{\rho }\) is a (1, 3)-approximation of \(\rho _{\mathcal {M}}\) and thus \(\tilde{\rho }(a)=0 \iff \rho _{\mathcal {M}}(a)=0\) for any \(a>0\). It follows that \(\rho _{\left. \mathcal {M}\right| _{S}} (t) \ge 0 = \rho _{\mathcal {M}}(288t)\). This concludes the proof of the claim that if S is good, then \(\rho _{\left. \mathcal {M}\right| _{S}}\) is a (288, 9)-approximation of \(\rho _{\mathcal {M}}\).
Hence, in order to prove the theorem it remains to show that, with probability at least \({1}/{100}\), both S and \(N \setminus S\) are good (i.e., satisfy (12)). We discuss this next. First, note that by Eq. (6) from Theorem 5.1, applied to the matroid \(\left. \mathcal {M}\right| _{D(N, \mu _i)}\), for each \(\mu _i \in \Lambda \) with \(\mu _i \ge 3\) it holds
$$\Pr \left[ r\left( D\left( S, \tfrac{\mu _i}{3} \right) \right) \le \tfrac{r_i}{8} \right] \le \exp \left( -\tfrac{r_i}{48} \right) .$$
In the case where \(\mu _q=1\), while we cannot directly black-box the concentration bound from Theorem 5.1 since the assumptions are not met, we can still get the same bound as follows. Note that if \(\mu _q=1\), then \(r(D(S, {\mu _q}/{3}))\) is just r(S), and \(r_q = r(N)\). Let B be any basis of \(\mathcal {M}\), and observe that \(r(S) \ge |S \cap B|\). Since S contains every element of N independently with probability \({1}/{2}\), we can use a Chernoff bound \(\Pr \left[ X \le (1 - \delta ) \mathbb {E}[X]\right] \le \exp \left( -{\delta ^{2}\mathbb {E}[X]}/{2}\right) \) with \(X = |S \cap B|\), \(\mathbb {E}[X] = {|B|}/{2} = {r(N)}/{2}\), and \(\delta = {3}/{4}\) (chosen so that \((1 - \delta ) \mathbb {E}[X] = {r(N)}/{8}\)), resulting in
$$\Pr \left[ r(S) \le \tfrac{r(N)}{8} \right] \le \exp \left( -\tfrac{9 \, r(N)}{64} \right) \le \exp \left( -\tfrac{r_q}{48} \right) .$$
Finally, note that by construction we have \(r_1\ge 36\) and \(r_{i+1} \ge 36 r_i\) for all \(i,i+1 \in [q]\). Thus, \(r_i\ge 36^i\) for each \(i \in [q]\), and hence by the union bound it follows
$$\Pr \left[ S \text { is not good} \right] \le \sum _{i = 1}^{q} \exp \left( -\tfrac{r_i}{48} \right) \le \sum _{i = 1}^{\infty } \exp \left( -\tfrac{36^i}{48} \right) < 0.48.$$
Hence, since \(N \setminus S\) has the same distribution as S,
$$\Pr \left[ S \text { and } N \setminus S \text { are both good} \right] \ge 1 - 2 \cdot 0.48 = 0.04 \ge \tfrac{1}{100},$$
which concludes the proof. \(\square \)
6 The main algorithm and its analysis
In this section we describe and analyse the procedure from Theorem 3.5. The analysis consists of two main ingredients. The first one is to show that if the approximate curve \(\tilde{\rho }\) is well-structured (in a well-defined sense), then there is an algorithm retrieving a constant factor of \(F(\tilde{\rho })\) in expectation — see Theorem 6.1. The second one is then to show that, given any initial approximate curve \(\tilde{\rho }\), one can find well-structured curves whose F function value is close to \(F(\tilde{\rho })\) — see Theorem 6.2.
The next result, proved in Sect. 6.1, formalizes the first step above.
Theorem 6.1
Let \(\mathcal {M}=(N,\mathcal {I})\) be a matroid with w-sampled weights, and let r and \(\rho _{\mathcal {M}}\) denote the rank function and rank-density curve of \(\mathcal {M}\), respectively. Let \(\overline{\rho }\le \rho _{\mathcal {M}}\) be a rank-density curve with densities \(\{\overline{\lambda }_{i}\}_{i \in [m]}\) such that the \(\overline{\lambda }_{i}\) are powers of some integer \(\beta \ge 3\) and \(\overline{\lambda }_{1}> \cdots > \overline{\lambda }_{m} \ge 1\), and such that \(r(D(N, \overline{\lambda }_{i + 1})) \ge 24 r(D(N, {\overline{\lambda }_{i}}/{\beta }))\) for \(i \in [m - 1]\). Then there is an efficient procedure \({\textrm{ALG}}(\overline{\rho }, \beta )\) that, when run on the RA-MSP subinstance given by \(\mathcal {M}\), returns an independent set I of \(\mathcal {M}\) of expected weight at least \(({1}/{180e}) F(\overline{\rho })\).
We note that the above result assumes the \(\overline{\lambda }_{i}\) to be powers of \(\beta \) mainly for convenience (so that \({\overline{\lambda }_{i}}/{\beta }\) is an integer), but it is not strictly needed.
The second main ingredient in the proof of Theorem 3.5 is the following result.
Theorem 6.2
Let \(\mathcal {M}=(N,\mathcal {I})\) be a matroid with w-sampled weights, and let r and \(\rho _{\mathcal {M}}\) denote the rank function and rank-density curve of \(\mathcal {M}\), respectively. Given an \((\alpha , \beta )\)-approximate curve \(\tilde{\rho }\) of \(\rho _{\mathcal {M}}\) with \(\alpha \in \mathbb {R}_{\ge 24}\) and \(\beta \in \mathbb {Z}_{\ge 3}\), there is a procedure \({\textrm{ALG}}(\tilde{\rho }, \alpha , \beta )\) returning rank-density curves \(\overline{\rho }, \overline{\rho }_{1}, \overline{\rho }_{2}, \overline{\rho }_{3}, \overline{\rho }_{4}\) such that:
-
(i)
\(\overline{\rho }\) is an \((\alpha ^2, \beta ^2)\)-approximation of \(\rho _{\mathcal {M}}\).
-
(ii)
\(\sum _{i \in [4]} F(\overline{\rho }_{i}) \ge F(\overline{\rho })\).
-
(iii)
For each \(i\in [4]\), \(\overline{\rho }_{i}\) satisfies the following properties: Let \(\{\mu _j\}_{j \in [\ell ]}\) be the densities of \(\overline{\rho }_{i}\), then all the \(\mu _j\) are powers of \(\beta \ge 3\), and \(r(D(N, \mu _{j + 1})) \ge \alpha r(D(N, {\mu _{j}}/{\beta })) \ge 24 r(D(N, {\mu _{j}}/{\beta }))\) for \(j \in [\ell - 1]\). Moreover, \(\overline{\rho }_{i} \le \rho _{\mathcal {M}}\).
Proof
We first discuss how to build the curve \(\overline{\rho }\). The goal is, on the one hand, to guarantee that the image \(\{\overline{\lambda }_i\}_{i \in [\overline{m}]}\) of the curve \(\overline{\rho }\) satisfies that each \(\overline{\lambda }_i\) is a power of \(\beta \), and moreover, that the ranks corresponding to any two consecutive densities \(\overline{\lambda }_i\) and \(\overline{\lambda }_{i+1}\) are at least an \(\alpha \) factor apart (we formalize this below). On the other hand, we want \(\overline{\rho }\) to be as close as possible to \(\tilde{\rho }\), while satisfying \(\overline{\rho }\le \tilde{\rho }\). We next discuss how to achieve these.
Let \(\{\tilde{\lambda }_i\}_{i \in [\tilde{m}]}\) denote the densities (i.e., values of at least one) in the image of \(\tilde{\rho }\), and let \(\tau {:=}\max \{t>0 :\tilde{\rho }(t) \ge 1\}\). Let \(\rho ':\mathbb {R}_{>0} \rightarrow \mathbb {R}_{\ge 0}\) be the curve obtained from \(\tilde{\rho }\) by rounding down every density \(\tilde{\lambda }_i\) to the closest power of \(\beta \). That is,
$$\rho '(t) {:=}{\left\{ \begin{array}{ll} \beta ^{\lfloor \log _{\beta } \tilde{\rho }(t) \rfloor } &{} \text {if } t \in (0, \tau ],\\ 0 &{} \text {otherwise.} \end{array}\right. }$$
Let \(\{\lambda '_i\}_{i \in [m']}\) denote the densities in the image of \(\rho '\), and note that by construction all the \(\lambda '_i\) are powers of \(\beta \). It thus remains to guarantee that the geometric rank increase property for any two consecutive densities is satisfied. In order to achieve this, we use the following definition. Given a rank-density curve \(\rho \) with image \(\lambda _1> \cdots > \lambda _m\), we define the function \(r^{\rho }_{\max }:\mathbb {R}_{\ge 0} \rightarrow \mathbb {R}_{\ge 0}\) by
$$r^{\rho }_{\max }(\lambda ) {:=}\max \left( \{0\} \cup \left\{ t > 0 :\rho (t) \ge \lambda \right\} \right) .$$
For brevity, let \(r'_{\max }{:=}r^{\rho '}_{\max }\) denote the \(r_{\max }\) function corresponding to the curve \(\rho '\).
We then build \(\overline{\rho }\) as follows. First, set \(\overline{\lambda }_1 = \lambda '_1\) and \(i=1\). Then, while \(\rho '(\alpha \cdot r'_{\max } (\overline{\lambda }_i)) \ge 1\), set \(\overline{\lambda }_{i+1} = \rho '(\alpha \cdot r'_{\max } (\overline{\lambda }_i))\) and update \(i=i+1\). Let \(\overline{\Lambda }{:=}\{\overline{\lambda }_i\}_{i \in [\overline{m}]}\) denote the densities selected by the above procedure. We define \(\overline{\rho }\) to be the curve obtained from \(\rho '\) by further rounding down densities \(\lambda '_i\) in the image of \(\rho '\) to the closest \(\overline{\lambda }\in \overline{\Lambda }\). More precisely,
\(\overline{\rho }(t) {:=}\max \{\overline{\lambda }\in \overline{\Lambda } :\overline{\lambda }\le \rho '(t)\}\) for \(0 < t \le \overline{\tau }\), and \(\overline{\rho }(t) {:=}0\) for \(t > \overline{\tau }\),
where \(\overline{\tau } {:=}\overline{r}_{\max }(\overline{\lambda }_{\overline{m}}) \le \tau \).
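To make the two rounding steps concrete, here is a minimal Python sketch of the construction. The step-function representation (a list of (density, \(r_{\max }\)) pairs with strictly decreasing densities and increasing \(r_{\max }\) values) and all names are our own illustration under these assumptions, not the paper's implementation.

```python
def round_down_to_power(lam, beta):
    """Largest power of beta that is at most lam (lam >= 1, beta >= 2)."""
    p = 1
    while p * beta <= lam:
        p *= beta
    return p

def build_rho_bar(steps, alpha, beta):
    """steps represents a curve rho~ as a nonempty list of (density, r_max)
    pairs with strictly decreasing densities >= 1 and increasing r_max values,
    i.e. rho~(t) = density_i for r_max_{i-1} < t <= r_max_i.
    Returns the same representation for the rounded curve rho-bar."""
    # Step 1 (curve rho'): round each density down to a power of beta,
    # merging steps whose rounded densities collide.
    rho_p = []
    for lam, r in steps:
        lam_p = round_down_to_power(lam, beta)
        if rho_p and rho_p[-1][0] == lam_p:
            rho_p[-1] = (lam_p, r)
        else:
            rho_p.append((lam_p, r))

    def eval_rho_p(t):  # value of rho' at t (0 beyond the last step)
        for lam, r in rho_p:
            if t <= r:
                return lam
        return 0

    r_max = dict(rho_p)  # r'_max restricted to the densities of rho'

    # Step 2: greedily select densities whose r'_max values grow by >= alpha.
    selected = [rho_p[0][0]]
    while eval_rho_p(alpha * r_max[selected[-1]]) >= 1:
        selected.append(eval_rho_p(alpha * r_max[selected[-1]]))

    # Curve rho-bar: round each density of rho' down to the closest selected
    # density; steps below the smallest selected density are cut off (tau-bar).
    rho_bar = []
    for lam, r in rho_p:
        cands = [mu for mu in selected if mu <= lam]
        if not cands:
            break
        mu = max(cands)
        if rho_bar and rho_bar[-1][0] == mu:
            rho_bar[-1] = (mu, r)
        else:
            rho_bar.append((mu, r))
    return rho_bar
```

For instance, `build_rho_bar([(100, 2), (40, 5), (7, 30), (2, 100)], alpha=24, beta=3)` first rounds the densities to 81, 27, 3, 1 and then absorbs the intermediate ones into the greedily selected densities.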
By construction, \(\overline{\rho }\) is an \((\alpha ,\beta )\)-approximation of \(\tilde{\rho }\): indeed, \(\rho '\) is a \((1,\beta )\)-approximation of \(\tilde{\rho }\), \(\overline{\rho }\) is an \((\alpha ,1)\)-approximation of \(\rho '\), and these compose by Lemma 4.3. Combining this with the fact that \(\tilde{\rho }\) is an \((\alpha ,\beta )\)-approximation of \(\rho _{\mathcal {M}}\) proves property (i).
Next, let \(\tilde{r}_{\max }{:=}r^{\tilde{\rho }}_{\max }\) and \(\overline{r}_{\max }{:=}r_{\max }^{\overline{\rho }}\) denote the \(r_{\max }\) functions corresponding to the curves \(\tilde{\rho }\) and \(\overline{\rho }\), respectively. We show the following.
Claim 6.3
If \(|\overline{\Lambda }|\ge 5\), then for any five consecutive densities \(\overline{\lambda }_{i}, \ldots , \overline{\lambda }_{i+4} \in \overline{\Lambda }\), we have
\(r\left( D\left( N, {\overline{\lambda }_{i}}/{\beta }\right) \right) \le \alpha \cdot r'_{\max }(\overline{\lambda }_{i+2}).\)
Let \(\kappa {:=}r(D(N,{\overline{\lambda }_i}/{\beta }))\). Note that if \(\kappa \le \alpha \), then Claim 6.3 holds because \(r'_{\max }(\overline{\lambda }_{i+2})\ge 1\) by construction. Now assume \(\kappa > \alpha \). We claim that in this case we have
\(\tilde{\rho }\left( {\kappa }/{\alpha }\right) \ge {\overline{\lambda }_i}/{\beta ^2},\)  (14)
which holds due to the following. Because \(\tilde{\rho }\) is an \((\alpha ,\beta )\)-approximation of \(\rho \), we have
\(\tilde{\rho }(t) \ge {\rho (\alpha t)}/{\beta } \quad \text {for all } t > 0.\)
Applying this inequality with \(t={\kappa }/{\alpha }\), we get
\(\tilde{\rho }\left( {\kappa }/{\alpha }\right) \ge {\rho (\kappa )}/{\beta } \ge {\overline{\lambda }_i}/{\beta ^2},\)  (15)
where the second inequality follows from \(\kappa {:=}r(D(N, {\overline{\lambda }_i}/{\beta }))\), which implies \(\rho (\kappa ) \ge {\overline{\lambda }_i}/{\beta }\). Finally, Eq. (14) is an immediate implication of Eq. (15). The claim now follows due to
\({\kappa }/{\alpha } \le \tilde{r}_{\max }\left( {\overline{\lambda }_i}/{\beta ^2}\right) = r'_{\max }\left( {\overline{\lambda }_i}/{\beta ^2}\right) \le r'_{\max }(\overline{\lambda }_{i+2}),\)
where the first inequality is due to (14), the first equality follows by construction of \(\rho '\) and the fact that \({\overline{\lambda }_i}/{\beta ^2}\) is a power of \(\beta \) (because \(\overline{\lambda }_i\) is a power of \(\beta \) and \(\overline{\lambda }_i \ge \beta ^4 \overline{\lambda }_{i+4} \ge \beta ^4\)), and the second inequality again follows since by construction \({\overline{\lambda }_i}/{\beta ^2} \ge \overline{\lambda }_{i+2}\).
Using Claim 6.3, we can further upper bound \(r(D(N,{\overline{\lambda }_i}/{\beta }))\) as follows:
\(r\left( D\left( N, {\overline{\lambda }_i}/{\beta }\right) \right) \le \alpha \cdot r'_{\max }(\overline{\lambda }_{i+2}) = \alpha \cdot \overline{r}_{\max }(\overline{\lambda }_{i+2}) \le \frac{1}{\alpha } \cdot \overline{r}_{\max }(\overline{\lambda }_{i+4}) \le \frac{1}{\alpha } \cdot r\left( D\left( N, \overline{\lambda }_{i+4}\right) \right) ,\)  (16)
where the first inequality is due to Claim 6.3, the equality holds because \(r'_{\max } (\overline{\lambda }) = \overline{r}_{\max } (\overline{\lambda })\) for every \(\overline{\lambda }\in \overline{\Lambda }\) by construction, the second inequality holds because by construction \(\overline{r}_{\max } (\overline{\lambda }_{j+1}) \ge \alpha \cdot \overline{r}_{\max } (\overline{\lambda }_{j})\) for any two consecutive densities \(\overline{\lambda }_{j}, \overline{\lambda }_{j+1} \in \overline{\Lambda }\), and the last inequality follows from \(\overline{\rho }\le \rho \).
We now build the curves \(\{\overline{\rho }_i\}_{i \in [4]}\) as follows. For \(i\in [4]\), let
\(\overline{\Lambda }_i {:=}\{\overline{\lambda }_j \in \overline{\Lambda } :j \equiv i \ (\mathrm {mod}\ 4)\}.\)
If \(\overline{\Lambda }_i = \emptyset \), let \(\overline{\rho }_i\) be (by convention) the curve given by \(\overline{\rho }_i (t) = 1\) for \(0 < t \le \overline{\tau }\), and \(\overline{\rho }_i (t)=0\) for \(t > \overline{\tau }\). Otherwise, we define \(\overline{\rho }_i\) to be the curve obtained from \(\overline{\rho }\) by further rounding down densities \(\overline{\lambda }\in \overline{\Lambda }\) to the closest density in \(\overline{\Lambda }_i\). More precisely,
\(\overline{\rho }_i(t) {:=}\max \{\mu \in \overline{\Lambda }_i :\mu \le \overline{\rho }(t)\}\) for \(0 < t \le \overline{\tau }_i\), and \(\overline{\rho }_i(t) {:=}0\) for \(t > \overline{\tau }_i\),
where \(\overline{\tau }_i {:=}\overline{r}_{\max } (\min \{\mu : \mu \in \overline{\Lambda }_i\})\).
To see property (ii), note that by construction we have \(\cup _{i \in [4]} \overline{\Lambda }_{i} = \overline{\Lambda }\). By setting \(\overline{r}_{\max } (\overline{\lambda }_{i}) = 0\) for all \(i \le 0\) by convention, we get
\(\sum _{i \in [4]} F(\overline{\rho }_i) \ge \sum _{j \in [\overline{m}]} \left( \overline{r}_{\max }(\overline{\lambda }_j) - \overline{r}_{\max }(\overline{\lambda }_{j-4})\right) \eta (\overline{\lambda }_j) \ge \sum _{j \in [\overline{m}]} \left( \overline{r}_{\max }(\overline{\lambda }_j) - \overline{r}_{\max }(\overline{\lambda }_{j-1})\right) \eta (\overline{\lambda }_j) = F(\overline{\rho }).\)
Finally, to see property (iii), first note that by construction we have \(\overline{\rho }_i \le \overline{\rho }\le \rho ' \le \rho _{\mathcal {M}}\) for each \(i \in [4]\), and moreover, all the curves \(\overline{\rho }_i\) consist of densities that are powers of \(\beta \). In addition, the geometric rank increase property holds trivially if \(|\overline{\Lambda }| \le 4\). Otherwise, it follows directly from Eq. (16) and the fact that \(\alpha \ge 24\). \(\square \)
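Under the mod-4 reading of the classes \(\overline{\Lambda }_i\) used above (our reconstruction of their definition), the split itself is a one-liner; the following sketch, with a hypothetical list representation of the densities in decreasing order, only makes the bookkeeping explicit.

```python
def split_into_classes(selected):
    """Distribute densities lam_1 > ... > lam_m into four classes by index
    mod 4; consecutive densities within one class are four positions apart,
    which is exactly the spacing that Eq. (16) requires."""
    return [selected[i::4] for i in range(4)]

# Example: eight selected densities yield the index classes
# {1,5}, {2,6}, {3,7}, {4,8}.
assert split_into_classes([3**k for k in range(8, 0, -1)])[0] == [3**8, 3**4]
```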
We now show how Theorems 6.1 and 6.2 combined imply Theorem 3.5.
Proof of Theorem 3.5
Given an \((\alpha , \beta )\)-approximation \(\tilde{\rho }\) of \(\rho _{\mathcal {M}}\), first run the procedure from Theorem 6.2 to get curves \(\overline{\rho },\overline{\rho }_1,\overline{\rho }_2,\overline{\rho }_3,\overline{\rho }_4\). Then choose an index \(i\in [4]\) uniformly at random and run the procedure from Theorem 6.1 on \(\overline{\rho }_i\) to get an independent set with expected weight at least
where the last inequality uses Lemma 3.4 and the fact that \(\overline{\rho }\) is an \((\alpha ^{2}, \beta ^{2})\)-approximation of \(\rho _{\mathcal {M}}\). \(\square \)
Thus, to show Theorem 3.5, it remains to prove Theorem 6.1.
6.1 Proof of Theorem 6.1
Throughout this section we use the notation and assumptions from Theorem 6.1.
We prove the theorem in two steps. First, we show that after observing a sample set S, we can build a chain \(\bigoplus _{i = 1}^{m} \mathcal {M}_{i}\) of \(\left. \mathcal {M}\right| _{N {\setminus } S}\) satisfying certain properties with at least constant probability. Then we show that, given such a chain, there is a procedure returning an independent set I of \(\mathcal {M}\) with \(\mathbb {E}[w(I)] = \Omega (F(\overline{\rho }))\), leading to the desired result. We start by discussing the former claim.
Given a sample set \(S \subseteq N\), we build a chain of matroids as follows. For \(i \in [m]\) let
where \(D(S, {\overline{\lambda }_{0}}/{\beta }) = \emptyset \) by convention.
In addition, for every \(i \in [m]\) let \(\overline{N}_{i} {:=}D(N, \overline{\lambda }_{i})\), and define \(\overline{\Lambda }{:=}\{i \in [m] :r(\overline{N}_{i}) \ge 24, \ \overline{\lambda }_{i} \ge \beta \}\). Note that \(\overline{\Lambda }\) and the \(\overline{N}_{i}\)’s do not depend on the sample set S. Moreover, from the assumptions of Theorem 6.1 it follows that \(\overline{\Lambda }\supseteq [m] {\setminus } \{1,m\}\). The next result shows that with constant probability, the sample set S is such that for each \(i \in \overline{\Lambda }\), the set \(N_i\) contains a subset \(U_i\) of large rank and density; more precisely, \(r(U_i) \ge \Omega (r(\overline{N}_i))\) and \({|U_i|}/{r(U_i)} \ge \Omega (\overline{\lambda }_i)\).
Lemma 6.4
Let \(S \sim B(N, {1}/{2})\), and let \(N_i\), \(\overline{N}_i\), and \(\overline{\Lambda }\) be as defined above. Then, with probability at least \({1}/{3}\), every \(N_{i}\) with \(i \in \overline{\Lambda }\) contains \(\overline{\lambda }_{i}\) disjoint independent sets \(I_{1}, \ldots , I_{\overline{\lambda }_{i}}\) such that \(\sum _{j=1}^{\overline{\lambda }_{i}} |I_{j}| \ge ({1}/{24}) \overline{\lambda }_{i} r(\overline{N}_{i})\).
Proof
Let \(B_{i} \subseteq \overline{N}_{i}\) denote the union of (any) \(\overline{\lambda }_{i}\) disjoint bases of \(\left. \mathcal {M}\right| _{\overline{N}_{i}}\). We then say that a sample set S is good if it satisfies
The motivation for the above definition is that any good sample set S leads to a matroid chain (as defined in Eq. (18)) that satisfies the properties claimed in the lemma. To see this, note that for every \(i \in \overline{\Lambda }\cap [m - 1]\) we have
\(\left| {\textrm{span}}\left( D\left( S, {\overline{\lambda }_{i}}/{\beta }\right) \right) \cap B_{i+1}\right| \le \overline{\lambda }_{i+1} \cdot r\left( D\left( N, {\overline{\lambda }_{i}}/{\beta }\right) \right) \le \left( {\overline{\lambda }_{i+1}}/{24}\right) r(\overline{N}_{i+1}),\)
where the last inequality holds since \(r(D(N, \overline{\lambda }_{i + 1})) \ge 24 r(D(N, {\overline{\lambda }_{i}}/{\beta }))\) by the assumptions of Theorem 6.1. Moreover, since \(D(S, {\overline{\lambda }_{0}}/{\beta }) = \emptyset \) by convention, the same bound holds for \(i = 0\), i.e., \(|{\textrm{span}}\left( D\left( S, {\overline{\lambda }_{0}}/{\beta }\right) \right) \cap B_{1}| \le ({\overline{\lambda }_{1}}/{24}) r(\overline{N}_{1})\). Thus, for any \(i \in \overline{\Lambda }\) we get
Finally, since \(B_{i}\) is a union of \(\overline{\lambda }_{i}\) disjoint bases, \(N_{i} \cap B_{i}\) is a union of \(\overline{\lambda }_{i}\) disjoint independent sets (contained in \(N_{i}\)), and hence the claim follows.
Thus, in order to prove the lemma, it is only left to show that \(\Pr \left[ S {\text { is good}}\right] \ge {1}/{3}\). To see this, first observe that for all \(i\in \overline{\Lambda }\) we have \(\overline{\lambda }_{i} = \beta ^j\) for some \(j \ge 1\) with \(\beta \ge 3\), and hence
Then, by applying (5) from Theorem 5.1 to every \(\left. \mathcal {M}\right| _{B_{i}}\) with \(i \in \overline{\Lambda }\), we can bound the probability of S not being good because of index \(i\in \overline{\Lambda }\) by
where the first inequality follows from Eq. (20), the second from Theorem 5.1 and the fact that for all \(i \in \overline{\Lambda }\) the matroid \(\left. \mathcal {M}\right| _{B_{i}}\) contains 3h disjoint bases with \(h=\lfloor {\overline{\lambda }_{i}}/{3} \rfloor \ge 1\), and the last inequality holds since \(|B_{i}| = \overline{\lambda }_{i} r(\overline{N}_{i})\) and \(\overline{\lambda }_{i} \ge \beta \ge 3\) for all \(i \in \overline{\Lambda }\).
We can now upper bound the probability of S not being good by a union bound over \(i\in \overline{\Lambda }\), using that \(r(\overline{N}_{i}) \ge 24\) for all \(i\in \overline{\Lambda }\):
Hence, \(\Pr \left[ S {\text { is good}}\right] \ge 1 - {2}/{3} = {1}/{3}\), as desired. \(\square \)
The second main ingredient in the proof is to show that the above result can be exploited algorithmically. More precisely, we prove the following.
Lemma 6.5
Let \(\mathcal {M}=(N,\mathcal {I})\) be a matroid with w-sampled weights that contains h disjoint independent sets \(I_{1}, \dots , I_{h}\) such that \(s {:=}({1}/{h}) \sum _{j = 1}^{h} |I_{j}| \ge 1\). Then there is a procedure that, when run on the RA-MSP subinstance given by \(\mathcal {M}\), and with only h given upfront, returns an independent set of \(\mathcal {M}\) with expected weight at least \(({s}/{2e}) \eta (h)\). This is still the case even if the elements of \(\mathcal {M}\) are revealed in adversarial (rather than uniformly random) order.
Proof
Suppose we run the online selection procedure (OSP) described in Algorithm 2 on the RA-MSP subinstance given by \(\mathcal {M}\), and with parameter h (as defined in the statement of Lemma 6.5) as input.

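The following Python sketch illustrates the behavior of the online selection procedure, reconstructed from the analysis below: among the arriving elements, those that can still extend the current solution are grouped into consecutive blocks of h, and one online classical-secretary rule runs per block. The names `arrivals` and `is_independent` are our own assumptions; this is an illustration under those assumptions, not the paper's exact Algorithm 2.

```python
import math

def osp(arrivals, is_independent, h):
    """Sketch of OSP(M, h): consider only arriving elements that could still
    extend the current selection I (the set J in the proof), group them into
    blocks of h, and run one classical-secretary rule per block, selecting at
    most one element per block."""
    I = []                    # selected independent set
    seen = 0                  # J-elements seen in the current block
    best_sample = -math.inf   # best weight during the block's sample phase
    picked = False            # whether this block already selected an element
    cutoff = int(h / math.e)  # observe roughly h/e elements per block
    for e, w in arrivals:     # arrivals yields (element, weight) pairs online
        if not is_independent(I + [e]):
            continue          # e is outside J = {e : I + {e} independent}
        seen += 1
        if seen <= cutoff:
            best_sample = max(best_sample, w)  # sample phase: observe only
        elif not picked and w > best_sample:
            I.append(e)       # secretary rule: first element beating the sample
            picked = True
        if seen == h:         # block complete: start the next iteration
            seen, best_sample, picked = 0, -math.inf, False
    return I
```

Note that with \(h = 1\) the sample phase is empty and the procedure greedily selects every element that keeps I independent, which matches the use of \({\textrm{OSP}}(\mathcal {M}, 1)\) in branch (iii) of the proof of Theorem 6.1 below.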
Suppose that OSP successfully completed its i-th iteration (where \(i \in \{0, \dots , r - 1\}\)) and is about to begin the \((i + 1)\)-st iteration. Let Z denote the set of elements seen so far and let I be the set of elements picked so far. Additionally, let \(T {:=}\bigcup _{j=1}^{h} I_{j}\) and \(J {:=}\{e \in N :I \cup \{e\} \in \mathcal {I}\}\). First, note that \(I \in \mathcal {I}\) by construction. Now, observe that
\(|I_{j} \cap J| \ge |I_{j}| - |I| \ge |I_{j}| - i \quad \text {for all } j \in [h],\)
since \(I_{j}, I \in \mathcal {I}\) and \(|I| \le i\). Therefore \(|T \cap J| \ge (s - i)h\). Moreover, since the classical secretary algorithm was executed on i groups of h elements, we have \(|(T \cap Z) \cap J| \le ih\) and hence
\(|(N {\setminus } Z) \cap T \cap J| \ge |T \cap J| - |(T \cap Z) \cap J| \ge (s - i)h - ih = (s - 2i)h.\)
Thus, if \(s \ge 2i + 1\), the number \(|(N\setminus Z) \cap J|\) of elements that Algorithm 2 can feed to the classical MSP algorithm in iteration \(i+1\) is at least h, which implies that it successfully completes its \((i + 1)\)-st iteration; hence the total number of successful iterations is at least \(\lfloor {(s - 1)}/{2} \rfloor + 1 \ge \max \{1, \, s / 2\}\).
Finally, note that the expected weight of the element picked in each successful iteration is at least \(\eta (h) / e\). This follows since (by definition of \(\eta \)) the expected weight of the heaviest element in each successful iteration is \(\eta (h)\). Moreover, even though the arrival order is adversarial, due to the random assignment of the weights, the classical secretary algorithm is applied to h weights drawn uniformly at random from w. Since the classical secretary algorithm returns in expectation at least a \({1}/{e}\)-fraction of the heaviest weight, the claimed factor of \(\eta (h) / e\) follows. Combining this with the lower bound on the number of successful iterations gives the desired result. \(\square \)
We can now combine Lemmas 6.4 and 6.5 to prove Theorem 6.1 as follows.
Proof of Theorem 6.1
Let \({\textrm{OSP}}(\mathcal {M},h)\) denote the online selection procedure from Lemma 6.5 (i.e., Algorithm 2). Additionally, for \(i\in [m]\), let \(r_{i}\) denote the coefficient of \(\eta (\overline{\lambda }_{i})\) in \(F(\overline{\rho })\). Hence, \(F(\overline{\rho }) = \sum _{i = 1}^{m} r_{i} \eta (\overline{\lambda }_{i})\). Consider the following algorithm: choose and execute one of the three branches presented below with probability \({12}/{15}\), \({2}/{15}\), and \({1}/{15}\), respectively.
(i) Observe \(S \sim B(N, {1}/{2})\), construct the chain \(\bigoplus _{i = 1}^{m} \mathcal {M}_{i}\) as defined in (18), and run \({\textrm{OSP}}(\mathcal {M}_{i}, \overline{\lambda }_{i})\) for every \(i \in [m]\) (independently in parallel), returning all the picked elements.
(ii) Run the classical secretary algorithm on \(\mathcal {M}\) and return the picked element (if any).
(iii) Run \({\textrm{OSP}}(\mathcal {M}, 1)\) without observing anything and return all picked elements.
Suppose we execute branch (i). By Lemma 6.4, with probability at least \({1}/{3}\), every \(\mathcal {M}_{i}\) with \(i \in \overline{\Lambda }\) satisfies the conditions of Lemma 6.5 with parameters \(h = \overline{\lambda }_{i}\) and \(s = ({1}/{24}) r(\overline{N}_{i})\). Note that \(s \ge 1\) holds given that \(r(\overline{N}_{i}) \ge 24\) for all \(i \in \overline{\Lambda }\). Since additionally the matroids in the chain form a direct sum, executing the first branch of the algorithm returns, by Lemma 6.5, an independent set with expected weight at least
where the inequality follows from \(\overline{\rho }\le \rho _{\mathcal {M}}\) and \(\overline{N}_{i} = D(N, \overline{\lambda }_{i})\) for every \(i \in [m]\).
Therefore, if \(i \in \overline{\Lambda }\), then the corresponding term \(r_{i} \eta (\overline{\lambda }_{i})\) in \(F(\overline{\rho })\) is accounted for by branch (i). Thus it only remains to consider \(i \in [m] {\setminus } \overline{\Lambda }\subseteq \{1,m\}\).
Assume first that \(1 \notin \overline{\Lambda }\). In this case, we must have \(r(\overline{N}_1)<24\). Since the expected weight yielded by running the classical secretary algorithm is at least \({\eta (|N|)}/{e}\), and \(\eta (|N|) \ge \eta (\overline{\lambda }_{1})\), running branch (ii) returns a set of expected weight at least
\(\frac{\eta (|N|)}{e} \ge \frac{\eta (\overline{\lambda }_{1})}{e} \ge \frac{1}{23e}\, r_{1}\, \eta (\overline{\lambda }_{1}),\)
where the last inequality follows from \(r_1 \le r(\overline{N}_1) \le 23\).
Finally, assume that \(m \notin \overline{\Lambda }\). Then \(\overline{\lambda }_{m} = 1\), in which case running branch (iii) yields an independent set I with
\(\mathbb {E}[w(I)] \ge \frac{r(N)}{2e}\, \eta (1) \ge \frac{r(\overline{N}_{m})}{2e}\, \eta (\overline{\lambda }_{m}),\)
where the first inequality holds by Lemma 6.5 with \(h = 1\) and \(s = r(N) \ge 1\), as any basis of \(\mathcal {M}\) is an independent set of rank r(N), and the second inequality holds because \(r(N) \ge r(\overline{N}_{m})\) and \(\overline{\lambda }_{m} = 1\).
The desired lower bound on the expected weight of the set returned by the algorithm now follows by combining the above results with the respective probabilities that each branch is executed. \(\square \)
Finally, we argue that our main result (i.e., Theorem 1.2) still holds in the more general adversarial order with a sample setting, where we are allowed to sample a set \(S \subseteq N\) containing every element of N independently with probability \({1}/{2}\), and where the remaining (non-sampled) elements arrive in adversarial order.
In order to see this, first note that the only place in the proof of Theorem 6.1 where we use that the non-sampled elements (i.e., \(N\setminus S\)) arrive in random order, is to argue that when running the classical secretary algorithm in branch (ii) we obtain an expected weight of at least \({w_{\max }}/{e}\). Indeed, branches (i) and (iii) rely on running the procedure from Lemma 6.5, whose guarantees hold in the case where the elements arrive in adversarial order. However, note that running the classical secretary procedure in the above adversarial order with a sample setting outputs an element with expected weight of at least \({w_{\max }}/{4}\). Indeed, the probability of selecting \(w_{\max }\) in the latter setting is at least the probability of the event that \(w_{\max }\) is not sampled and the second largest weight is; which occurs with probability \({1}/{4}\). Thus, Theorem 6.1 holds (up to possibly a slightly worse constant) in the adversarial order with a sample setting.
Next, observe that this implies that Theorem 3.5 also holds in the above setting (again, up to possibly a slightly worse constant). This follows because its proof relies on combining the procedures from Theorems 6.1 and 6.2, and the latter is completely oblivious to the arrival order of the elements.
Finally, note that the proof of Theorem 1.2 uses the procedure from Theorem 6.1 and the classical secretary algorithm. Because (as discussed above) both of these algorithms have very similar guarantees in the adversarial order with a sample setting to the ones shown in this paper for random order, the claim follows.
6.2 Full algorithm and adversarial order with a sample setting
We next summarize the full algorithm used to obtain Theorem 1.2. First, the algorithm executes one of the following two branches uniformly at random:
(1) Run the classical secretary algorithm on N.
(2) Run the following procedure:
- Sample a set S (without selecting anything) containing every element of N with probability \({1}/{2}\) independently.
- Define \(\tilde{\rho }\) to be the (288, 9)-downshift of \(\rho _{\left. \mathcal {M}\right| _{S}}\).
- Run the procedure of Theorem 3.5 on the remaining non-sampled elements (i.e., on the matroid \(\left. \mathcal {M}\right| _{N \setminus S}\)) using as input the curve \(\tilde{\rho }\), and parameters \(\alpha = 288^2\) and \(\beta = 9^2\).
The procedure from Theorem 3.5 used above consists of two main steps: First, it runs the algorithm from Theorem 6.2 on the curve \(\tilde{\rho }\) with parameters \(\alpha = 288^2\) and \(\beta = 9^2\), to find well-structured curves \(\overline{\rho }_1, \overline{\rho }_2, \overline{\rho }_3\), and \(\overline{\rho }_4\). Then, it selects one \(\overline{\rho }_j\) out of these four curves uniformly at random, and runs the procedure from Theorem 6.1 on the matroid \(\left. \mathcal {M}\right| _{N \setminus S}\) using as input such \(\overline{\rho }_j\) and \(\beta = 9^2\). The latter algorithm consists of executing one of the following three branches with probability \({2}/{15}\), \({1}/{15}\), and \({12}/{15}\), respectively:
(2.i) Run the classical secretary algorithm on \(N \setminus S\).
(2.ii) Run \({\textrm{OSP}}(\left. \mathcal {M}\right| _{N \setminus S}, 1)\).
(2.iii) Sample a set \(S'\) (without selecting anything) containing every element of \(N \setminus S\) with probability \({1}/{2}\) independently, construct the chain \(\bigoplus _{i = 1}^{m} \mathcal {M}_{i}\) as defined in (18) using \(\overline{\rho }_j\) and \(\beta = 9^2\) as input, and run \({\textrm{OSP}}(\mathcal {M}_{i}, \overline{\lambda }_{i})\) for every \(i \in [m]\).
The above completes the description of the full algorithm. In summary, the algorithm executes one of the following four options: branch (1) with probability \({1}/{2}\), branch (2.i) with probability \({1}/{15}\), branch (2.ii) with probability \({1}/{30}\), and branch (2.iii) with probability \({2}/{5}\). Hence, either the classical secretary algorithm is executed (on branch (1) or (2.i)), or the \({\textrm{OSP}}\) procedure from Algorithm 2 is executed (on branch (2.ii) or (2.iii)). The guarantees of the latter hold even if the elements arrive in adversarial order (see Lemma 6.5). Moreover, because of the random assignment setting, the standard guarantees for the classical secretary algorithm still hold under adversarial arrival order. Indeed, since weights are assigned uniformly at random to elements after the arrival order has been fixed, this setting is equivalent to the one where weights are assigned to elements adversarially but the arrival order is uniformly random.
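As a compact summary of this branch structure, the following minimal sketch (names are our own) shows the top-level randomization with the probabilities just derived:

```python
import random

def choose_branch(rng=random.random):
    """Pick one of the four executed options: (1) w.p. 1/2, (2.i) w.p. 1/15,
    (2.ii) w.p. 1/30 and (2.iii) w.p. 2/5 (= 1/2 * 12/15)."""
    u = rng()
    if u < 1/2:
        return "1"      # classical secretary on N
    if u < 1/2 + 1/15:
        return "2.i"    # classical secretary on N \ S
    if u < 1/2 + 1/15 + 1/30:
        return "2.ii"   # OSP(M|_{N\S}, 1)
    return "2.iii"      # chain of OSP(M_i, lambda_i) runs
```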
We now discuss how to adapt the above procedure and its analysis to the adversarial order with a sample setting. In this setting, the algorithm must specify a (possibly random) sampling probability \(p \in [0,1]\). The instance is then revealed to the algorithm in two phases. First, a random set S containing every element of N with probability p independently is provided to the algorithm. However, the algorithm is not allowed to select any element from S. In the second phase, the elements of \(N \setminus S\) are then revealed in an adversarial order.
Our modified algorithm works as follows. First, it chooses one of the four options with the respective probabilities (i.e., \({1}/{2}\), \({1}/{15}\), \({1}/{30}\), and \({2}/{5}\)). Then:
If branch (1) is chosen, the algorithm sets \(p={1}/{e}\) for the sampling phase to obtain a sample set S. It then uses this sample set S to choose a threshold, and picks the first element arriving in \(N \setminus S\) with weight above the threshold (if any). As discussed above, since for the classical secretary problem the random-assignment adversarial-order setting is equivalent to the standard adversarial-assignment random-order setting, this procedure outputs an element of expected weight at least \({w_{\max }}/{e}\), and is thus equivalent to running branch (1) in the original algorithm.
If branch (2.i) is chosen, the algorithm sets \(p={(e+1)}/{2e}\) for the sampling phase to obtain a sample set S. This is equivalent to sampling first over N with probability \({1}/{2}\) as done in (2), and then sampling another \({1}/{e}\)-fraction over the remaining elements as done in (2.i). Let \(S'\) be a random subset of S obtained by subsampling each element of S with probability \({1}/{(e+1)}\) independently. Note that \(S'\) is then a random set containing each element of N with probability \({1}/{2e}\) independently. We then use \(S'\) to set a threshold, and select the first element arriving in \(N {\setminus } S\) with weight above the threshold (if any). This procedure is equivalent to running branch (2.i) in the original algorithm.
If branch (2.ii) is chosen, the algorithm sets \(p={1}/{2}\) to obtain a sample set S, and then runs \({\textrm{OSP}}(\left. \mathcal {M}\right| _{N \setminus S}, 1)\) on the remaining non-sampled elements. This does not have any impact on the uniform assignment of the weights w to the elements, since the construction depends only on which elements were sampled, not on their weights. Thus, the RA-MSP subinstance given by \(\left. \mathcal {M}\right| _{N\setminus S}\) on which we use OSP indeed assigns weights of w uniformly at random to elements, as required. Since the guarantees of OSP hold under adversarial arrival order, this is equivalent to running branch (2.ii) in the original algorithm.
Finally, if branch (2.iii) is chosen, the algorithm sets \(p={3}/{4}\) to obtain a sample set \(\overline{S}\). Let \(S'\) be a random subset of \(\overline{S}\) obtained by subsampling each element of \(\overline{S}\) with probability \({2}/{3}\) independently. Note that \(S'\) is then a random set containing each element of N with probability \({1}/{2}\) independently, while \(\tilde{S} {:=}\overline{S} \setminus S'\) is a random set containing each element of N with probability \({1}/{4}\). We then use \(S'\) to simulate the sample set S used in branch (2) of the original algorithm, and \(\tilde{S}\) to simulate the sample set \(S'\) used in branch (2.iii) of the original algorithm. As discussed in the case above, this construction does not have any impact on the uniform assignment of the weights w to the elements. Thus, the RA-MSP subinstance given by \(\left. \mathcal {M}\right| _{N\setminus \overline{S}}\) on which we use OSP indeed assigns weights of w uniformly at random to elements, as required. Finally, using that the guarantees of OSP hold under adversarial arrival order, it follows that this procedure is equivalent to running branch (2.iii) in the original algorithm.
Notes
A matroid \(\mathcal {M}\) is a pair \(\mathcal {M}= (N, \mathcal {I})\) where N is a finite set and \(\mathcal {I}\subseteq 2^{N}\) is a nonempty family satisfying: 1) if \(A \subseteq B\) and \(B \in \mathcal {I}\) then \(A \in \mathcal {I}\), and 2) if \(A, B \in \mathcal {I}\) and \(|B| > |A|\) then \(\exists e \in B {\setminus } A\) such that \(A \cup \{e\} \in \mathcal {I}\).
The rank function \(r :2^{N} \rightarrow \mathbb {Z}_{\ge 0}\) assigns to any set \(U \subseteq N\) the cardinality of a maximum cardinality independent set in U, i.e., \(r(U) {:=}\max \{|I| :I \subseteq U, I \in \mathcal {I}\}\).
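For illustration, the rank function of footnote 2 can be evaluated by brute force from an independence oracle. The snippet below (exponential time, illustration only, with a uniform-matroid oracle as our assumed example) does exactly that.

```python
from itertools import combinations

def rank(U, is_independent):
    """Size of a largest independent subset of U, per footnote 2."""
    U = list(U)
    for k in range(len(U), -1, -1):
        if any(is_independent(set(c)) for c in combinations(U, k)):
            return k
    return 0

# Example: the uniform matroid U(4, 2), where a set is independent
# iff it has at most two elements.
assert rank({1, 2, 3, 4}, lambda A: len(A) <= 2) == 2
```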
References
Babaioff, M., Immorlica, N., Kleinberg, R.: Matroids, secretary problems, and online mechanisms. In: Symposium on Discrete Algorithms (SODA), pp. 434–443 (2007). https://doi.org/10.5555/1283383.1283429
Dynkin, E.B.: The optimum choice of the instant for stopping a Markov process. Sov. Math. 4, 627–629 (1963)
Lachish, O.: \(O(\log \log ({{\rm rank}}))\) competitive ratio for the matroid secretary problem. In: 2014 IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS), pp. 326–335. IEEE (2014). https://doi.org/10.1109/FOCS.2014.42
Feldman, M., Svensson, O., Zenklusen, R.: A simple \(O(\log \log ({{\rm rank}}))\)-competitive algorithm for the matroid secretary problem. Math. Oper. Res. 43(2), 638–650 (2018). https://doi.org/10.1287/moor.2017.0876
Babaioff, M., Immorlica, N., Kempe, D., Kleinberg, R.: Matroid secretary problems. J. ACM 65(6), 1–26 (2018). https://doi.org/10.1145/3212512
Korula, N., Pál, M.: Algorithms for secretary problems on graphs and hypergraphs. In: Proceedings of the 36th International Colloquium on Automata, Languages and Programming (ICALP), pp. 508–520 (2009). https://doi.org/10.1007/978-3-642-02930-1_42
Im, S., Wang, Y.: Secretary problems: Laminar matroid and interval scheduling. In: Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1265–1274 (2011). https://doi.org/10.5555/2133036.2133132
Soto, J.A.: Matroid secretary problem in the random-assignment model. SIAM J. Comput. 42(1), 178–211 (2013). https://doi.org/10.1137/110852061
Jaillet, P., Soto, J.A., Zenklusen, R.: Advances on matroid secretary problems: Free order model and laminar case. In: International Conference on Integer Programming and Combinatorial Optimization (IPCO), pp. 254–265. Springer (2013). https://doi.org/10.1007/978-3-642-36694-9_22
Ma, T., Tang, B., Wang, Y.: The simulated greedy algorithm for several submodular matroid secretary problems. Theory Comput. Syst. 58(4), 681–706 (2016). https://doi.org/10.1007/s00224-015-9642-4
Dimitrov, N.B., Plaxton, C.G.: Competitive weighted matching in transversal matroids. Algorithmica 62(1–2), 333–348 (2012). https://doi.org/10.1007/s00453-010-9457-2
Kesselheim, T., Radke, K., Tönnis, A., Vöcking, B.: An optimal online algorithm for weighted bipartite matching and extensions to combinatorial auctions. In: Proceedings of the 21st Annual European Symposium on Algorithms (ESA), pp. 589–600 (2013). https://doi.org/10.1007/978-3-642-40450-4_50
Dinitz, M., Kortsarz, G.: Matroid secretary for regular and decomposable matroids. SIAM J. Comput. 43(5), 1807–1830 (2014). https://doi.org/10.1137/13094030X
Oveis Gharan, S., Vondrák, J.: On variants of the matroid secretary problem. Algorithmica 67(4), 472–497 (2013). https://doi.org/10.1007/s00453-013-9795-y
Chakraborty, S., Lachish, O.: Improved competitive ratio for the matroid secretary problem. In: Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1702–1712 (2012). https://doi.org/10.1137/1.9781611973099.135
Karger, D.: Random sampling and greedy sparsification for matroid optimization problems. Math. Program. (1998). https://doi.org/10.1007/BF01585865
Acknowledgements
This project received funding from Swiss National Science Foundation grant 200021_184622 and the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 817750).
An extended abstract of this article appeared in the proceedings of Integer Programming and Combinatorial Optimization (IPCO) 2023.
Appendix A Omitted proofs from Sect. 5
The following result is used to show property (ii) in the proof of (5) in Theorem 5.1.
Lemma A.1
Let \(\mathcal {M}= (N, \mathcal {I})\) be a matroid, let \(\mathcal {M}_{h} = (N, \mathcal {I}_{h})\) denote its h-fold union for some \(h \in \mathbb {Z}_{\ge 1}\), and let \(r_{h}\) denote the rank function of \(\mathcal {M}_{h}\). Let \(Q \subseteq N\) and \(e \in N \setminus Q\) be such that \(r_{h}(Q \cup \{e\}) = r_{h}(Q)\). Then \(e \in {\textrm{span}}(D(Q, h))\), where D(Q, h) is as defined in Eq. (1).
Proof
Let r denote the rank function of \(\mathcal {M}\). By the matroid partitioning theorem of Nash-Williams, we have
\(r_{h}(Q) = \min _{A \subseteq Q} \left( |Q {\setminus } A| + h\, r(A)\right) \quad \text {for all } Q \subseteq N.\)  (A1)
Let \(B^{*} \subseteq Q\) be such that \(r_{h}(Q \cup \{e\}) = |(Q \cup \{e\}) {\setminus } B^{*}| + h r(B^{*})\). Note that \(e \in B^{*}\), as otherwise
\(r_{h}(Q) \le |Q {\setminus } B^{*}| + h\, r(B^{*}) = |(Q \cup \{e\}) {\setminus } B^{*}| + h\, r(B^{*}) - 1 = r_{h}(Q \cup \{e\}) - 1,\)
where the first inequality follows from (A1); however, this contradicts \(r_{h}(Q \cup \{e\}) = r_{h}(Q)\). Additionally, note that \(r(B^{*} \setminus \{e\}) = r(B^{*})\), as otherwise
\(r_{h}(Q) \le |Q {\setminus } (B^{*} {\setminus } \{e\})| + h\, r(B^{*} {\setminus } \{e\}) \le |(Q \cup \{e\}) {\setminus } B^{*}| + h\, r(B^{*}) - h = r_{h}(Q \cup \{e\}) - h,\)
where the first inequality uses again (A1); this again contradicts \(r_{h}(Q \cup \{e\}) = r_{h}(Q)\). Moreover, by putting \(r_{h}(Q \cup \{e\}) = r_{h}(Q)\) together with \(e \in B^{*}\) and \(r(B^{*} {\setminus } \{e\}) = r(B^{*})\), we get:
\(r_{h}(Q) = r_{h}(Q \cup \{e\}) = |(Q \cup \{e\}) {\setminus } B^{*}| + h\, r(B^{*}) = |Q {\setminus } (B^{*} {\setminus } \{e\})| + h\, r(B^{*} {\setminus } \{e\}).\)
This implies, due to (A1), that the set \(A^{*} {:=}B^{*} {\setminus } \{e\}\) is a minimizer of \(|Q{\setminus } A| + h r(A) = |Q| - (|A| - h r(A))\) over all \(A\subseteq Q\). Equivalently, \(A^*\) is therefore a maximizer of \(|A| - h r(A)\) over all \(A\subseteq Q\); hence, \(A^* \subseteq D(Q,h)\), because D(Q, h) is the unique maximal maximizer of the same expression. Since \(r(A^{*} \cup \{e\}) = r(B^{*}) = r(A^{*})\), we have \(e \in {\textrm{span}}(A^{*})\), and we thus get \(e \in {\textrm{span}}(D(Q, h))\), as desired. \(\square \)
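To sanity-check Eq. (A1) on a concrete instance, the following brute-force snippet (our own illustration, exponential time) evaluates the right-hand side of (A1) for a uniform matroid \(U_{n,c}\), whose h-fold union is known to be \(U_{n,\min (n, hc)}\):

```python
from itertools import combinations

def union_rank_rhs(Q, r, h):
    """Right-hand side of (A1): min over A subset of Q of |Q \ A| + h*r(A)."""
    Q = list(Q)
    best = len(Q)  # the choice A = empty set
    for k in range(1, len(Q) + 1):
        for A in combinations(Q, k):
            best = min(best, len(Q) - k + h * r(set(A)))
    return best

# Uniform matroid U(n, c): r(A) = min(|A|, c); its h-fold union has rank
# function Q -> min(|Q|, h*c), which (A1) should reproduce.
n, c, h = 6, 2, 2
r = lambda A: min(len(A), c)
for q in range(n + 1):
    assert union_rank_rhs(set(range(q)), r, h) == min(q, h * c)
```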