
1 Introduction

Boolean functions and pseudo-Boolean optimization methods are key analytical tools in a variety of theoretical and applicative scenarios [1,2,3]. Pseudo-Boolean functions are in fact set functions, taking real values over the subsets of a finite set, and their polynomial multilinear extension, or MLE for short, allows several discrete optimization problems to be dealt with in a continuous setting. This paper proposes a novel approach that includes among such problems those where the objective function takes real values over collections of pairwise disjoint subsets. In particular, these collections are evaluated by summing the values taken on their elements by a set function, and the corresponding combinatorial optimization problems are set packing and set partitioning.

For a ground set \(N=\{1,\ldots ,n\}\) and a family \(\mathcal F\subseteq 2^N=\{A:A\subseteq N\}\) of feasible subsets, the standard packing problem is to find a largest subfamily \(\mathcal F^*\subseteq \mathcal F\) whose members are pairwise disjoint, i.e. \(A\cap B=\emptyset \) for all distinct \(A,B\in \mathcal F^*\). In addition, a set function \(w:\mathcal F\rightarrow \mathbb R_+\) may assign weights to feasible subsets, in which case optimality is attained at those subfamilies \(\mathcal F^*\) with maximum weight \(W(\mathcal F^*)=\sum _{A\in \mathcal F^*}w(A)\). If \(w(A)=1\) for all \(A\in \mathcal F\), then of course largest subfamilies have maximum weight. Accordingly, the proposed approach relies on using the MLE of set function w in order to evaluate families of fuzzy feasible subsets. In this way, the search for locally optimal subfamilies \(\mathcal F^*\) can be undertaken in a continuous domain.

In combinatorial optimization [5], set packing is very important from both theoretical and applicative perspectives. The problem considered in computational complexity is to approximate optimal solutions within a provable bounded factor by means of efficient algorithms. In particular, the concern is mostly with non-approximability results for k-set packing [6], where every feasible subset has size no greater than some \(k\ll n\) (and unit weight as above). Also, if all feasible subsets have size \(k=2\), then set packing reduces to maximum matching for a graph with N as vertex set, in which case there is an algorithm with polynomial running time whose output is an exact solution [7]. More generally, k-set packing with \(k>2\) is equivalent to vertex colouring in hypergraphs [8], and in the k-uniform and d-regular case every feasible subset \(A\in \mathcal F\) has size \(|A|=k\), while elements \(i\in N\) of the ground set are each contained in \(d>1\) feasible subsets, i.e. \(|\{A:i\in A\in \mathcal F\}|=d\).

As for applications, combinatorial auctions provide a main example: elements \(i\in N\) are items to be sold with the objective of maximizing the revenue, accepting bids on any subset or bundle (or combination) \(A\in 2^N\). If the maximum received bid on each bundle is regarded as a weight, then optimization may be dealt with as a maximum-weight set packing problem [9]. Since the search space is exponentially large, time constraints often lead to the use of heuristics (i.e. methods without worst-case guarantees, rather than approximation algorithms) or simultaneous ascending auctions, where each item is sold independently and bids may be updated over a predetermined time period [10].

Throughout this work, set packing is approached by expanding traditional pseudo-Boolean optimization methods [1] into a novel near-Boolean one. Specifically, while the MLE of set functions is commonly employed to enlarge the domain of each of the n variables from \(\{0,1\}\) to \([0,1]\), the proposed technique relies on n variables each ranging over the \(2^{n-1}\)-set of extreme points of a unit simplex, and uses the MLE to include the continuum provided by the whole simplex. As usual, elements of the ground set or integers \(i\in N\) are the indices of the n variables. On the other hand, the n involved \((2^{n-1}-1)\)-dimensional unit simplices \(\varDelta _i,i\in N\) have their \(2^{n-1}\) extreme points indexed by those (feasible) subsets where each \(i\in N\) is included. The resulting near-Boolean function takes values on the n-fold product \(\times _{i\in N}ex(\varDelta _i)\), where \(ex(\varDelta _i)\) is the \(2^{n-1}\)-set of extreme points of \(\varDelta _i\), while its MLE evaluates collections of fuzzy subsets of N or, equivalently, fuzzy subfamilies of (feasible) subsets. Such an MLE is precisely the objective function to be maximized through a gradient-based local search.

It can be mentioned that maximum-weight set packing may be tackled via constrained maximization of pseudo-Boolean function \(v:\{0,1\}^{|\mathcal F|}\rightarrow \mathbb R_+\) with \(v\left( x_{A_1},\ldots ,x_{A_{|\mathcal F|}}\right) =\sum _{1\le k\le |\mathcal F|}x_{A_k}w(A_k)\) s.t. \(A_k\cap A_l\ne \emptyset \Rightarrow x_{A_k}+x_{A_l}\le 1\) for all \(1\le k<l\le |\mathcal F|\), where \(x_A\in \{0,1\}\) for all \(A\in \mathcal F=\{A_1,\ldots ,A_{|\mathcal F|}\}\). In fact, a suitable \(|\mathcal F|\times |\mathcal F|\)-matrix M allows v to be replaced with \(xMx\simeq v\), where \(x=(x_{A_1},\ldots ,x_{A_{|\mathcal F|}})\). Then, a constrained maximizer \(x^*\) can be found by means of a heuristic, the corresponding solution being \(\mathcal F^*=\{A:x_A^*=1\}\). Such an approach [11] is very different from what is proposed here, in that v is an objective function of \(|\mathcal F|\) constrained Boolean variables, while the expanded MLE described above is a function of n unconstrained near-Boolean variables.
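For comparison, the constrained formulation just described can be sketched on a tiny instance. The family and weights below are hypothetical (they do not come from the paper), and exhaustive enumeration is used in place of the heuristic of [11], so this is only an illustration of the objective v and of the pairwise constraints.

```python
from itertools import combinations, product

# Hypothetical toy instance (not from the paper): five feasible subsets.
family = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4}),
          frozenset({1, 4}), frozenset({1})]
weights = [3, 2, 3, 2, 1]  # weights[k] plays the role of w(A_k)

def is_feasible(x):
    """True if x_{A_k} + x_{A_l} <= 1 for every intersecting pair A_k, A_l."""
    return all(x[k] + x[l] <= 1
               for k, l in combinations(range(len(family)), 2)
               if family[k] & family[l])

def v(x):
    """Linear objective v(x) = sum_k x_{A_k} w(A_k)."""
    return sum(bit * wk for bit, wk in zip(x, weights))

# Exhaustive search over {0,1}^{|F|}; only viable for very small families.
candidates = (x for x in product((0, 1), repeat=len(family)) if is_feasible(x))
best_x = max(candidates, key=v)
packing = {family[k] for k, bit in enumerate(best_x) if bit}
print(v(best_x), sorted(map(sorted, packing)))
```

The number of Boolean variables here equals \(|\mathcal F|\), which is the feature that distinguishes this formulation from the near-Boolean one with n variables.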

The following section comprehensively frames the case where \(\mathcal F=2^N\), applying to set partitioning. Section 3 then shows that by introducing the empty set \(\emptyset \) and all singletons \(\{i\}\in 2^N,i\in N\) into the family \(\mathcal F\subset 2^N\) of feasible subsets (with null weights \(w(\emptyset )=0=w(\{i\})\) if \(\{i\}\notin \mathcal F\)) the proposed method also applies to set packing. The gradient-based local search for set packing is detailed in Sect. 4, where a cost function \(c:\mathcal F^t\rightarrow \mathbb N\) (\(\mathcal F^t\subseteq \mathcal F\)) also enters the picture, in line with greedy approaches [12]. At any iteration \(t=0,1,\ldots \), the cost of including a still available feasible subset in the packing is the number of still available feasible subsets that the former intersects. Section 5 details near-Boolean functions, showing that there is a continuum of equivalent polynomials for MLE, with common degree and varying coefficients, and introduces the MLE of partition functions. Section 6 provides a novel modeling for coalition formation games, with N as player set. Finally, Sect. 7 contains the conclusions.

2 Set Partitioning: Full-Dimensional Case

The \(2^n\)-set \(\{0,1\}^n\) of vertices of the n-dimensional unit hypercube \([0,1]^n\) corresponds bijectively to power set \(2^N\), as characteristic functions \(\chi _A:N\rightarrow \{0,1\}\), \(A\in 2^N\) are defined by \(\chi _A(i)=1\) if \(i\in A\) and \(\chi _A(i)=0\) if \(i\in N\backslash A=A^c\). Also, zeta function \(\zeta :2^N\times 2^N\rightarrow \mathbb R\) is the element of the incidence algebra [13, 14] of Boolean lattice \((2^N,\cap ,\cup )\) defined by \(\zeta (A,B)=1\) if \(B\supseteq A\) and \(\zeta (A,B)=0\) if \(B\not \supseteq A\). The collection \(\{\zeta (A,\cdot ):A\in 2^N\}\) is a linear basis of the (free) vector space [13, p. 181] of real-valued functions w on \(2^N\), meaning that linear combination \(w(B)=\sum _{A\in 2^N}\mu ^w(A)\zeta (A,B)=\sum _{A\subseteq B}\mu ^w(A)\) for all \(B\in 2^N\) applies to any w, with Möbius inversion \(\mu ^w:2^N\rightarrow \mathbb R\) given by

$$\begin{aligned} \mu ^w(A)=\sum _{B\in 2^N}\mu (B,A)w(B)=\sum _{B\subseteq A}(-1)^{|A\backslash B|}w(B)=w(A)-\sum _{B\subset A}\mu ^w(B)\text{, } \end{aligned}$$

where \(\subset \) denotes strict inclusion while \(\mu :2^N\times 2^N\rightarrow \mathbb R\) is the Möbius function, i.e. the inverse of \(\zeta \) in the incidence algebra of \((2^N,\cap ,\cup )\), defined as follows:

$$\begin{aligned} \mu (B,A)=\left\{ \begin{array}{c}(-1)^{|A\backslash B|}\text { if }B\subseteq A\text{, }\\ 0\text { if }B\not \subseteq A\text {,}\end{array} \right. \text {for all }A,B\in 2^N\text{. } \end{aligned}$$

By means of this essential combinatorial “analog of the fundamental theorem of the calculus” [14], the MLE \(f^w:[0,1]^n\rightarrow \mathbb R\) of w takes values

$$\begin{aligned} f^w(\chi _B)=\sum _{A\in 2^N}\left( \prod _{i\in A}\chi _B(i)\right) \mu ^w(A)=\sum _{A\subseteq B}\mu ^w(A)=w(B)\text { on vertices, and } \end{aligned}$$
$$\begin{aligned} f^w(x)=\sum _{A\in 2^N}\left( \prod _{i\in A}x_i\right) \mu ^w(A) \end{aligned}$$
(1)

on any point \(x=(x_1,\ldots ,x_n)\in [0,1]^n\). Conventionally, \(\mathop {{\prod }}\limits _{i\in \emptyset } x_i:=1\) [1, p. 157].
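As a minimal sketch of Möbius inversion and of expression (1), the following code computes \(\mu ^w\) by the alternating-sum formula and evaluates \(f^w\) at arbitrary points. The set function `w` (with \(w(A)=|A|^2\)) and the helper names are assumptions made only for illustration; subsets are represented as frozensets.

```python
from itertools import combinations
from math import prod

def subsets(ground):
    """All subsets of `ground`, as frozensets, by increasing size."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def mobius_inversion(w, ground):
    """mu^w(A) = sum over B subseteq A of (-1)^{|A minus B|} w(B)."""
    return {A: sum((-1) ** len(A - B) * w[B] for B in subsets(A))
            for A in subsets(ground)}

def mle(mu, x):
    """f^w(x) = sum_A (prod_{i in A} x_i) mu^w(A); x maps element i to x_i.

    math.prod returns 1 on an empty product, matching the stated convention."""
    return sum(prod(x[i] for i in A) * m for A, m in mu.items())

N = {1, 2, 3}
# Hypothetical set function, chosen only for illustration: w(A) = |A|^2.
w = {A: len(A) ** 2 for A in subsets(N)}
mu = mobius_inversion(w, N)

# On hypercube vertices chi_B the extension returns w(B).
vertex_values = {B: mle(mu, {i: int(i in B) for i in N}) for B in subsets(N)}
# Inside the hypercube it interpolates multilinearly.
midpoint = mle(mu, {i: 0.5 for i in N})
```

Enumerating all \(2^n\) subsets is only practical for small n; the sketch is meant to mirror the formulas, not to be efficient.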

Let \(2^N_i=\{A:i\in A\in 2^N\}=\left\{ A_1,\ldots ,A_{2^{n-1}}\right\} \) be the \(2^{n-1}\)-set of subsets containing each \(i\in N\). Unit simplex

$$\begin{aligned} \varDelta _i=\left\{ \left( q_i^{A_1},\ldots ,q_i^{A_{2^{n-1}}}\right) :q_i^{A_k}\ge 0\text { for }1\le k\le 2^{n-1},\sum _{1\le k\le 2^{n-1}}q_i^{A_k}=1\right\} \end{aligned}$$

has dimension \(2^{n-1}-1\) and generic point \(q_i\in \varDelta _i\).

Definition 1

A fuzzy cover \(\mathbf q \) specifies, for each \(i\in N\), a membership distribution over \(2^N_i\), i.e. \(\mathbf q =(q_1,\ldots ,q_n)\in \varDelta _N=\underset{1\le i\le n}{\times }\text { }\varDelta _i\).

Equivalently, \(\mathbf q =\left\{ q^A:\emptyset \ne A\in 2^N,q^A\in [0,1]^n\right\} \) is a \(2^n-1\)-set whose elements \(q^A=\left( q^A_1,\ldots ,q^A_n\right) \) are n-vectors corresponding to non-empty subsets \(A\in 2^N\) and specifying a membership \(q_i^A\) for each \(i\in N\), with \(q_i^A\in [0,1]\) if \(i\in A\) while \(q_i^A=0\) if \(i\in A^c\). Since fuzzy covers are collections of points in \([0,1]^n\) and the MLE \(f^w\) of w is meant precisely to evaluate such points, the global worth \(W(\mathbf q )\) of \(\mathbf q \in \varDelta _N\) is the sum over all \(q^A,A\in 2^N\) of \(f^w(q^A)\) as defined by (1). That is,

$$\begin{aligned} W(\mathbf q )=\sum _{A\in 2^N}f^w(q^A)=\sum _{A\in 2^N}\left[ \sum _{B\subseteq A}\left( \prod _{i\in B}q_i^A\right) \mu ^w(B)\right] \text{, } \end{aligned}$$
$$\begin{aligned} \text {or equivalently }W(\mathbf q )=\sum _{A\in 2^N}\left[ \sum _{B\supseteq A}\left( \prod _{i\in A}q_i^B\right) \right] \mu ^w(A)\text{. } \end{aligned}$$
(2)

Example: for \(N=\{1,2,3\}\), define w by \(w(\{1\})=w(\{2\})=w(\{3\})=0.2\), \(w(\{1,2\})=0.8\), \(w(\{1,3\})=0.3\), \(w(\{2,3\})=0.6\), \(w(N)=0.7\), and \(w(\emptyset )=0\). Membership distributions over \(2^N_i,i=1,2,3\) are \(q_1\in \varDelta _1,q_2\in \varDelta _2,q_3\in \varDelta _3\), with

$$\begin{aligned} q_1=\left( \begin{array}{c}q_1^1 \\ q_1^{12} \\ q_1^{13} \\ q_1^N\end{array}\right) \text {, } q_2=\left( \begin{array}{c}q_2^2 \\ q_2^{12} \\ q_2^{23} \\ q_2^N\end{array}\right) \text {, } q_3=\left( \begin{array}{c}q_3^3 \\ q_3^{13} \\ q_3^{23} \\ q_3^N\end{array}\right) \text{. } \end{aligned}$$

If \(\hat{q}_1^{12}=\hat{q}_2^{12}=1\), then any membership \(q_3\in \varDelta _3\) yields \(W(\hat{q}_1,\hat{q}_2,q_3)\)

$$\begin{aligned} =w(\{1,2\})+\left( q_3^3+q_3^{13}+q_3^{23}+q_3^N\right) \mu ^w(\{3\})=w(\{1,2\})+w(\{3\})=1\text{. } \end{aligned}$$

Therefore, there is a continuum of fuzzy covers achieving maximum worth, i.e. 1. In order to select the one \(\hat{\mathbf{q }}=(\hat{q}_1,\hat{q}_2,\hat{q}_3)\) where \(\hat{q}_3^3=1\), attention must be placed only on exact ones, defined hereafter.
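The computation in this example can be checked numerically. The sketch below evaluates \(W(\mathbf q )\) through expression (2) for the weights given above, with \(\hat{q}_1^{12}=\hat{q}_2^{12}=1\) and two arbitrarily chosen points of \(\varDelta _3\). The dictionary layout `q[i][A]` for \(q_i^A\) and the function names are illustrative assumptions.

```python
from itertools import combinations
from math import isclose, prod

def subsets(ground):
    """All subsets of `ground`, as frozensets, by increasing size."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def mobius_inversion(w, ground):
    """mu^w(A) = sum over B subseteq A of (-1)^{|A minus B|} w(B)."""
    return {A: sum((-1) ** len(A - B) * w[B] for B in subsets(A))
            for A in subsets(ground)}

def worth(mu, q, ground):
    """W(q) = sum over non-empty A of f^w(q^A), where q^A_i = q[i][A] for i in A.

    The empty set is skipped, which assumes w(emptyset) = 0 as in the example."""
    return sum(prod(q[i][A] for i in B) * mu[B]
               for A in subsets(ground) if A
               for B in subsets(A))

S = lambda *xs: frozenset(xs)
N = S(1, 2, 3)
w = {S(): 0.0, S(1): 0.2, S(2): 0.2, S(3): 0.2,
     S(1, 2): 0.8, S(1, 3): 0.3, S(2, 3): 0.6, S(1, 2, 3): 0.7}
mu = mobius_inversion(w, N)

def cover(q3):
    """Elements 1 and 2 put all their membership on {1,2}; q3 is free."""
    return {1: {S(1): 0, S(1, 2): 1, S(1, 3): 0, S(1, 2, 3): 0},
            2: {S(2): 0, S(1, 2): 1, S(2, 3): 0, S(1, 2, 3): 0},
            3: q3}

# Two arbitrary membership distributions for element 3.
mixed = worth(mu, cover({S(3): 0.1, S(1, 3): 0.2, S(2, 3): 0.3, S(1, 2, 3): 0.4}), N)
exact = worth(mu, cover({S(3): 1, S(1, 3): 0, S(2, 3): 0, S(1, 2, 3): 0}), N)
```

Both values agree up to floating-point rounding, which is the continuum of maximizers referred to in the text.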

For any fuzzy covers \(\mathbf q =\{q^A:\emptyset \ne A\in 2^N\}\) and \(\hat{\mathbf{q }}=\{\hat{q}^A:\emptyset \ne A\in 2^N\}\), define \(\hat{\mathbf{q }}\) to be a shrinking of \(\mathbf q \) if there is a subset A such that \(\sum _{i\in A}q_i^A>0\) and

$$\begin{aligned} \hat{q}^B_i= & {} \left\{ \begin{array}{c} q^B_i\text { if } B\not \subseteq A\\ 0\text { if }B=A\end{array}\right. \text { for all }B\in 2^N,i\in N\text{, }\\ \sum _{B\subset A}\hat{q}^B_i= & {} q^A_i+\sum _{B\subset A}q_i^B\text { for all }i\in A\text {.} \end{aligned}$$

In words, a shrinking reallocates the whole membership mass \(\sum _{i\in A}q_i^A>0\) from \(A\in 2^N\) to all proper subsets \(B\subset A\), involving all and only those elements \(i\in A\) with strictly positive membership \(q_i^A>0\).

Definition 2

Fuzzy cover \(\mathbf q \in \varDelta _N\) is exact if there is no shrinking \(\hat{\mathbf{q }}\) of \(\mathbf q \) such that \(W(\mathbf q )=W(\hat{\mathbf{q }})\) for all set functions w.

Proposition 1

If \(\mathbf q \) is exact, then \(\left| \left\{ i:q_i^A>0\right\} \right| \in \{0,|A|\}\) for all \(A\in 2^N\).

Proof

For \(\emptyset \subset A^+(\mathbf q )=\left\{ i:q_i^A>0\right\} \subset A\in 2^N\), let \(\alpha =|A^+(\mathbf q )|>1\) and note that \(f^w(q^A)=\underset{B\subseteq A^+(\mathbf q )}{\sum }\left( \underset{i\in B}{\prod }q_i^A\right) \mu ^w(B)\). Let shrinking \(\hat{\mathbf{q }}\), with \(\hat{q}^{B'}=q^{B'}\) if \(B'\not \in 2^{A^+(\mathbf q )}\), satisfy conditions

$$\begin{aligned} \sum _{B\in 2^N_i\cap 2^{A^+(\mathbf q )}}\hat{q}_i^B=q_i^A+\sum _{B\in 2^N_i\cap 2^{A^+(\mathbf q )}}q_i^B\text { for all }i\in A^+(\mathbf q ),B\ne \emptyset \text {, and} \end{aligned}$$
$$\begin{aligned} \prod _{i\in B}\hat{q}_i^B=\prod _{i\in B}q_i^B+\prod _{i\in B}q_i^A\text { for all }B\in 2^{A^+(\mathbf q )},|B|>1\text{. } \end{aligned}$$

These \(2^\alpha -1\) equations with \(\sum _{1\le k\le \alpha }k\left( {\begin{array}{c}\alpha \\ k\end{array}}\right) >2^{\alpha }\) variables \(\hat{q}_i^B,\emptyset \ne B\subseteq A^+(\mathbf q )\) admit a continuum of solutions, each providing precisely a shrinking \(\hat{\mathbf{q }}\) where

$$\begin{aligned} \sum _{B\in 2^{A^+(\mathbf q )}}f^w(\hat{q}^B)=f^w(q^A)+\sum _{B\in 2^{A^+(\mathbf q )}}f^w(q^B)\Rightarrow W(\mathbf q )=W(\hat{\mathbf{q }})\text{. } \end{aligned}$$

This entails that \(\mathbf q \) is not exact.    \(\square \)

Partitions of N are families \(P=\{A_1,\ldots ,A_{|P|}\}\subset 2^N\) of (non-empty) pairwise disjoint subsets called blocks [13], i.e. \(A_k\cap A_l=\emptyset ,1\le k<l\le |P|\), whose union satisfies \(A_1\cup \cdots \cup A_{|P|}=N\). Any P corresponds to the collection \(\{\chi _A:A\in P\}\) of those |P| hypercube vertices identified by the characteristic functions of its blocks. Partitions P can thus be regarded as \(\mathbf p \in \varDelta _N\) where \(p_i^A=1\) for all \(A\in P,i\in A\), i.e. exact fuzzy covers where each \(i\in N\) concentrates its whole membership on the unique \(A\in 2^N_i\) such that \(A\in P\).

Definition 3

Fuzzy partitions are exact fuzzy covers.

Objective function \(W:\varDelta _N\rightarrow \mathbb R\) given by expression (2) above includes among its extremizers (non-fuzzy) partitions. This expands a result from pseudo-Boolean optimization [1]. For all \(\mathbf q \in \varDelta _N,i\in N\), let \(\mathbf q =q_i|\mathbf q _{-i}\), with \(q_i\in \varDelta _i\) and \(\mathbf q _{-i}\in \varDelta _{N\backslash i}=\times _{j\in N\backslash i}\text { }\varDelta _j\). Then, for any w,

$$\begin{aligned} W(\mathbf q )= & {} \sum _{A\in 2^N_i}f^w(q^A)+\sum _{A'\in 2^N\backslash 2^N_i}f^w(q^{A'})\\= & {} \sum _{A\in 2^N_i}\sum _{B\subseteq A\backslash i}\left( \prod _{j\in B}q^A_j\right) \Big (q^A_i\mu ^w(B\cup i)+\mu ^w(B)\Big )\\+ & {} \sum _{A'\in 2^N\backslash 2^N_i}\sum _{B'\subseteq A'}\left( \prod _{j'\in B'}q^{A'}_{j'}\right) \mu ^w(B')\text{. } \end{aligned}$$

Now define

$$\begin{aligned} W_i(q_i|\mathbf q _{-i})= & {} \sum _{A\in 2^N_i}q^A_i\left[ \sum _{B\subseteq A\backslash i}\left( \prod _{j\in B}q^A_j\right) \mu ^w(B\cup i)\right] \text { and}\\ W_{-i}(\mathbf q _{-i})= & {} \sum _{A\in 2^N_i}\left[ \sum _{B\subseteq A\backslash i}\left( \prod _{j\in B}q^A_j\right) \mu ^w(B)\right] \\+ & {} \sum _{A'\in 2^N\backslash 2^N_i}\left[ \sum _{B'\subseteq A'}\left( \prod _{j'\in B'}q^{A'}_{j'}\right) \mu ^w(B')\right] \text {, yielding} \end{aligned}$$
$$\begin{aligned} W(\mathbf q )=W_i(q_i|\mathbf q _{-i})+W_{-i}(\mathbf q _{-i})\text {.} \end{aligned}$$
(3)

Proposition 2

For all \(\mathbf q \in \varDelta _N\), there are \(\underline{\mathbf{q }},\overline{\mathbf{q }}\in \varDelta _N\) such that:

  1. (i)

    \(W(\underline{\mathbf{q }})\le W(\mathbf q )\le W(\overline{\mathbf{q }})\), as well as

  2. (ii)

    \(\underline{q}_i,\overline{q}_i\in ex(\varDelta _i)\) for all \(i\in N\).

Proof

For \(i\in N,\mathbf q _{-i}\in \varDelta _{N\backslash i}\), define \(w_\mathbf{q _{-i}}:2^N_i\rightarrow \mathbb R\) by

$$\begin{aligned} w_\mathbf{q _{-i}}(A)=\sum _{B\subseteq A\backslash i}\left( \prod _{j\in B}q^A_j\right) \mu ^w(B\cup i)\text{. } \end{aligned}$$
(4)

Let \(\mathbb A^+_\mathbf{q _{-i}}=\arg \max w_\mathbf{q _{-i}}\) and \(\mathbb A^-_\mathbf{q _{-i}}=\arg \min w_\mathbf{q _{-i}}\), with \(\mathbb A^+_\mathbf{q _{-i}}\ne \emptyset \ne \mathbb A^-_\mathbf{q _{-i}}\) at all \(\mathbf q _{-i}\). Most importantly,

$$\begin{aligned} W_i(q_i|\mathbf q _{-i})=\sum _{A\in 2^N_i}\Big (q^A_i\cdot w_\mathbf{q _{-i}}(A)\Big )=\langle q_i,w_\mathbf{q _{-i}}\rangle \text{, } \end{aligned}$$
(5)

where \(\langle \cdot ,\cdot \rangle \) denotes scalar product. Thus for given membership distributions of all \(j\in N\backslash i\), global worth is affected by i’s membership distribution through a scalar product. In order to maximize (or minimize) W by suitably choosing \(q_i\) for given \(\mathbf q _{-i}\), the whole of i’s membership mass must be placed over \(\mathbb A^+_\mathbf{q _{-i}}\) (or \(\mathbb A^-_\mathbf{q _{-i}}\)), however it is distributed within that set. Hence there are precisely \(|\mathbb A^+_\mathbf{q _{-i}}|>0\) (or \(|\mathbb A^-_\mathbf{q _{-i}}|>0\)) available extreme points of \(\varDelta _i\). The following procedure selects (arbitrarily) one of them.

RoundUp \((w,\mathbf q )\)

Initialize: Set \(t=0\) and \(\mathbf q (0)=\mathbf q \).

Loop: While there is some \(i\in N\) with \(q_i(t)\not \in ex(\varDelta _i)\), set \(t=t+1\) and:

  1. (a)

    select some \(A^*\in \mathbb A^+_\mathbf{q _{-i}(t)}\),

  2. (b)

    define, for all \(j\in N,A\in 2^N\),

    $$\begin{aligned} q^A_j(t)= \left\{ \begin{array}{c}q_j^A(t-1)\text { if }j\ne i \\ 1\text { if }j=i\text { and } A=A^* \\ 0\text { otherwise}\end{array}\right. \text{. } \end{aligned}$$

Output: Set \(\overline{\mathbf{q }}=\mathbf q (t)\).

Every change \(q_i^A(t-1)\ne q_i^A(t)=1\) (for any \(i\in N,A\in 2^N_i\)) induces a non-decreasing variation \(W(\mathbf q (t))-W(\mathbf q (t-1))\ge 0\). Hence, the sought \(\overline{\mathbf{q }}\) is provided in at most n iterations. Analogously, replacing \(\mathbb A^+_\mathbf{q _{-i}}\) with \(\mathbb A^-_\mathbf{q _{-i}}\) yields the sought minimizer \(\underline{\mathbf{q }}\) (see also [1, p. 163]).    \(\square \)
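A minimal sketch of RoundUp follows, assuming the dictionary layout `q[i][A]` for \(q_i^A\) and exact rational arithmetic; the function names are illustrative. It uses the weights of the example in this section with the uniform distribution as input, and ties in \(\mathbb A^+_\mathbf{q _{-i}}\) are broken by taking the first maximizer found.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def subsets(ground):
    """All subsets of `ground`, as frozensets, by increasing size."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def mobius_inversion(w, ground):
    return {A: sum((-1) ** len(A - B) * w[B] for B in subsets(A))
            for A in subsets(ground)}

def worth(mu, q, ground):
    """W(q) as in expression (2), assuming w(emptyset) = 0."""
    return sum(prod(q[i][A] for i in B) * mu[B]
               for A in subsets(ground) if A for B in subsets(A))

def marginal(mu, q, i, A):
    """w_{q_{-i}}(A) as in expression (4)."""
    return sum(prod(q[j][A] for j in B) * mu[B | {i}]
               for B in subsets(A - {i}))

def round_up(mu, q, ground):
    """Move every non-extreme q_i to a vertex of its simplex, one i at a time."""
    q = {i: dict(qi) for i, qi in q.items()}       # leave the input untouched
    for i in sorted(ground):
        if max(q[i].values()) == 1:
            continue                                # q_i is already extreme
        best = max(q[i], key=lambda A: marginal(mu, q, i, A))
        q[i] = {A: int(A == best) for A in q[i]}
    return q

S = lambda *xs: frozenset(xs)
N = S(1, 2, 3)
w = {S(): Fraction(0), S(1): Fraction(1, 5), S(2): Fraction(1, 5),
     S(3): Fraction(1, 5), S(1, 2): Fraction(4, 5), S(1, 3): Fraction(3, 10),
     S(2, 3): Fraction(3, 5), S(1, 2, 3): Fraction(7, 10)}
mu = mobius_inversion(w, N)
q0 = {i: {A: Fraction(1, 4) for A in subsets(N) if i in A} for i in N}
q_bar = round_up(mu, q0, N)
```

Since each \(q_i\) is visited once and stays extreme afterwards, the loop performs at most n updates, in line with the bound stated in the proof.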

Remark 1

For \(i\in N,A\in 2^N_i\), if all \(j\in A\backslash i\ne \emptyset \) satisfy \(q_j^A=1\), then (4) yields \(w_\mathbf{q _{-i}}(A)=w(A)-w(A\backslash i)\), while \(w_\mathbf{q _{-i}}(\{i\})=w(\{i\})\) regardless of \(\mathbf q _{-i}\).

Corollary 1

Some partition P satisfies \(W(\mathbf p )\ge W(\mathbf q )\) for all \(\mathbf q \in \varDelta _N\), with \(W(\mathbf p )=\sum _{A\in P}w(A)\).

Proof

Follows from Propositions 1 and 2, with the above notation associating \(\mathbf p \in \varDelta _N\) to partition P.    \(\square \)

Defining global maximizers is clearly immediate.

Definition 4

Fuzzy partition \(\hat{\mathbf{q }}\in \varDelta _N\) is a global maximizer if \(W(\hat{\mathbf{q }})\ge W(\mathbf q )\) for all \(\mathbf q \in \varDelta _N\).

As for local maximizers, consider a vector \(\omega =(\omega _1,\ldots ,\omega _n)\in \mathbb R^n_{++}\) of strictly positive weights, with \(\omega _N=\sum _{j\in N}\omega _j\), and focus on the (Nash) equilibrium [15] of the game with elements \(i\in N\) as players, each strategically choosing its membership distribution \(q_i\in \varDelta _i\) while being rewarded with fraction \(\frac{\omega _i}{\omega _N}W(q_1,\ldots ,q_n)\) of the global worth attained at any strategy profile \((q_1,\ldots ,q_n)=\mathbf q \in \varDelta _N\).

Definition 5

Fuzzy partition \(\hat{\mathbf{q }}\in \varDelta _N\) is a local maximizer if for all \(q_i\in \varDelta _i\) and all \(i\in N\) inequality \(W_i(\hat{q}_i|\hat{\mathbf{q }}_{-i})\ge W_i(q_i|\hat{\mathbf{q }}_{-i})\) holds (see expression (3)).

This definition of local maximizer entails that the neighborhood \(\mathcal N(\mathbf q )\subset \varDelta _N\) of any \(\mathbf q \in \varDelta _N\) is

$$\begin{aligned} \mathcal {N}(\mathbf q )=\underset{i\in N}{\bigcup }\Big \{\tilde{\mathbf{q }}:\tilde{\mathbf{q }}=\tilde{q}_i|\mathbf q _{-i},\tilde{q}_i\in {\varDelta }_i\Big \} . \end{aligned}$$

Definition 6

The (iA)-derivative of W at \(\mathbf q \in \varDelta _N\) is

$$\begin{aligned} \begin{array}{c} \partial W(\mathbf q )/\partial q^A_i=W(\overline{\mathbf{q }}(i,A))-W(\underline{\mathbf{q }}(i,A))\\ =W_i\Big (\overline{q}_i(i,A)|\overline{\mathbf{q }}_{-i}(i,A)\Big )-W_i\Big (\underline{q}_i(i,A)|\underline{\mathbf{q }}_{-i}(i,A)\Big )\text{, } \end{array} \end{aligned}$$

with \(\overline{\mathbf{q }}(i,A)=\Big (\overline{q}_1(i,A),\ldots ,\overline{q}_n(i,A)\Big )\) given by

$$\begin{aligned} \overline{q}_j^B(i,A)=\left\{ \begin{array}{c} q_j^B\text { for all }j\in N\backslash i,B\in 2^N_j \\ 1\text { for }j=i,B=A\\ 0\text { for }j=i,B\ne A \end{array}\right. \text{, } \end{aligned}$$

and \(\underline{\mathbf{q }}(i,A)=\Big (\underline{q}_1(i,A),\ldots ,\underline{q}_n(i,A)\Big )\) given by

$$\begin{aligned} \underline{q}_j^B(i,A)=\left\{ \begin{array}{c} q_j^B\text { for all }j\in N\backslash i,B\in 2^N_j \\ 0\text { for }j=i\text { and all }B\in 2^N_i \end{array}\right. \text{, } \end{aligned}$$

thus \(\nabla W(\mathbf q )=\{\partial W(\mathbf q )/\partial q^A_i:i\in N,A\in 2^N_i\}\in \mathbb R^{n2^{n-1}}\) is the (full) gradient of W at \(\mathbf q \). The i-gradient \(\nabla _iW(\mathbf q )\in \mathbb R^{2^{n-1}}\) of W at \(\mathbf q =q_i|\mathbf q _{-i}\) is set function \(\nabla _iW(\mathbf q ):2^N_i\rightarrow \mathbb R\) defined by \(\nabla _iW(\mathbf q )(A)=w_\mathbf{q _{-i}}(A)\) for all \(A\in 2^N_i\), where \(w_\mathbf{q _{-i}}\) is given by expression (4).
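The identity between the (i, A)-derivative of Definition 6 and the value \(w_\mathbf{q _{-i}}(A)\) of expression (4) can be checked numerically. The sketch below does so with exact fractions, using the weights of the example in this section and a weight-proportional fuzzy cover as test point; helper names and the data layout `q[i][A]` are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def subsets(ground):
    """All subsets of `ground`, as frozensets, by increasing size."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def mobius_inversion(w, ground):
    return {A: sum((-1) ** len(A - B) * w[B] for B in subsets(A))
            for A in subsets(ground)}

def worth(mu, q, ground):
    """W(q) as in expression (2), assuming w(emptyset) = 0."""
    return sum(prod(q[i][A] for i in B) * mu[B]
               for A in subsets(ground) if A for B in subsets(A))

def marginal(mu, q, i, A):
    """w_{q_{-i}}(A) as in expression (4)."""
    return sum(prod(q[j][A] for j in B) * mu[B | {i}]
               for B in subsets(A - {i}))

def derivative(mu, q, ground, i, A):
    """W(q_bar(i, A)) - W(q_underbar(i, A)) as in Definition 6."""
    upper = dict(q)
    upper[i] = {B: int(B == A) for B in q[i]}   # all of i's mass on A
    lower = dict(q)
    lower[i] = {B: 0 for B in q[i]}             # null distribution (Remark 2)
    return worth(mu, upper, ground) - worth(mu, lower, ground)

S = lambda *xs: frozenset(xs)
N = S(1, 2, 3)
w = {S(): Fraction(0), S(1): Fraction(1, 5), S(2): Fraction(1, 5),
     S(3): Fraction(1, 5), S(1, 2): Fraction(4, 5), S(1, 3): Fraction(3, 10),
     S(2, 3): Fraction(3, 5), S(1, 2, 3): Fraction(7, 10)}
mu = mobius_inversion(w, N)

# Test point: q_i^A proportional to w(A) over the subsets containing i.
q = {}
for i in N:
    containing = [A for A in subsets(N) if i in A]
    total = sum(w[A] for A in containing)
    q[i] = {A: w[A] / total for A in containing}

gradient = {(i, A): derivative(mu, q, N, i, A) for i in N for A in q[i]}
```

The dictionary `gradient` has \(n2^{n-1}=12\) entries here, one per pair (i, A) with \(A\in 2^N_i\).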

Remark 2

Membership distribution \(\underline{q}_i(i,A)\) is the null one: its \(2^{n-1}\) entries are all 0, hence \(\underline{q}_i(i,A)\not \in \varDelta _i\).

It is now possible to search for a local maximizer partition \(\mathbf p ^*\) starting from a given fuzzy partition \(\mathbf q \) as initial candidate solution, while maintaining the whole search within the continuum of fuzzy partitions. This idea may be specified in alternative ways yielding different local search methods. One possibility is the following.

LocalSearch \((w,\mathbf q )\)

Initialize: Set \(t=0\) and \(\mathbf q (0)=\mathbf q \), with requirement \(|\{i:q_i^A>0\}|\in \{0,|A|\}\) for all \(A\in 2^N\) (i.e. \(\mathbf q \) is exact).

Loop 1: While \(0<\sum _{i\in A}q^A_i(t)<|A|\) for some \(A\in 2^N\), set \(t=t+1\) and

  1. (a)

    select an \(A^*(t)\in 2^N\) satisfying

    $$\begin{aligned} \frac{1}{|A^*(t)|}\sum _{i\in A^*(t)}w_\mathbf{q _{-i}(t-1)}(A^*(t))\ge \frac{1}{|B|}\sum _{j\in B}w_\mathbf{q _{-j}(t-1)}(B) \end{aligned}$$

    for all \(B\in 2^N\) such that \(0<\sum _{j\in B}q^B_j(t-1)<|B|\),

  2. (b)

    for \(i\in A^*(t)\) and \(A\in 2^N_i\), define \(q_i^A(t)= \left\{ \begin{array}{c}1\text { if }A=A^*(t)\text{, }\\ 0\text { if }A\ne A^*(t)\text{, }\end{array}\right. \)

  3. (c)

    for \(j\in N\backslash A^*(t)\) and \(A\in 2^N_j\) with \(A\cap A^*(t)=\emptyset \), define

    $$\begin{aligned} q^A_j(t)=q_j^A(t-1) +\left( w(A)\sum _{\underset{B\cap A^*(t)\ne \emptyset }{B\in 2^N_j}}q_j^B(t-1)\right) \left( \sum _{\underset{B'\cap A^*(t)=\emptyset }{B'\in 2^N_j}}w(B')\right) ^{-1} \end{aligned}$$
  4. (d)

    for \(j\in N\backslash A^*(t)\) and \(A\in 2^N_j\) with \(A\cap A^*(t)\ne \emptyset \), define \(q^A_j(t)=0\).

Loop 2: While \(q_i^A(t)=1,|A|>1\) for some \(i\in N\) and \(w(A)<w(\{i\})+w(A\backslash i)\), set \(t=t+1\) and define:

$$\begin{aligned} q^{\hat{A}}_i(t)= & {} \left\{ \begin{array}{c}1\text { if }|\hat{A}|=1\\ 0\text { otherwise}\end{array}\right. \text { for all }\hat{A}\in 2^N_i\text{, }\\ q^{B}_j(t)= & {} \left\{ \begin{array}{c}1\text { if }B=A\backslash i\\ 0\text { otherwise}\end{array}\right. \text { for all }j\in A\backslash i,B\in 2^N_j\text{, }\\ q^{\hat{B}}_{j'}(t)= & {} q^{\hat{B}}_{j'}(t-1)\text { for all }j'\in A^c,\hat{B}\in 2^N_{j'}\text{. } \end{aligned}$$

Output: Set \(\mathbf q ^*=\mathbf q (t)\).

Both RoundUp and LocalSearch yield a sequence \(\mathbf q (0),\ldots ,\mathbf q (t^*)=\mathbf q ^*\) where \(q_i^*\in ex(\varDelta _i)\) for all \(i\in N\). In the former at the end of each iteration t the novel \(\mathbf q (t)\in \mathcal N(\mathbf q (t-1))\) is in the neighborhood of its predecessor. In the latter \(\mathbf q (t)\not \in \mathcal N(\mathbf q (t-1))\) in general, as in \(|P|\le n\) iterations of Loop 1 a partition \(\{A^*(1),\ldots ,A^*(|P|)\}=P\) is generated. For \(t=1,\ldots ,|P|\), selected subsets \(A^*(t)\in 2^N\) are any of those where the average over members \(i\in A^*(t)\) of \((i,A^*(t))\)-derivatives \(\partial W(\mathbf q (t-1))/\partial q^{A^*(t)}_i(t-1)\) is maximal. Once a block \(A^*(t)\) is selected, then lines (c) and (d) make all elements \(j\in N\backslash A^*(t)\) redistribute the entire membership mass currently placed on subsets \(A'\in 2^N_j\) with non-empty intersection \(A'\cap A^*(t)\ne \emptyset \) over those remaining \(A\in 2^N_j\) such that, conversely, \(A\cap A^*(t)=\emptyset \). The redistribution is such that each of these latter gets a fraction \(w(A)/\sum _{B\in 2^N_j:B\cap A^*(t)=\emptyset }w(B)\) of the newly freed membership mass \(\sum _{A'\in 2^N_j:A'\cap A^*(t)\ne \emptyset }q_j^{A'}(t-1)\). The subsequent Loop 2 checks whether the partition generated by Loop 1 may be improved by extracting some elements from existing blocks and putting them in singleton blocks of the final output. In the limit, set function w may be such that for some element \(i\in N\) global worth decreases when the element joins any subset \(A\in 2^N_i,|A|>1\), that is to say \(w(A)-w(A\backslash i)-w(\{i\})=\sum _{B\in 2^A\backslash 2^{A\backslash i}:|B|>1}\mu ^w(B)<0\).
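The two loops can be sketched as follows. This is an illustrative reading of the procedure rather than a reference implementation: weights are assumed rational (int or Fraction), ties in line (a) are broken by taking the first maximizer, Loop 2 tracks blocks directly instead of rewriting memberships, and the fallback used when all remaining weights of an element are zero is an assumption not covered by the text.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def subsets(ground):
    """All subsets of `ground`, as frozensets, by increasing size."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def mobius_inversion(w, ground):
    return {A: sum((-1) ** len(A - B) * w[B] for B in subsets(A))
            for A in subsets(ground)}

def marginal(mu, q, i, A):
    """w_{q_{-i}}(A) as in expression (4)."""
    return sum(prod(q[j][A] for j in B) * mu[B | {i}]
               for B in subsets(A - {i}))

def local_search(w, q, ground):
    """Sketch of LocalSearch; returns the output partition as a set of blocks."""
    ground = frozenset(ground)
    mu = mobius_inversion(w, ground)
    q = {i: dict(qi) for i, qi in q.items()}        # leave the input untouched
    nonempty = [A for A in subsets(ground) if A]
    while True:                                      # Loop 1
        open_sets = [A for A in nonempty
                     if 0 < sum(q[i][A] for i in A) < len(A)]
        if not open_sets:
            break
        # (a) maximal average of w_{q_{-i}}(A) over the members i of A
        star = max(open_sets, key=lambda A:
                   sum(marginal(mu, q, i, A) for i in A) / len(A))
        # (c)-(d) outsiders move their mass away from sets meeting the block
        for j in ground - star:
            freed = sum(v for A, v in q[j].items() if A & star)
            denom = sum(w[A] for A in q[j] if not A & star)
            new = {}
            for A, v in q[j].items():
                if A & star:
                    new[A] = 0
                elif denom:
                    new[A] = v + Fraction(w[A]) * freed / denom
                else:  # assumption: with all-zero weights, freed mass goes to {j}
                    new[A] = v + (freed if A == frozenset({j}) else 0)
            q[j] = new
        # (b) members of the selected block concentrate on it
        for i in star:
            q[i] = {A: int(A == star) for A in q[i]}
    # Loop 2: extract i from its block A when w(A) < w({i}) + w(A minus i)
    block = {i: next(A for A, v in q[i].items() if v == 1) for i in ground}
    improved = True
    while improved:
        improved = False
        for i in sorted(ground):
            A = block[i]
            if len(A) > 1 and w[A] < w[frozenset({i})] + w[A - {i}]:
                block[i] = frozenset({i})
                for j in A - {i}:
                    block[j] = A - {i}
                improved = True
    return set(block.values())

S = lambda *xs: frozenset(xs)
N = S(1, 2, 3)
w = {S(): Fraction(0), S(1): Fraction(1, 5), S(2): Fraction(1, 5),
     S(3): Fraction(1, 5), S(1, 2): Fraction(4, 5), S(1, 3): Fraction(3, 10),
     S(2, 3): Fraction(3, 5), S(1, 2, 3): Fraction(7, 10)}
uniform = {i: {A: Fraction(1, 4) for A in subsets(N) if i in A} for i in N}
partition = local_search(w, uniform, N)
```

With the weights of the first example of this section and the uniform input, the sketch selects \(\{1,2\}\) first (average 3/10) and then leaves element 3 in its singleton.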

Proposition 3

LocalSearch \((w,\mathbf q )\) outputs a local maximizer \(\mathbf q ^*\).

Proof

It is plain that the output is a partition P or, with the notation of Corollary 1 above, \(\mathbf q ^*=\mathbf p \). Accordingly, any element \(i\in N\) is either in a singleton block \(\{i\}\in P\) or else in a block \(A\in P,i\in A\) such that \(|A|>1\). In the former case, any membership reallocation deviating from \(p_i^{\{i\}}=1\), given memberships \(p_j,j\in N\backslash i\), yields a cover (fuzzy or not) where global worth is the same as at \(\mathbf p \), because \(\prod _{j\in B\backslash i}p_j^B=0\) for all \(B\in 2_i^N\backslash \{i\}\) (see Example 1 above). In the latter case, any membership reallocation \(q_i\) deviating from \(p_i^A=1\) (given memberships \(p_j,j\in N\backslash i\)) yields a cover which is best seen by distinguishing between \(2_i^N\backslash A\) and A. Also recall that \(w(A)-w(A\backslash i)=\sum _{B\in 2^A\backslash 2^{A\backslash i}}\mu ^w(B)\). Again, all membership mass \(\sum _{B\in 2_i^N\backslash A}q_i^B>0\) simply collapses on singleton \(\{i\}\) because \(\prod _{j\in B\backslash i}p_j^B=0\) for all \(B\in 2_i^N\backslash A\). Hence \(W(\mathbf p )-W(q_i|\mathbf p _{-i})=w(A)-w(\{i\})+\)

$$\begin{aligned} -\left( q_i^A\sum _{B\in 2^A\backslash 2^{A\backslash i}:|B|>1}\mu ^w(B)+\sum _{B'\in 2^{A\backslash i}}\mu ^w(B')\right) \end{aligned}$$
$$\begin{aligned} =\left( p_i^A-q_i^A\right) \sum _{B\in 2^A\backslash 2^{A\backslash i}:|B|>1}\mu ^w(B)\text{. } \end{aligned}$$

Now assume that \(\mathbf p \) is not a local maximizer, i.e. \(W(\mathbf p )-W(q_i|\mathbf p _{-i})<0\). Since \(p_i^A-q_i^A>0\) (because \(p_i^A=1\) and \(q_i\in \varDelta _i\) is a deviation from \(p_i\)), then

$$\begin{aligned} \sum _{B\in 2^A\backslash 2^{A\backslash i}:|B|>1}\mu ^w(B)=w(A)-w(A\backslash i)-w(\{i\})<0\text{. } \end{aligned}$$

Hence \(\mathbf p \) cannot be the output of Loop 2.    \(\square \)

In local search methods, the chosen initial candidate solution determines which neighborhoods are visited. The range of the objective function in a neighborhood is a set of real values. In a neighborhood \(\mathcal N(\mathbf p )\) of a \(\mathbf p \in \varDelta _N\) or partition P only those \(\sum _{A\in P:|A|>1}|A|\) elements \(i\in A\) in non-singleton blocks \(A\in P,|A|>1\) can modify global worth by reallocating their membership. In view of (the proof of) Proposition 3, the only admissible variations are obtained by deviating from \(p_i^A=1\) with an alternative membership \(q_i^A\in [0,1)\), with \(W(q_i|\mathbf p _{-i})-W(\mathbf p )\) equal to \((q_i^A-1)\sum _{B\in 2^A\backslash 2^{A\backslash i}:|B|>1}\mu ^w(B)\). Hence, choosing partitions as initial candidate solutions of LocalSearch is evidently a poor option. A sensible choice should conversely allow the search to explore different neighborhoods where the objective function may range widely. The simplest example of such an initial candidate solution is \(q_i^A=2^{1-n}\) for all \(A\in 2^N_i\) and all \(i\in N\), i.e. the uniform distribution. On the other hand, the input of local search algorithms is commonly desired to be close to a global optimum, i.e. a maximizer in the present setting. This translates here into the idea of defining the input by means of set function w. In this view, consider \(q_i^A=w(A)/\sum _{B\in 2^N_i}w(B)\), yielding \(\frac{q_i^A}{q_i^B}=\frac{w(A)}{w(B)}=\frac{q^A_j}{q^B_j}\) for all \(A,B\in 2^N_i\cap 2^N_j\) and \(i,j\in N\). Finally note that with a suitable initial candidate solution, the search may be restricted to explore at most a given number of fuzzy partitions, thereby containing the computational burden. In fact, if \(\mathbf q (0)\) is the finest partition \(\{\{1\},\ldots ,\{n\}\}\) or \(q_i^{\{i\}}(0)=1\) for all \(i\in N\), then the search does not explore any neighborhood at all, and such an input coincides with the output.
More reasonably, let \(\mathbb A_\mathbf{q }^{\max }=\{A_1,\ldots ,A_k\}\) denote the collection of \(\supseteq \)-maximal subsets where input memberships are strictly positive. That is, \(q_i^{A_{k'}}>0\) for all \(i\in A_{k'},1\le k'\le k\) as well as \(q_i^B=0\) for all \(B\in 2^N\backslash \left( 2^{A_1}\cup \cdots \cup 2^{A_k}\right) \) and all \(i\in B\). Then, the output shall be a partition P each of whose blocks \(A\in P\) satisfies \(A\subseteq A_{k'}\) for some \(1\le k'\le k\). Hence, by suitably choosing the input, LocalSearch outputs a partition with no less than some minimum desired number of blocks.
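The two initial candidate solutions discussed above can be generated as in the following sketch. Function names and the layout `q[i][A]` are illustrative assumptions, and the weight-proportional input additionally assumes that \(\sum _{B\in 2^N_i}w(B)>0\) for every i. The weights are those of the first example of this section.

```python
from fractions import Fraction
from itertools import combinations

def subsets(ground):
    """All subsets of `ground`, as frozensets, by increasing size."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def uniform_input(ground):
    """q_i^A = 2^(1-n) for every i and every A containing i."""
    n = len(ground)
    return {i: {A: Fraction(1, 2 ** (n - 1))
                for A in subsets(ground) if i in A}
            for i in ground}

def weight_proportional_input(w, ground):
    """q_i^A = w(A) / sum of w(B) over the subsets B containing i.

    Assumes that this sum is strictly positive for every element i."""
    q = {}
    for i in ground:
        containing = [A for A in subsets(ground) if i in A]
        total = sum(w[A] for A in containing)
        q[i] = {A: Fraction(w[A]) / total for A in containing}
    return q

S = lambda *xs: frozenset(xs)
N = S(1, 2, 3)
w = {S(): Fraction(0), S(1): Fraction(1, 5), S(2): Fraction(1, 5),
     S(3): Fraction(1, 5), S(1, 2): Fraction(4, 5), S(1, 3): Fraction(3, 10),
     S(2, 3): Fraction(3, 5), S(1, 2, 3): Fraction(7, 10)}

q_uniform = uniform_input(N)
q_weighted = weight_proportional_input(w, N)

# Ratio property: q_i^A / q_i^B = w(A) / w(B) = q_j^A / q_j^B.
ratio_1 = q_weighted[1][S(1, 2)] / q_weighted[1][S(1, 2, 3)]
ratio_2 = q_weighted[2][S(1, 2)] / q_weighted[2][S(1, 2, 3)]
```

Under the weight-proportional choice, \(q_i^A>0\) holds for either all or none of the members of A, so the exactness requirement of the initialization step is met whenever w is non-negative.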

Example: for \(N=\{1,2,3\}\), let \(w(\emptyset )=0\), \(w(\{1\})=1=w(\{3\})\), \(w(\{2\})=2\), \(w(\{1,3\})=3\), \(w(\{1,2\})=4=w(\{2,3\})\), \(w(N)=2\), with Möbius inversion \(\mu ^w(\{i\})=w(\{i\}),i\in N\), \(\mu ^w(\{1,3\})=1=\mu ^w(\{1,2\})=\mu ^w(\{2,3\})\) and \(\mu ^w(N)=-5\). With input \(\mathbf q (0)\) given by the uniform distribution \(q_i^A(0)=\frac{1}{4}\) for all \(i\in N\) and all \(A\in 2^N_i\), in the first iteration \(t=1\) the situation concerning \(w_\mathbf{q _{-i}}(A),i\in N,A\in 2^N_i\) is

$$\begin{aligned} \hat{w}_\mathbf{q _{-1}}(\{1\})=1,\text { }w_\mathbf{q _{-1}}(\{1,2\})=\frac{5}{4},\text { }w_\mathbf{q _{-1}}(\{1,3\})=\frac{5}{4}&\text { and}&w_\mathbf{q _{-1}}(N)=\frac{19}{16}\text{, }\\ \hat{w}_\mathbf{q _{-2}}(\{2\})=2,\text { }w_\mathbf{q _{-2}}(\{1,2\})=\frac{9}{4},\text { }w_\mathbf{q _{-2}}(\{2,3\})=\frac{9}{4}&\text {and}&w_\mathbf{q _{-2}}(N)=\frac{35}{16}\text{, }\\ \hat{w}_\mathbf{q _{-3}}(\{3\})=1,\text { }w_\mathbf{q _{-3}}(\{1,3\})=\frac{5}{4},\text { }w_\mathbf{q _{-3}}(\{2,3\})=\frac{5}{4}&\text {and }&w_\mathbf{q _{-3}}(N)=\frac{19}{16}\text{. }\\ \end{aligned}$$

In view of line (a), the selected block at the first iteration is \(A^*(1)=\{2\}\). According to lines (b)–(d), \(q_2^2(1)=1\), \(q_1^1(1)=\frac{3}{8}\) and \(q_1^{13}(1)=\frac{5}{8}\) as well as \(q_3^3(1)=\frac{3}{8}\) and \(q_3^{13}(1)=\frac{5}{8}\). In the second iteration \(t=2\),

$$\begin{aligned} w_\mathbf{q _{-1}}(\{1\})=1&\text { and}&w_\mathbf{q _{-1}}(\{1,3\})=\frac{13}{8}\text{, }\\ w_\mathbf{q _{-3}}(\{3\})=1&\text {and }&w_\mathbf{q _{-3}}(\{1,3\})=\frac{13}{8}\text{, } \end{aligned}$$

and thus the selected block at the second iteration is \(A^*(2)=\{1,3\}\). In fact, \(\mathcal F^*=\{\{1,3\},\{2\}\}\) is an optimal solution, and thus the second loop does not produce any change.
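The figures of this example can be retraced numerically. The sketch below is an illustration only: `powerset`, `mobius` and `derivative` are hypothetical helper names, and memberships are stored as nested dictionaries `q[i][A]`. It computes the Möbius inversion of w recursively and the values \(w_\mathbf{q _{-i}}(A)\) for given memberships.

```python
from fractions import Fraction
from itertools import combinations

def powerset(s):
    """All subsets of s as frozensets."""
    s = tuple(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def mobius(w, ground):
    """mu(A) = w(A) minus the sum of mu(B) over proper subsets B of A."""
    mu = {}
    for A in sorted(powerset(ground), key=len):  # smaller sets first
        mu[A] = Fraction(w[A]) - sum(mu[B] for B in powerset(A) if B != A)
    return mu

def derivative(i, A, q, mu):
    """w_{q_-i}(A): sum over B with i in B and B inside A of prod_{j in B, j != i} q_j^A * mu(B)."""
    total = Fraction(0)
    for B in powerset(A):
        if i in B:
            term = mu[B]
            for j in B - {i}:
                term *= q[j][A]
            total += term
    return total

fs = frozenset
N = (1, 2, 3)
w = {fs(): 0, fs({1}): 1, fs({2}): 2, fs({3}): 1,
     fs({1, 2}): 4, fs({1, 3}): 3, fs({2, 3}): 4, fs(N): 2}
mu = mobius(w, N)

# Uniform input: every element spreads 1/4 over each of its four subsets.
q_uniform = {i: {A: Fraction(1, 4) for A in powerset(N) if i in A} for i in N}

# Memberships reported in the text after the first iteration.
q_after = {1: {fs({1}): Fraction(3, 8), fs({1, 3}): Fraction(5, 8)},
           2: {fs({2}): Fraction(1)},
           3: {fs({3}): Fraction(3, 8), fs({1, 3}): Fraction(5, 8)}}
```

With these definitions, `derivative(1, fs(N), q_uniform, mu)` evaluates to 19/16 and `derivative(1, fs({1, 3}), q_after, mu)` to 13/8, in agreement with the displayed values.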

3 Set Packing: Lower-Dimensional Case

If \(\mathcal F\subset 2^N\), then \(2^N_i\cap \mathcal F\ne \emptyset \) may be assumed for every \(i\in N\), otherwise the problem reduces to packing the proper subset \(N\backslash \{i:2^N_i\cap \mathcal F=\emptyset \}\). As outlined in Sect. 1, without additional notation simply let \(\emptyset \in \mathcal F\ni \{i\}\) for all \(i\in N\), with null weights \(w(\emptyset )=0=w(\{i\})\) if \(\{i\}\) is not an element of the original family of feasible subsets. Thus \((\mathcal F,\supseteq )\) is a poset (partially ordered set) with bottom \(\emptyset \), and weight function \(w:\mathcal F\rightarrow \mathbb R_+\) has well-defined Möbius inversion \(\mu ^w:\mathcal F\rightarrow \mathbb R\) [14]. Now memberships \(q_i,i\in N\) are distributions over \(\mathcal F_i=2^N_i\cap \mathcal F=\{A_1,\ldots ,A_{|\mathcal F_i|}\}\), with lower-dimensional (i.e. \(|\mathcal F_i|\)-dimensional) unit simplices

$$\begin{aligned} \bar{\varDelta }_i=\left\{ \left( q_i^{A_1},\ldots ,q_i^{A_{|\mathcal {F}_{i}|}}\right) :q_i^{A_k}\ge 0\text { for } 1\le k\le |\mathcal {F}_{i}|,\sum _{1\le k\le |\mathcal {F}_{i}|}q_i^{A_k}=1\right\} \end{aligned}$$

and corresponding fuzzy covers \(\mathbf q \in \bar{\varDelta }_{N}=\times _{1\le i\le n}\bar{\varDelta }_{i}\). Hence a fuzzy cover maximally consists of \(|\mathcal {F}|-1\) points in the unit n-dimensional hypercube. Accordingly, \([0,1]^n\) may be replaced with \(\mathcal C(\mathcal F)=co(\{\chi _A:A\in \mathcal {F}\})\subset [0,1]^n\), i.e. the convex hull [16] of characteristic functions corresponding to feasible subsets. Recursively (with \(w(\emptyset )=0\)), Möbius inversion \(\mu ^w:\mathcal F\rightarrow \mathbb R\) is

$$\begin{aligned} \mu ^w(A)=w(A)-\sum _{B\in \mathcal F:B\subset A}\mu ^w(B)\text{, } \end{aligned}$$

while the MLE \(f^w:\mathcal C(\mathcal F)\rightarrow \mathbb R\) of w is

$$\begin{aligned} f^w(q^A)=\sum _{B\in \mathcal F\cap 2^A}\left( \prod _{i\in B}q_{i}^A\right) \mu ^w(B)\text{. } \end{aligned}$$

Therefore, every fuzzy cover \(\mathbf q \in \bar{\varDelta }_N\) has global worth

$$\begin{aligned} W(\mathbf q )=\sum _{A\in \mathcal F}\sum _{B\in \mathcal F\cap 2^A}\left( \prod _{i\in B}q_i^A\right) \mu ^w(B)\text{. } \end{aligned}$$

For all \(i\in N,q_i\in \bar{\varDelta }_i\), and \(\mathbf q _{-i}\in \bar{\varDelta }_{N\backslash i}=\underset{j\in N\backslash i}{\times }\bar{\varDelta }_j\), let

$$\begin{aligned} W_i(q_i|\mathbf q _{-i})= & {} \sum _{A\in \mathcal {F}_{i}}q^A_i\left[ \sum _{B\in \mathcal F_i\cap 2^A}\left( \prod _{j\in B\backslash i}q^A_j\right) \mu ^w(B)\right] \text{, }\\ W_{-i}(\mathbf q _{-i})= & {} \sum _{A\in \mathcal {F}_{i}}\left[ \sum _{B\in \mathcal F\cap 2^{A\backslash i}}\left( \prod _{j\in B}q^A_j\right) \mu ^w(B)\right] \\+ & {} \sum _{A'\in \mathcal F\backslash \mathcal {F}_{i}}\left[ \sum _{B'\in \mathcal F\cap 2^{A'}}\left( \prod _{j'\in B'}q^{A'}_{j'}\right) \mu ^w(B')\right] \text{, } \end{aligned}$$

yielding again

$$\begin{aligned} W(\mathbf q )=W_i(q_i|\mathbf q _{-i})+W_{-i}(\mathbf q _{-i})\text{. } \end{aligned}$$
(6)

From (4) above, \(w_\mathbf{q _{-i}}:\mathcal F_i\rightarrow \mathbb R\) now is

$$\begin{aligned} w_\mathbf{q _{-i}}(A)=\sum _{B\in \mathcal F_i\cap 2^A}\left( \prod _{j\in B\backslash i}q^A_j\right) \mu ^w(B)\text{. } \end{aligned}$$
(7)

For each \(i\in N\), denote by \(ex(\bar{\varDelta }_i)\) the set of \(|\mathcal F_i|\) extreme points of simplex \(\bar{\varDelta }_i\). As in the full-dimensional case, at any fuzzy cover \(\hat{\mathbf{q }}\in \bar{\varDelta }_N\) every \(i\in N\) such that \(\hat{q}_i\not \in ex(\bar{\varDelta }_i)\) may deviate by concentrating its whole membership on some \(A\in \mathcal F_i\) such that \(w_{\hat{\mathbf{q }}_{-i}}(A)\ge w_{\hat{\mathbf{q }}_{-i}}(B)\) for all \(B\in \mathcal F_i\). This yields a non-decreasing variation \(W(q_i|\hat{\mathbf{q }}_{-i})\ge W(\hat{\mathbf{q }})\) in global worth, with \(q_i\in ex(\bar{\varDelta }_i)\). When all n elements do so, one after the other while updating \(w_\mathbf{q _{-i}(t)}\) as in RoundUp above, i.e. \(t=0,1,\ldots \), then the final \(\mathbf q =(q_1,\ldots ,q_n)\) satisfies \(\mathbf q \in \underset{i\in N}{\times }ex(\bar{\varDelta }_i)\). Yet cases \(\mathcal F\subset 2^N\) and \(\mathcal F=2^N\) are different in terms of exactness. Specifically, consider any \(\emptyset \ne A\in \mathcal F\) such that \(|\{i:q_i^A=1\}|\not \in \{0,|A|\}\) or \(A_\mathbf{q }^+=\{i:q_i^A=1\}\subset A\), with \(f^w(q^A)=\sum _{B\in \mathcal F\cap 2^{A_\mathbf{q }^+}}\mu ^w(B)\). Then, \(\mathcal F\cap 2^{A_\mathbf{q }^+}\) may admit no shrinking yielding an exact fuzzy cover with the same global worth as (non-exact) \(\mathbf q \).

Proposition 4

The values taken on exact fuzzy covers do not saturate the whole range of \(W:\bar{\varDelta }_{N}\rightarrow \mathbb R_+\).

Proof

For \(N=\{1,2,3,4\}\) and \(\mathcal F=\{N,\{4\},\{1,2\},\{1,3\},\{2,3\}\}\), let \(w(N)=3\), \(w(\{4\})=2\), \(w(\{i,j\})=1\) for \(1\le i<j\le 3\). Define \(\mathbf q =(q_1,\ldots ,q_4)\) by \(q^{\{4\}}_4=1=q^N_i,i=1,2,3\), with non-exactness \(|\{i:q_i^N>0\}|=3<4=|N|\). As

$$\begin{aligned} W(\mathbf q )=w(\{4\})+\sum _{1\le i<j\le 3}w(\{i,j\})=2+1+1+1 \end{aligned}$$

and \(A^+_\mathbf{q }=\{1,2,3\}\), if distributions \(\hat{q}_1,\hat{q}_2,\hat{q}_3\) place membership only over feasible subsets \(B\in \mathcal F\cap 2^{A_\mathbf{q }^+}\), then global worth is \(W(\hat{q}_1,\hat{q}_2,\hat{q}_3,q_4)<W(\mathbf q )\).    \(\square \)
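The counterexample in the proof can be checked by brute force. The sketch below is illustrative only: the helper names are assumptions, the family is enlarged with \(\emptyset \) and zero-weight singletons as stipulated at the beginning of this section, and missing memberships are read as 0. It evaluates \(W(\mathbf q )\) for the non-exact cover of the proof and compares it with the best worth attainable by a partition into feasible blocks.

```python
from fractions import Fraction

def mobius_on_family(w):
    """Recursive Moebius inversion on the family given by the keys of w, ordered by inclusion."""
    mu = {}
    for A in sorted(w, key=len):  # proper subsets are always processed earlier
        mu[A] = Fraction(w[A]) - sum(mu[B] for B in w if B < A)
    return mu

def global_worth(q, mu):
    """W(q): sum over A in F and B in F with B inside A of prod_{i in B} q_i^A * mu(B)."""
    total = Fraction(0)
    for A in mu:
        for B in mu:
            if B <= A:
                term = mu[B]
                for i in B:
                    term *= q[i].get(A, 0)
                total += term
    return total

def feasible_partitions(elements, family):
    """Yield all partitions of `elements` whose blocks belong to `family`."""
    elements = frozenset(elements)
    if not elements:
        yield []
        return
    first = min(elements)
    for A in family:
        if first in A and A <= elements:
            for rest in feasible_partitions(elements - A, family):
                yield [A] + rest

fs = frozenset
N = fs({1, 2, 3, 4})
w = {fs(): 0, fs({1}): 0, fs({2}): 0, fs({3}): 0, fs({4}): 2,
     fs({1, 2}): 1, fs({1, 3}): 1, fs({2, 3}): 1, N: 3}
mu = mobius_on_family(w)

# Non-exact cover of the proof: 1, 2, 3 sit entirely on N, while 4 sits on {4}.
q = {1: {N: 1}, 2: {N: 1}, 3: {N: 1}, 4: {fs({4}): 1}}
worth_q = global_worth(q, mu)
best_exact = max(sum(w[A] for A in P) for P in feasible_partitions(N, list(w)))
```

Here `worth_q` is 5 whereas `best_exact` is 3, so no partition into feasible blocks reaches the worth of the non-exact cover.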

Therefore, an arbitrary search for optimal fuzzy covers may provide a maximizer (global or local) which is not reducible to any feasible solution of the original set packing problem. Such feasible solutions are partitions P all of whose blocks are feasible, and where singleton blocks with worth 0 are not included in the packing. In fact, similarly to the full-dimensional case, fairly simple conditions may be shown to be sufficient for a partition to be a local maximizer.

Definition 7

Any \(\hat{q}_i|\hat{\mathbf{q }}_{-i}=\hat{\mathbf{q }}\in \bar{\varDelta }_N\) is a local maximizer of \(W:\bar{\varDelta }_N\rightarrow \mathbb R_+\) if \(W_i(\hat{q}_i|\hat{\mathbf{q }}_{-i})\ge W_i(q_i|\hat{\mathbf{q }}_{-i})\) for all \(i\in N\) and all \(q_i\in \bar{\varDelta }_i\) (see expression (6) above).

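The neighborhood \(\mathcal N(\mathbf q )\subset \bar{\varDelta }_N\) of \(\mathbf q \in \bar{\varDelta }_N\) thus is

$$\begin{aligned} \mathcal N(\mathbf q )=\underset{i\in N}{\bigcup }\left\{ \tilde{\mathbf{q }}:\tilde{\mathbf{q }}=\tilde{q}_i|\mathbf q _{-i},\tilde{q}_i\in \bar{\varDelta }_i\right\} \text{. } \end{aligned}$$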

Any partition P with \(A\in \mathcal F\) for each block \(A\in P\) clearly has associated p such that \(\mathbf p \in \underset{i\in N}{\times }ex(\bar{\varDelta }_i)\subset \bar{\varDelta }_N\).

Proposition 5

Any partition P with associated p such that \(\mathbf p \in \bar{\varDelta }_N\) is a local maximizer if for all \(A\in P\) and all \(i\in A\)

$$\begin{aligned} w(A)\ge w(\{i\})+\sum _{\hat{B}\in \mathcal F\cap 2^{A\backslash i}}\mu ^w(\hat{B})\text{. } \end{aligned}$$

Proof

Firstly note that for all blocks \(A\in P\), if any, such that \(|A|=1\) there is nothing to prove, as the summation reduces to \(w(\emptyset )=0\), and thus there only remains \(w(\{i\})\ge w(\{i\})\). Accordingly, let \(A\in P\) and \(|A|>1\). For every \(i\in A\), any membership reallocation \(q_i\in \bar{\varDelta }_i\) deviating from \(p_i\) (i.e. \(p_i^A=1\)), given memberships \(\mathbf p _{-i}\) of other elements \(j\in N\backslash i\) (i.e. \(\bar{\varDelta }_j\ni p_j^{A'}=1\) for all \(A'\in P\) and all \(j\in A'\)), yields \(\mathbf q =(q_i|\mathbf p _{-i})\in \bar{\varDelta }_N\) which is best analyzed by distinguishing between \(\mathcal F_i\backslash A\) and A. In particular,

$$\begin{aligned} w(A)=w(\{i\})+\sum _{\underset{|B|>1}{B\in \mathcal F_i\cap 2^A}}\mu ^w(B)+\sum _{\hat{B}\in \mathcal F\cap 2^{A\backslash i}}\mu ^w(\hat{B})\text{. } \end{aligned}$$

Again, all membership mass \(\sum _{B\in \mathcal F_i\backslash A}q_i^B>0\) collapses on singleton \(\{i\}\), because \(\mathop {{\prod }}\limits _{i'\in B\backslash i}p_{i'}^B=0\) for all \(B\in \mathcal F_i\backslash A\) by the definition of \(\mathbf p _{-i}\). Thus,

$$\begin{aligned} W(\mathbf p )-W(q_i|\mathbf p _{-i})&=w(A)-w(\{i\}) -\left( q_i^A\sum _{\underset{|B|>1}{B\in \mathcal F_i\cap 2^A}}\mu ^w(B)+\sum _{\hat{B}\in \mathcal F\cap 2^{A\backslash i}}\mu ^w(\hat{B})\right) \\&=\left( p_i^A-q_i^A\right) \sum _{\underset{|B|>1}{B\in \mathcal F_i\cap 2^A}}\mu ^w(B)\text{. } \end{aligned}$$

Now assume that \(\mathbf p \) is not a local maximizer, i.e. \(W(\mathbf p )-W(q_i|\mathbf p _{-i})<0\). Since \(p_i^A-q_i^A>0\) because \(p_i^A=1\) and \(q_i\in \bar{\varDelta }_i\) is a deviation from \(p_i\), then

$$\begin{aligned} \sum _{\underset{|B|>1}{B\in \mathcal F_i\cap 2^A}}\mu ^w(B)=w(A)-w(\{i\})-\sum _{\hat{B}\in \mathcal F\cap 2^{A\backslash i}}\mu ^w(\hat{B})<0\text{, } \end{aligned}$$

and this contradicts precisely the premise \(w(A)\ge w(\{i\})+\sum _{\hat{B}\in \mathcal F\cap 2^{A\backslash i}}\mu ^w(\hat{B})\) for all \(A\in P\) and \(i\in A\), thereby completing the proof.    \(\square \)
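The sufficient condition of Proposition 5 is easy to test mechanically. The following sketch is an illustration under assumptions: the function names are hypothetical, and the family and weights are those used in the proof of Proposition 4, enlarged with \(\emptyset \) and zero-weight singletons. It checks the inequality for every block and every element of a given partition.

```python
from fractions import Fraction

def mobius_on_family(w):
    """Recursive Moebius inversion on the family given by the keys of w, ordered by inclusion."""
    mu = {}
    for A in sorted(w, key=len):
        mu[A] = Fraction(w[A]) - sum(mu[B] for B in w if B < A)
    return mu

def satisfies_condition(partition, w, mu):
    """True if w(A) >= w({i}) + sum of mu(B) over B in F inside A minus i, for all A and i."""
    for A in partition:
        for i in A:
            rest = A - {i}
            bound = w[frozenset({i})] + sum(mu[B] for B in mu if B <= rest)
            if w[A] < bound:
                return False
    return True

fs = frozenset
N = fs({1, 2, 3, 4})
w = {fs(): 0, fs({1}): 0, fs({2}): 0, fs({3}): 0, fs({4}): 2,
     fs({1, 2}): 1, fs({1, 3}): 1, fs({2, 3}): 1, N: 3}
mu = mobius_on_family(w)

coarse = [N]                              # the single block N
fine = [fs({1, 2}), fs({3}), fs({4})]     # a partition into smaller feasible blocks
check_coarse = satisfies_condition(coarse, w, mu)
check_fine = satisfies_condition(fine, w, mu)
```

For these weights the single-block partition fails the test (element 4 would gain by leaving N), while the finer partition passes it. Since the condition is only sufficient, a failed test by itself does not settle whether a partition is a local maximizer.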

4 Local Search

As outlined in Sect. 1, when \(\mathcal F\subset 2^N\) every feasible subset \(A\in \mathcal F\) is to be considered not only in terms of its weight w(A), but also in terms of the number of feasible subsets \(B\ne A\) such that \(B\cap A\ne \emptyset \), as these latter are automatically excluded from the packing if A is included. Formally, for any \(\mathcal F'\subseteq \mathcal F\) a cost function \(c:\mathcal F'\rightarrow \mathbb N\) counts the number of still available feasible subsets that each \(A\in \mathcal F'\) intersects, itself included. That is to say, the cost of including A in the packing is \(c(A)=|\{B:B\in \mathcal F',B\cap A\ne \emptyset \}|\in \{1,\ldots ,|\mathcal F'|\}\). Also, like for LocalSearch in Sect. 2, those feasible subsets \(A^*(t)\) to be included are selected iteratively, i.e. one after the other for \(t=0,1,\ldots \), entailing that \(\mathcal F^{t-1}=\{B:B\in \mathcal F,B\cap A^*(t')=\emptyset ,0\le t'<t\}\) is the family of feasible subsets still available for each iteration t. Accordingly, the underlying poset function \(\hat{w}^t:\mathcal F^{t-1}\rightarrow \mathbb R_+\) used at t takes into account both weights and costs \(c:\mathcal F^{t-1}\rightarrow \mathbb N\) by means of ratio \(\hat{w}^t(A)=\frac{w(A)}{c(A)}\). Of course, \(\emptyset \in \mathcal F^t\) for all t, thus Möbius inversion \(\mu ^{\hat{w}^t}:\mathcal F^{t-1}\rightarrow \mathbb R\) is well-defined. In particular, for any \(\mathbf q \in \bar{\varDelta }_N\), the analog of expression (7) above now is \(\hat{w}^t_\mathbf{q _{-i}}:\mathcal F^{t-1}_i\rightarrow \mathbb R\) given by

$$\begin{aligned} \hat{w}^t_\mathbf{q _{-i}}(A)=\sum _{B\in \mathcal F_i^{t-1}\cap 2^A}\left( \prod _{j\in B\backslash i}q^A_j\right) \mu ^{\hat{w}^t}(B)\text{. } \end{aligned}$$
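The cost function and the resulting ratio can be sketched as follows. This is an illustration only: `cost`, `still_available` and `ratio_weights` are hypothetical names, the empty set is left out of the family passed to them (its cost would be zero), and the five-set family with its weights is the one of the worked example given later in this section.

```python
from fractions import Fraction

def cost(A, family):
    """c(A): number of members of `family` intersecting A, A itself included."""
    return sum(1 for B in family if A & B)

def still_available(family, selected):
    """Members of `family` disjoint from every block selected so far."""
    return [B for B in family if not any(B & S for S in selected)]

def ratio_weights(w, family):
    """w(A)/c(A) for each nonempty member A of `family`."""
    return {A: Fraction(w[A]) / cost(A, family) for A in family}

fs = frozenset
family = [fs({1}), fs({3}), fs({1, 2}), fs({2, 3}), fs({1, 2, 3})]
w = {fs({1}): Fraction(1), fs({3}): Fraction(2), fs({1, 2}): Fraction(2),
     fs({2, 3}): Fraction(3), fs({1, 2, 3}): Fraction(7, 2)}

ratios_start = ratio_weights(w, family)
available = still_available(family, [fs({3})])   # after selecting {3}
ratios_next = ratio_weights(w, available)
```

Recomputing costs on the shrunken family raises the ratio of the surviving subsets, since each of them now excludes fewer alternatives.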

Given this additional notation, LocalSearch from Sect. 2 can be modified as follows.

LS-WithCost \((\hat{w},\mathbf q )\)

Initialize: Set \(t=0\) and \(\mathbf q (0)=\mathbf q \), with requirement \(|\{i:q_i^A>0\}|\in \{0,|A|\}\) for all \(A\in \mathcal F\). Also set \(\mathcal F^0=\mathcal F\).

Loop 1: While \(0<\sum _{i\in A}q^A_i(t)<|A|\) for an \(A\in \mathcal F^t\), set \(t=t+1\) and:

  (a) select an \(A^*(t)\in \mathcal F^{t-1}\) satisfying

    $$\begin{aligned} \min _{i\in A^*(t)}\text { }\hat{w}^t_\mathbf{q _{-i}(t-1)}(A^*(t))\ge \min _{j\in B}\text { }\hat{w}^t_\mathbf{q _{-j}(t-1)}(B) \end{aligned}$$

    for all \(B\in \mathcal F^{t-1}\) such that \(0<\sum _{j\in B}q^B_j(t)<|B|\),

  (b) for \(i\in A^*(t)\) and \(A\in \mathcal F_i^{t-1}\), define \(q_i^A(t)=\left\{ \begin{array}{c}1\text { if }A=A^*(t)\text{, }\\ 0\text { if }A\ne A^*(t)\text{, }\end{array}\right. \)

  (c) for \(j\in N\backslash A^*(t)\) and \(A\in \mathcal F_j^{t-1}\) with \(A\cap A^*(t)=\emptyset \), define

    $$\begin{aligned} q^A_j(t)=q_j^A(t-1)+\left( \hat{w}^t(A)\sum _{\underset{B\cap A^*(t)\ne \emptyset }{B\in \mathcal F_j^{t-1}}}q_j^B(t-1)\right) \left( \sum _{\underset{B'\cap A^*(t)=\emptyset }{B'\in \mathcal F_j^{t-1}}}\hat{w}^t(B')\right) ^{-1}\text{, } \end{aligned}$$

  (d) for \(j\in N\backslash A^*(t)\) and \(A\in \mathcal F_j^{t-1}\) with \(A\cap A^*(t)\ne \emptyset \), define \(q^A_j(t)=0\),

  (e) define \(\mathcal F^t=\{B:B\in \mathcal F^{t-1},B\cap A^*(t)=\emptyset \}\).

Loop 2: While \(q_i^A(t)=1,|A|>1\) for an \(i\in N\) and

$$\begin{aligned} w(A)<w(\{i\})+\sum _{\hat{B}\in \mathcal F\cap 2^{A\backslash i}}\mu ^w(\hat{B})\text{, } \end{aligned}$$

set \(t=t+1\) and define:

$$\begin{aligned} q^{\hat{A}}_i(t)= & {} \left\{ \begin{array}{c}1\text { if }|\hat{A}|=1\\ 0\text { otherwise}\end{array}\right. \text { for all }\hat{A}\in \mathcal F_i\text{, }\\ q^{B}_j(t)= & {} \left\{ \begin{array}{c}1\text { if }B=A\backslash i\\ 0\text { otherwise}\end{array}\right. \text { for all }j\in A\backslash i,B\in \mathcal F_j\text{, }\\ q^{\hat{B}}_{j'}(t)= & {} q^{\hat{B}}_{j'}(t-1)\text { for all }j'\in A^c,\hat{B}\in \mathcal F_{j'}\text{. } \end{aligned}$$

Output: Set \(\mathbf q ^*=\mathbf q (t)\).

Both LocalSearch and LS-WithCost generate in \(|P|\le n\) iterations of Loop 1 a partition \(\{A^*(1),\ldots ,A^*(|P|)\}=P\). Selected blocks \(A^*(t)\in \mathcal F^{t-1}\), \(1\le t\le |P|\) now are any of those feasible subsets where the minimum over elements \(i\in A^*(t)\) of \((i,A^*(t))\)-derivatives \(\hat{w}^t_\mathbf{q _{-i}}(A^*(t))\) is maximal. The following Loop 2 again checks whether the partition generated by Loop 1 may be improved by extracting some elements from existing blocks and putting them in singleton blocks of the final output.

Proposition 6

LS-WithCost \((\hat{w},\mathbf q )\) outputs a local maximizer \(\mathbf q ^*\).

Proof

Follows from Proposition 5 since Loop 2 deals with w, not with \(\hat{w}\).

Concerning input \(\mathbf q =\mathbf q (0)\), consider setting

$$\begin{aligned} q_i^A=\frac{\frac{w(A)}{|\{A':A'\in \mathcal F,A'\cap A\ne \emptyset \}|}}{\sum _{B\in \mathcal F_i}\frac{w(B)}{|\{B':B'\in \mathcal F,B'\cap B\ne \emptyset \}|}}\text { for all }A\in \mathcal F_i,i\in N\text {, entailing} \end{aligned}$$
$$\begin{aligned} \frac{q_i^A}{q_i^B}=\frac{w(A)|\{B':B'\in \mathcal F,B'\cap B\ne \emptyset \}|}{w(B)|\{A':A'\in \mathcal F,A'\cap A\ne \emptyset \}|}=\frac{q_j^A}{q_j^B} \end{aligned}$$

for all \(A,B\in \mathcal F_i\cap \mathcal F_j\) and \(i,j\in N\).

Evidently, Loop 1 may take exactly the same form as in LocalSearch, that is with selected blocks \(A^*(t)\in \mathcal F,t=1,\ldots ,|P|\) of the generated partition P being any of those feasible subsets where the average, rather than the minimum, over elements \(i\in A^*(t)\) of \((i,A^*(t))\)-derivatives \(\hat{w}^t_\mathbf{q _{-i}}(A^*(t))\) is maximal. This possibility may be useful in those settings where set packing appears in its weighted version, while using the minimum in place of the sum seems interesting for k-uniform (and d-regular) set packing problems (see Sect. 1). In fact, for the k-uniform case Möbius inversion is \(\mu ^{\hat{w}^t}(A)=\frac{1}{|\{B:B\in \mathcal F^{t-1},B\cap A\ne \emptyset \}|}\) if \(|A|=k\) and \(\mu ^{\hat{w}^t}(A)=0\) if \(|A|\in \{0,1\}\) for all \(A\in \mathcal F^{t-1}\) (recall the convention \(\emptyset \in \mathcal F\ni \{i\}\) for all \(i\in N\)). It is also plain that in k-uniform set packing Loop 2 is ineffective.

Example: let \(N=\{1,2,3\}\) and \(\mathcal F=\{\{1\},\{3\},\{1,2\},\{2,3\},\{1,2,3\}\}\), with \(w(\{1\})=1\), \(w(\{3\})=2=w(\{1,2\})\), \(w(\{2,3\})=3\), \(w(N)=3.5\). Accordingly,

$$\begin{aligned} \sum _{B\in \mathcal F_1}\frac{w(B)}{|\{B':B'\in \mathcal F,B'\cap B\ne \emptyset \}|}= & {} \frac{1}{3}+\frac{2}{4}+\frac{3.5}{5}\text{, }\\ \sum _{B\in \mathcal F_2}\frac{w(B)}{|\{B':B'\in \mathcal F,B'\cap B\ne \emptyset \}|}= & {} \frac{2}{4}+\frac{3}{4}+\frac{3.5}{5}\text{, }\\ \sum _{B\in \mathcal F_3}\frac{w(B)}{|\{B':B'\in \mathcal F,B'\cap B\ne \emptyset \}|}= & {} \frac{2}{3}+\frac{3}{4}+\frac{3.5}{5}\text{, } \end{aligned}$$

and thus input \(\mathbf q (0)\) defined above, i.e.

$$\begin{aligned} q_i^A(0)=\frac{\frac{w(A)}{|\{A':A'\in \mathcal F,A'\cap A\ne \emptyset \}|}}{\sum _{B\in \mathcal F_i}\frac{w(B)}{|\{B':B'\in \mathcal F,B'\cap B\ne \emptyset \}|}}\text { for all }A\in \mathcal F_i,i\in N\text {, yields} \end{aligned}$$
$$\begin{aligned} \left( \begin{array}{c}q_1^1(0)\simeq 0.217 \\ q_1^{12}(0)\simeq 0.326 \\ q_1^N(0)\simeq 0.457\end{array}\right) \text {, } \left( \begin{array}{c}q_2^{12}(0)\simeq 0.256 \\ q_2^{23}(0)\simeq 0.385 \\ q_2^N(0)\simeq 0.359\end{array}\right) \text {, } \left( \begin{array}{c}q_3^3(0)\simeq 0.315 \\ q_3^{23}(0)\simeq 0.354 \\ q_3^N(0)\simeq 0.331\end{array}\right) \text{. } \end{aligned}$$
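The input above can be recomputed with exact fractions. The sketch below is illustrative (the names `cost` and `initial_memberships` are assumptions) and counts intersections over the original five-set family, as the displayed sums do.

```python
from fractions import Fraction

fs = frozenset
family = [fs({1}), fs({3}), fs({1, 2}), fs({2, 3}), fs({1, 2, 3})]
w = dict(zip(family, (Fraction(1), Fraction(2), Fraction(2),
                      Fraction(3), Fraction(7, 2))))

def cost(A):
    """Number of feasible subsets intersecting A, A itself included."""
    return sum(1 for B in family if A & B)

def initial_memberships(ground):
    """q_i^A proportional to w(A)/c(A) over the feasible subsets containing i."""
    q = {}
    for i in ground:
        ratios = {A: w[A] / cost(A) for A in family if i in A}
        total = sum(ratios.values())
        q[i] = {A: r / total for A, r in ratios.items()}
    return q

q0 = initial_memberships((1, 2, 3))
# Decimal view, rounded to three places for comparison with the text.
q0_rounded = {i: {tuple(sorted(A)): round(float(x), 3) for A, x in q0[i].items()}
              for i in q0}
```

In exact terms the first column is \((5/23,15/46,21/46)\), which rounds to the three decimals reported above.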

The enlarged \(\mathcal F\) clearly is \(\mathcal F=\{\emptyset ,\{1\},\{2\},\{3\},\{1,2\},\{2,3\},\{1,2,3\}\}\), and in the first iteration \(t=1\) the situation concerning \(\hat{w}^t_\mathbf{q _{-i}}(A),i\in N,A\in \mathcal F_i\) is

$$\begin{aligned} \hat{w}^1_\mathbf{q _{-1}}(\{1\})=\frac{1}{3}&\text {and }\hat{w}^1_\mathbf{q _{-3}}(\{3\})=\frac{2}{3}\text { (and of course }\hat{w}^1_\mathbf{q _{-2}}(\{2\})=0\text {),}\\ \hat{w}^1_\mathbf{q _{-1}}(\{1,2\})= & {} 0.256\left( \frac{2}{4}-\frac{1}{3}\right) +\frac{1}{3}=0.376\text{, }\\ \hat{w}^1_\mathbf{q _{-2}}(\{1,2\})= & {} 0.326\left( \frac{2}{4}-\frac{1}{3}\right) =0.054\text{, }\\ \hat{w}^1_\mathbf{q _{-2}}(\{2,3\})= & {} 0.354\left( \frac{3}{4}-\frac{2}{3}\right) =0.030\text{, }\\ \hat{w}^1_\mathbf{q _{-3}}(\{2,3\})= & {} 0.385\left( \frac{3}{4}-\frac{2}{3}\right) +\frac{2}{3}=0.699\text{, } \end{aligned}$$
$$\begin{aligned} \hat{w}^1_\mathbf{q _{-1}}(N)= & {} (0.359\times 0.331)\left[ \frac{3.5}{5}-\left( \frac{2}{4}-\frac{1}{3}\right) -\left( \frac{3}{4}-\frac{2}{3}\right) -\frac{1}{3}-\frac{2}{3}\right] \\+ & {} 0.359\left( \frac{2}{4}-\frac{1}{3}\right) +\frac{1}{3}=0.328\text{, }\\ \hat{w}^1_\mathbf{q _{-2}}(N)= & {} (0.457\times 0.331)\left[ \frac{3.5}{5}-\left( \frac{2}{4}-\frac{1}{3}\right) -\left( \frac{3}{4}-\frac{2}{3}\right) -\frac{1}{3}-\frac{2}{3}\right] \\+ & {} 0.457\left( \frac{2}{4}-\frac{1}{3}\right) +0.331\left( \frac{3}{4}-\frac{2}{3}\right) =0.021\text{, }\\ \hat{w}^1_\mathbf{q _{-3}}(N)= & {} (0.457\times 0.359)\left[ \frac{3.5}{5}-\left( \frac{2}{4}-\frac{1}{3}\right) -\left( \frac{3}{4}-\frac{2}{3}\right) -\frac{1}{3}-\frac{2}{3}\right] \\+ & {} 0.359\left( \frac{3}{4}-\frac{2}{3}\right) +\frac{2}{3}=0.606\text{. } \end{aligned}$$

In view of line (a), the selected block at the first iteration is \(A^*(1)=\{3\}\). According to lines (b)–(d), \(q_3^3(1)=1\), \(q_1^1(1)=q_1^1(0)+\left( \frac{0.457}{3}\right) /\left( \frac{1}{3}+\frac{2}{4}\right) =0.4\) and \(q_1^{12}(1)=0.6\) as well as \(q_2^{12}(1)=1\). In the second iteration \(t=2\),

$$\begin{aligned} \hat{w}^2_\mathbf{q _{-1}}(\{1\})=\frac{1}{2}&\text { and }\hat{w}^2_\mathbf{q _{-2}}(\{2\})=0\text{, }\\ \hat{w}^2_\mathbf{q _{-1}}(\{1,2\})= & {} 1+\frac{1}{2}\text{, }\\ \hat{w}^2_\mathbf{q _{-2}}(\{1,2\})= & {} 0.6\text{, } \end{aligned}$$

and thus the selected block at the second iteration is \(A^*(2)=\{1,2\}\). In fact, \(\mathcal F^*=\{\{1,2\},\{3\}\}\) and \(\mathcal F^{**}=\{\{1\},\{2,3\}\}\) are the two optimal solutions, with \(W(\mathcal F^*)=W(\mathcal F^{**})=4\), and thus the second loop does not produce any change.
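The first iteration of this example (lines (a) to (d) of Loop 1) can be retraced with the sketch below. It is an illustration under stated assumptions, not a full implementation of LS-WithCost: all names are hypothetical, costs are counted over the original five-set family as in the displayed computations, the added sets \(\emptyset \) and \(\{2\}\) receive ratio 0, and exact fractions replace the rounded decimals.

```python
from fractions import Fraction

fs = frozenset
ground = (1, 2, 3)
original = [fs({1}), fs({3}), fs({1, 2}), fs({2, 3}), fs({1, 2, 3})]
w = dict(zip(original, (Fraction(1), Fraction(2), Fraction(2),
                        Fraction(3), Fraction(7, 2))))

def cost(A, fam):
    """Number of members of fam that A intersects (A itself included)."""
    return sum(1 for B in fam if A & B)

# Ratio w(A)/c(A) on the original family; added sets get value 0 (assumption).
w_hat = {A: w[A] / cost(A, original) for A in original}
for extra in [fs()] + [fs({i}) for i in ground]:
    w_hat.setdefault(extra, Fraction(0))

# Recursive Moebius inversion over the enlarged family, smallest sets first.
mu = {}
for A in sorted(w_hat, key=len):
    mu[A] = w_hat[A] - sum(mu[B] for B in w_hat if B < A)

# Input q(0): q_i^A proportional to w(A)/c(A) over the sets containing i.
q0 = {}
for i in ground:
    total = sum(w_hat[A] for A in w_hat if i in A)
    q0[i] = {A: w_hat[A] / total for A in w_hat if i in A}

def derivative(i, A, q):
    """Sum over B in F_i inside A of prod_{j in B, j != i} q_j^A * mu(B)."""
    total = Fraction(0)
    for B in mu:
        if i in B and B <= A:
            term = mu[B]
            for j in B - {i}:
                term *= q[j][A]
            total += term
    return total

# Line (a): pick a nonempty set maximising the minimum derivative over its elements.
scores = {A: min(derivative(i, A, q0) for i in A) for A in mu if A}
selected = max(scores, key=scores.get)

# Lines (b)-(d): members of the selected set concentrate on it, the others
# move the mass lost on intersecting sets to their remaining sets.
# (Assumes the remaining ratios of each element do not all vanish.)
q1 = {}
for j in ground:
    fam_j = list(q0[j])
    if j in selected:
        q1[j] = {A: Fraction(int(A == selected)) for A in fam_j}
        continue
    lost = sum(q0[j][B] for B in fam_j if B & selected)
    kept = sum(w_hat[B] for B in fam_j if not (B & selected))
    q1[j] = {A: Fraction(0) if A & selected else q0[j][A] + w_hat[A] * lost / kept
             for A in fam_j}
```

Running it selects \(\{3\}\) and yields \(q_1^1(1)=2/5\), \(q_1^{12}(1)=3/5\) and \(q_2^{12}(1)=1\), matching the first iteration reported above.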

5 Near-Boolean Functions

Boolean functions \(f:\{0,1\}^n\rightarrow \{0,1\}\) of n variables constitute a subclass of pseudo-Boolean functions \(f:\{0,1\}^n\rightarrow \mathbb R\). Following [17, p. 4], denote by \(N=\{1,\ldots ,n\}\) the set of indices of variables. As already explained, any pseudo-Boolean function has a unique expression as a multilinear polynomial in n variables \(x_1,\ldots ,x_n\in [0,1]\) of the form \(f(x_1,\ldots ,x_n)=\mathop {{\sum }}\limits _{A\subseteq N}\left( \alpha _A\mathop {\prod }\limits _{i\in A} x_i\right) \), since \(\alpha _A,A\in 2^N\) is in fact the Möbius inversion \(\mu ^w(A),A\in 2^N\) of a unique [14] set function \(w:2^N\rightarrow \mathbb R\) such that \(w(A)=f(\chi _A)\), where \(\chi _A\) is the characteristic function of A (see above and [1, p. 162]). The n variables thus range each in the unit interval [0, 1]. Such a setting is here expanded by letting each i-th variable, \(i=1,\ldots ,n\) range in a \((2^{n-1}-1)\)-dimensional unit simplex \(\varDelta _i\) whose extreme points are indexed by subsets \(A\in 2^N_i\). The goal is to evaluate peculiar collections of fuzzy subsets of N through the MLE given by expressions (1–2).

Let \(ex(\varDelta _N)=\times _{i\in N}ex(\varDelta _i)\). For every n-collection \((q_1,\ldots ,q_n)=\mathbf q \in ex(\varDelta _N)\) of extreme points of simplices \(\varDelta _1,\ldots ,\varDelta _n\) as defined in Sect. 2, denote by \(\hat{A}_1,\ldots ,\hat{A}_n\in 2^N\) the corresponding subsets, i.e. \(q_i^{\hat{A}_i}=1\) for all \(i\in N\). Let \(P(\mathbf q )\) be the partition obtained by putting in a same block any two \(i,j\in N\) such that \(\hat{A}_i=\hat{A}_j\), i.e. \(P(\mathbf q )=\{A:\hat{A}_i=\hat{A}_j\text { for all }i,j\in A,\hat{A}_{j'}\ne \hat{A}_i\text { for all }j'\in A^c\}\).

Definition 8

Near-Boolean functions of n variables \(q_i\in ex(\varDelta _i),i\in N\) are defined for a given set function \(w:2^N\rightarrow \mathbb R,w(\emptyset )=0\), and have form

$$\begin{aligned} F:ex(\varDelta _N)\rightarrow \mathbb R\text {, with }F(\mathbf q )=\sum _{A\in P(\mathbf q )}w(A)\text { for all }\mathbf q \in ex(\varDelta _N)\text{. } \end{aligned}$$

Definition 9

The MLE \(\hat{F}:\varDelta _N\rightarrow \mathbb R\) of near-Boolean functions F is the polynomial given by expression (2), i.e.

$$\begin{aligned} \hat{F}(\mathbf q )=\sum _{A\in 2^N}\left[ \sum _{B\supseteq A}\left( \prod _{i\in A}q_i^B\right) \right] \mu ^w(A)\text{, } \end{aligned}$$

with \(q_i=(q_i^{A_1},\ldots ,q_i^{A_{2^{n-1}}})\in \varDelta _i,i\in N\).
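Definitions 8 and 9 can be compared numerically on a small ground set. The following sketch is an illustration only: the helper names are assumptions, and the weights are those of the three-element example of Sect. 2. It evaluates both F and its MLE on every point of \(ex(\varDelta _N)\) for \(n=3\).

```python
from fractions import Fraction
from itertools import combinations, product

def powerset(s):
    """All subsets of s as frozensets."""
    s = tuple(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def mobius(w, ground):
    """mu(A) = w(A) minus the sum of mu(B) over proper subsets B of A."""
    mu = {}
    for A in sorted(powerset(ground), key=len):
        mu[A] = Fraction(w[A]) - sum(mu[B] for B in powerset(A) if B != A)
    return mu

def induced_partition(assignment):
    """P(q): i and j share a block iff they are assigned the same subset."""
    blocks = {}
    for i, A in assignment.items():
        blocks.setdefault(A, set()).add(i)
    return [frozenset(b) for b in blocks.values()]

def near_boolean(assignment, w):
    """F(q): sum of w over the blocks of P(q)."""
    return sum(w[A] for A in induced_partition(assignment))

def mle(q, mu):
    """F_hat(q): sum over A of [sum over B containing A of prod_{i in A} q_i^B] * mu(A)."""
    total = Fraction(0)
    for A in mu:
        for B in mu:
            if A and A <= B:
                term = mu[A]
                for i in A:
                    term *= q[i].get(B, 0)
                total += term
    return total

fs = frozenset
N = (1, 2, 3)
w = {fs(): 0, fs({1}): 1, fs({2}): 2, fs({3}): 1,
     fs({1, 2}): 4, fs({1, 3}): 3, fs({2, 3}): 4, fs(N): 2}
mu = mobius(w, N)

# Every extreme point assigns to each i one subset containing i.
choices = [[A for A in powerset(N) if i in A] for i in N]
values = []
for picks in product(*choices):
    assignment = dict(zip(N, picks))
    q = {i: {A: 1} for i, A in assignment.items()}
    values.append((near_boolean(assignment, w), mle(q, mu)))
```

On all \(2^{n(n-1)}=64\) extreme points the two evaluations coincide, exact or not, which is the sense in which the polynomial extends F.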

5.1 Approximations

In line with [17], the issue of approximating a near-Boolean function F for given set function w by means of the least squares LS criterion concerns how to determine a near-Boolean function \(F_k\) such that

$$\begin{aligned} \sum _\mathbf{q \in ex(\varDelta _N)}\left[ F(\mathbf q )-F_k(\mathbf q )\right] ^2 \end{aligned}$$
(8)

attains its minimum over all near-Boolean functions \(F_k\) with polynomial MLE \(\hat{F}_k\) of degree k, that is

$$\begin{aligned} \hat{F}_k(\mathbf q )=\sum _{\underset{|A|\le k}{A\in 2^N}}\left[ \sum _{B\supseteq A}\left( \prod _{i\in A}q_i^B\right) \right] \mu ^{w'}(A)\text{, } \end{aligned}$$

or, equivalently stated in terms of the underlying set function \(w'\) for \(F_k\), such that \(\mu ^{w'}(A)=0\) if \(|A|>k\).

Near-Boolean functions F take values on \(ex(\varDelta _N)\), and \(|ex(\varDelta _N)|=2^{n(n-1)}\). Therefore, they might be regarded as points in a \(2^{n(n-1)}\)-dimensional vector space, i.e. \(F\in \mathbb R^{2^{n(n-1)}}\). In view of Proposition 1 formalizing exactness, this seems conceptually incorrect and entails a useless enumerative burden. Specifically, for every partition P of N, with associated exact \(\mathbf p \in ex(\varDelta _N)\), there clearly exist many distinct non-exact \(\mathbf q \in ex(\varDelta _N)\) such that \(P=P(\mathbf q )\), entailing \(F(\mathbf q )=F(\mathbf p )\) (see Corollary 1 above). Counting these redundant non-exact n-collections of extreme points of simplices is worthless. Hence k-degree approximation is to be dealt with by replacing expression (8) with the following:

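$$\begin{aligned} \sum _{P\in \mathcal P^N}\left[ F(\mathbf p )-F_k(\mathbf p )\right] ^2\text{. } \end{aligned}$$
(9)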

Let \((\mathcal P^N, \wedge ,\vee )\) denote the (geometric) lattice [13] of partitions of N. The number \(|\mathcal P^N|\) of partitions of an n-set is given by the Bell number \(\mathcal B_n\) [13, 14] (see below). Accordingly, near-Boolean functions might be regarded as points \(F\in \mathbb R^{\mathcal B_n}\) in a \(\mathcal B_n\)-dimensional vector space, but this is again too large, as points in such a vector space correspond in fact to generic partition functions, whose Möbius inversion may take non-zero values on any partition \(P\in \mathcal P^N\). Conversely, near-Boolean functions only involve those \(h:\mathcal P^N\rightarrow \mathbb R\) such that \(h(P)=h_w(P)=\sum _{A\in P}w(A)\) for some set function \(w:2^N\rightarrow \mathbb R\). The Möbius inversion of these partition functions may take non-zero values only on the \(2^n-n\) modular elements [13, 18] of lattice \((\mathcal P^N, \wedge ,\vee )\), namely on those partitions with a number of non-singleton blocks \(\le 1\). This is shown below via recursion through the Möbius inversion of additively separable partition functions [19, 20]. When regarded as points in a vector space (i.e. expressed as a linear combination of a basis, see above) these functions must be regarded as \(h_w\in \mathbb R^{2^n-n}\).
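To give a feeling for the three dimensions just compared, the sketch below tabulates \(2^{n(n-1)}\), the Bell number \(\mathcal B_n\) and \(2^n-n\) for small n. It is only an illustration; the function names are assumptions, and Bell numbers are obtained through the standard Bell triangle recurrence.

```python
def bell(n):
    """Bell number B_n computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new = [row[-1]]          # each row starts with the last entry of the previous one
        for x in row:
            new.append(new[-1] + x)
        row = new
    return row[0]

def dimensions(n):
    """(|ex(Delta_N)|, number of partitions, number of modular partitions) for an n-set."""
    return 2 ** (n * (n - 1)), bell(n), 2 ** n - n

table = {n: dimensions(n) for n in range(1, 7)}
```

For \(n=4\) the three counts are 4096, 15 and 12, while for \(n=3\) the last two coincide (5 and 5), since every partition of a 3-set has at most one non-singleton block.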

It is worth emphasizing that while pseudo-Boolean functions admit a unique set function providing their best k-degree approximation, \(0\le k\le n\) [17], every near-Boolean function admits a continuum of set functions w determining its unique best k-degree approximation. In particular, consider first the linear case, i.e. \(k=1\). The issue is to find a best LS approximation \(F_1\) of any given F. That is, the set function \(w'\) determining \(F_1\) has to satisfy \(w'(A)=\sum _{i\in A}w'(\{i\})\) for all \(A\in 2^N\). Then,

$$\begin{aligned} h_{w'}(P)=\sum _{A\in P}w'(A)=\sum _{A\in P}\sum _{i\in A}w'(\{i\})=w'(N) \end{aligned}$$

for all \(P\in \mathcal P^N\). Thus \(h_{w'}\) is a constant partition function, or a valuation [13] of partition lattice \((\mathcal P^N, \wedge ,\vee )\). Moreover, any further linear \(v:2^N\rightarrow \mathbb R\) such that \(v(N)=w'(N)\) also satisfies \(h_v(P)=h_{w'}(P)\) for all \(P\in \mathcal P^N\). In other terms, there is a continuum of equivalent linear \(v\ne w'\) such that \(h_{w'}=h_v\), each obtained by distributing arbitrarily the whole of \(w'(N)\) over singletons \(\{i\}\in 2^N\). Cases \(k>1\) maintain this same feature: consider a set function \(w'\) such that \(\mu ^{w'}(A)\ne 0\) for one or more (possibly all \(\left( {\begin{array}{c}n\\ k\end{array}}\right) \)) subsets \(A\in 2^N\) such that \(|A|=k\). Now fix arbitrarily n values \(v(\{i\}),i\in N\) with \(\sum _{i\in N}w'(\{i\})=\sum _{i\in N}v(\{i\})\). For all \(A\in 2^N,|A|>1\) Möbius inversion \(\mu ^v:2^N\rightarrow \mathbb R\) can always be determined uniquely via recursion by

$$\begin{aligned} v(A)+\sum _{i\in A^c}v(\{i\})=\sum _{B\subseteq A}\mu ^v(B)+\sum _{i\in A^c}v(\{i\}) \end{aligned}$$
$$\begin{aligned} =w'(A)+\sum _{i\in A^c}w'(\{i\})=\sum _{B\subseteq A}\mu ^{w'}(B)+\sum _{i\in A^c}w'(\{i\})\text{. } \end{aligned}$$

In other terms [19, 20], if set function w additively separates partition function h, i.e. \(h=h_w\), and \(v'=w-v\) is a linear set function, then \(v+v'\) also additively separates h, i.e. \(h_w=h_{v+v'}\). Hence, there is a continuum of equivalent set functions w and \(v+v'\) available for the sought k-degree approximation \(F_k\), but still the \(\mathcal B_n\) values taken by \(F_k\) (more precisely, the \(2^n-n\) values taken by \(F_k\) on the modular elements of partition lattice \((\mathcal P^N, \wedge ,\vee )\)) are unique and independent from the chosen set function in the continuum of set functions \(w,v+v'\) available, each determining an equivalent MLE \(\hat{F}_k\) of \(F_k\). This is detailed hereafter.

5.2 Equivalent Polynomials

Möbius inversion applies to any (locally finite) poset, provided a bottom element exists [14]. For the Boolean lattice \((2^N,\cap ,\cup )\) of subsets of N ordered by inclusion \(\supseteq \) and the geometric lattice \((\mathcal P^N,\wedge ,\vee )\) of partitions of N ordered by coarsening \(\geqslant \) [13, 21], the bottom elements are, respectively, the empty set \(\emptyset \) and the finest partition \(P_{\bot }=\{\{1\},\ldots ,\{n\}\}\). For a lattice \((L,\wedge ,\vee )\) ordered by \(\geqslant \), with generic elements \(x,y,z\in L\) and bottom element \(x_{\bot }\), any lattice function \(f:L\rightarrow \mathbb R\) has Möbius inversion \(\mu ^f:L\rightarrow \mathbb R\) given by \(\mu ^f(x)=\sum _{x_{\bot }\leqslant y\leqslant x}\mu _L(y,x)f(y)\), where \(\mu _L\) is the Möbius function, defined recursively on ordered pairs \((y,x)\in L\times L\) by \(\mu _L(y,x)=-\sum _{y\leqslant z<x}\mu _L(y,z)\) if \(y<x\) (i.e. \(y\leqslant x\) and \(y\ne x\)) as well as \(\mu _L(y,x)=1\) if \(y=x\), while \(\mu _L(y,x)=0\) if \(y\not \leqslant x\). The Möbius function of the subset lattice is \(\mu _{2^N}(B,A)=(-1)^{|A\backslash B|}\), with \(B\subseteq A\). Concerning the Möbius function \(\mu _{\mathcal P^N}\) of \(\mathcal P^N\), given any \(P,Q\in \mathcal P^N\), if \(Q<P=\{A_1,\ldots ,A_{|P|}\}\), then for every \(A\in P\) there are \(B_1,\ldots ,B_{k_A}\in Q\) such that \(A=B_1\cup \cdots \cup B_{k_A}\), with \(k_A>1\) for at least one \(A\in P\). Segment \([Q,P]=\{P':Q\leqslant P'\leqslant P\}\) is isomorphic to product \(\times _{A\in P}\mathcal P(k_A)\), where \(\mathcal P(k)\) is the lattice of partitions of a k-set. Let \(m_k=|\{A:A\in P,k_A=k\}|\) for \(k=1,\ldots ,n\). Then [14, pp. 359–360],

$$\begin{aligned} \mu _{\mathcal P^N}(Q,P)=(-1)^{-n+\sum _{1\le k\le n}m_k}\prod _{1<k<n}(k!)^{m_{k+1}}\text{. } \end{aligned}$$
(10)
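The recursive definition of the Möbius function recalled above applies verbatim to any finite poset. The sketch below is a minimal illustration with hypothetical names (`make_mobius`, `leq`): it implements the recursion with memoization and checks it on the subset lattice of a 3-set, where the closed form \((-1)^{|A\backslash B|}\) is known.

```python
from functools import lru_cache
from itertools import combinations

def make_mobius(elements, leq):
    """Return mu(y, x) for the finite poset (elements, leq), via the recursive definition."""
    elements = list(elements)

    @lru_cache(maxsize=None)
    def mu(y, x):
        if y == x:
            return 1
        if not leq(y, x):
            return 0
        # mu(y, x) = - sum of mu(y, z) over all z with y <= z < x
        return -sum(mu(y, z) for z in elements
                    if leq(y, z) and leq(z, x) and z != x)

    return mu

def powerset(s):
    """All subsets of s as frozensets."""
    s = tuple(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

subsets = powerset((1, 2, 3))
mu_subsets = make_mobius(subsets, lambda a, b: a <= b)
closed_form_holds = all(mu_subsets(B, A) == (-1) ** len(A - B)
                        for A in subsets for B in subsets if B <= A)
```

The same function can be applied to the partition lattice by passing its elements (as hashable objects) and the coarsening relation as `leq`.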

A partition function \(h:\mathcal P^N\rightarrow \mathbb R\) may be said to be additively separable [19, 20] if there is a set function \(v:2^N\rightarrow \mathbb R\) such that \(h(P)=\sum _{A\in P}v(A)\) for all \(P\in \mathcal P^N\), with the notation \(h=h_v\). Möbius inversion \(\mu ^{h_v}\) may take non-zero values only on the \(2^n-n\) modular elements of the partition lattice, namely the bottom \(P_{\bot }\) and top \(P^{\top }=\{N\}\), together with all those partitions of the form \(\{A\}\cup P^{A^c}_{\bot }\) for \(A\in 2^N\) such that \(1<|A|<n\), where \(P_{\bot }^{A^c}\) is the finest partition of \(A^c\) [13, Ex. 13, p. 71]. The Möbius inversion of an additively separable partition function \(h_v\) is now detailed [19, Prop. 4.4, p. 138 and Appendix, p. 144], [20, Prop. 3.3, p. 452].

Proposition 7

If \(h=h_v\), then \(h=h_w\) for a continuum of set functions \(w:2^N\rightarrow \mathbb R,w\ne v\).

Proof

By direct substitution,

$$\begin{aligned} \mu ^{h_v}(P)=\sum _{A\in P}\sum _{B\subseteq A}v(B)\sum _{Q\leqslant P:B\in Q}\mu _{\mathcal P^N}(Q,P)\text { for all }P\in \mathcal P^N\text{. } \end{aligned}$$

Now if \(P\ne \{A\}\cup P^{A^c}_{\bot }\), then the recursive definition of \(\mu _{\mathcal P^N}\) yields

$$\begin{aligned} \sum _{P'\leqslant P:A\in P'}\mu _{\mathcal P^N}(P',P)=\sum _{\{A\}\cup P^{A^c}_{\bot }\leqslant P'\leqslant P}\mu _{\mathcal P^N}(P',P)=0\text{, } \end{aligned}$$

and the same for \(B\subset A\). Thus, \(\mu ^{h_v}\) may take non-zero values only on modular partitions, where it obtains recursively by \(\mu ^{h_v}(P_{\bot })=\sum _{i\in N}v(\{i\})\) and \(\mu ^{h_v}(\{A\}\cup P_{\bot }^{A^c})=\mu ^v(A)\) for \(1<|A|<n\) as well as \(\mu ^{h_v}(P^{\top })=\mu ^v(N)\). Accordingly, any \(w\ne v\) satisfying \(\sum _{i\in N}v(\{i\})=\sum _{i\in N}w(\{i\})\) and \(\mu ^v(A)=\mu ^w(A)\) for all \(A\in 2^N\) such that \(|A|>1\) also additively separates h, i.e. \(h_v=h_w\).    \(\square \)
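Proposition 7 can be illustrated by enumeration on a 4-set. The sketch below is only an illustration under assumptions: the helper names are hypothetical, the set function \(v(A)=|A|^2\) and the singleton values \((0,2,0,2)\) are one arbitrary choice among the continuum of admissible ones. It builds w from v by keeping the Möbius inversion on non-singletons and the sum over singletons, and then compares \(h_v\) and \(h_w\) on all partitions.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s as frozensets."""
    s = tuple(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def mobius(w):
    """mu(A) = w(A) minus the sum of mu(B) over proper subsets B of A (w defined on 2^N)."""
    mu = {}
    for A in sorted(w, key=len):
        mu[A] = w[A] - sum(mu[B] for B in powerset(A) if B != A)
    return mu

def partitions(elements):
    """Yield all partitions of `elements` as lists of frozenset blocks."""
    elements = list(elements)
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for r in range(len(rest) + 1):
        for others in combinations(rest, r):
            block = frozenset((first,) + others)
            remaining = [e for e in rest if e not in block]
            for p in partitions(remaining):
                yield [block] + p

N = (1, 2, 3, 4)
v = {A: len(A) ** 2 for A in powerset(N)}
mu_v = mobius(v)

# Same Moebius values on non-singletons, different singleton values with equal sum.
mu_w = dict(mu_v)
for i, value in zip(N, (0, 2, 0, 2)):
    mu_w[frozenset({i})] = value
w = {A: sum(mu_w[B] for B in powerset(A)) for A in powerset(N)}

h_v = [sum(v[A] for A in P) for P in partitions(N)]
h_w = [sum(w[A] for A in P) for P in partitions(N)]
```

The two lists agree on all 15 partitions although \(w\ne v\); for instance \(w(\{1,2\})=4\) and \(w(\{2,4\})=6\).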

The degree of a polynomial is the highest degree of its terms. In expression (2), for any given set function w, the degree is \(\max \{|A|:\mu ^w(A)\ne 0\}\), while every non-zero value of Möbius inversion \(\mu ^w\) is a coefficient of the polynomial. For any degree \(k,0<k\le n\), there exists a continuum of set functions such that \(\max \{|A|:\mu ^w(A)\ne 0\}=k\), each defining alternative but equivalent coefficients of the polynomial.

Example: let \(N=\{1,2,3,4\}\) and consider the (symmetric) set function \(v:2^N\rightarrow \mathbb R_+\) defined by \(v(A)=|A|^2\), with Möbius inversion \(\mu ^v(A)=1\) if \(|A|=1\) and \(\mu ^v(A)=4-1-1=2\) if \(|A|=2\), while \(\mu ^v(A)=0\) if \(|A|\in \{0,3,4\}\). On partitions P of N, the associated additively separable partition function \(h_v\) thus takes values \(h_v(P)=\sum _{A\in P}|A|^2\). For instance, on partitions \(P^1=12|3|4\), \(P^2=13|24\) and \(P^3=123|4\) (with vertical bar | separating blocks), these values are \(h_v(P^1)=4+1+1=6\), \(h_v(P^2)=4+4=8\) and \(h_v(P^3)=9+1=10\). Now let set function \(w:2^N\rightarrow \mathbb R_+,w(\emptyset )=0\) take values \(w(\{1\})=0=w(\{3\})\) and \(w(\{2\})=2=w(\{4\})\) on singletons, hence \(\sum _{i\in N}v(\{i\})=4=\sum _{i\in N}w(\{i\})\). In order to have \(h_w(P)=h_v(P)\) for all partitions P, it must be \(\mu ^v(A)=\mu ^w(A)\) for all \(A\in 2^N\) such that \(|A|>1\). Hence

$$\begin{aligned} \mu ^w(\{1,2\})=2=\mu ^v(\{1,2\})\Rightarrow & {} w(\{1,2\})=2+w(\{1\})+w(\{2\})=4\text{, }\\ \mu ^w(\{1,3\})=2=\mu ^v(\{1,3\})\Rightarrow & {} w(\{1,3\})=2+w(\{1\})+w(\{3\})=2\text{, }\\ \mu ^w(\{1,4\})=2=\mu ^v(\{1,4\})\Rightarrow & {} w(\{1,4\})=2+w(\{1\})+w(\{4\})=4\text{, }\\ \mu ^w(\{2,3\})=2=\mu ^v(\{2,3\})\Rightarrow & {} w(\{2,3\})=2+w(\{2\})+w(\{3\})=4\text{, }\\ \mu ^w(\{2,4\})=2=\mu ^v(\{2,4\})\Rightarrow & {} w(\{2,4\})=2+w(\{2\})+w(\{4\})=6\text{, }\\ \mu ^w(\{3,4\})=2=\mu ^v(\{3,4\})\Rightarrow & {} w(\{3,4\})=2+w(\{3\})+w(\{4\})=4\text{, }\\ \end{aligned}$$

as well as

$$\begin{aligned} \mu ^w(\{1,2,3\})=0=\mu ^v(\{1,2,3\})\Rightarrow & {} w(\{1,2,3\})=2+2+2+2=8\text{, }\\ \mu ^w(\{1,2,4\})=0=\mu ^v(\{1,2,4\})\Rightarrow & {} w(\{1,2,4\})=2+2+2+2+2=10\text{, }\\ \mu ^w(\{1,3,4\})=0=\mu ^v(\{1,3,4\})\Rightarrow & {} w(\{1,3,4\})=2+2+2+2=8\text{, }\\ \mu ^w(\{2,3,4\})=0=\mu ^v(\{2,3,4\})\Rightarrow & {} w(\{2,3,4\})=2+2+2+2+2=10\text{, } \end{aligned}$$

and finally

$$\begin{aligned} \mu ^w(N)=0=\mu ^v(N)\Rightarrow w(N)=2+2+2+2+2+2+4=16\text{. } \end{aligned}$$

It is thus readily checked that the desired condition \(h_w(P)=h_v(P)\) for all partitions P holds. In particular, \(h_w(P^1)=4+2=6=h_v(P^1)\) as well as \(h_w(P^2)=2+6=8=h_v(P^2)\) and \(h_w(P^3)=8+2=10=h_v(P^3)\).
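The check can also be carried out exhaustively over all 15 partitions of N. The following Python fragment is only a sketch under stated assumptions: the helper names `subsets`, `mobius`, `zeta`, `partitions` and `h` are illustrative and not taken from the cited references, and set functions are stored as dictionaries indexed by frozensets.

```python
from itertools import combinations

N = (1, 2, 3, 4)

def subsets(ground):
    """All subsets of `ground`, as frozensets (helper name is illustrative)."""
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def mobius(v):
    """Moebius inversion of a set function stored as dict frozenset -> value."""
    return {A: sum((-1) ** (len(A) - len(B)) * v[B] for B in subsets(A))
            for A in v}

def zeta(mu):
    """Inverse transform: w(A) is the sum of mu(B) over all subsets B of A."""
    return {A: sum(mu[B] for B in subsets(A)) for A in mu}

def partitions(items):
    """Yield every partition of the list `items` as a list of frozenset blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        yield [frozenset([first])] + p            # `first` as a singleton block
        for k in range(len(p)):                   # or `first` joins block k
            yield p[:k] + [p[k] | {first}] + p[k + 1:]

def h(v, P):
    """Additively separable partition function h_v(P)."""
    return sum(v[A] for A in P)

# v(A) = |A|^2 as in the example.
v = {A: len(A) ** 2 for A in subsets(N)}
mu_v = mobius(v)

# Keep mu on all |A| > 1, change the singleton values to (0, 2, 0, 2).
mu_w = dict(mu_v)
for i, value in zip(N, (0, 2, 0, 2)):
    mu_w[frozenset([i])] = value
w = zeta(mu_w)

all_partitions = list(partitions(list(N)))
same = all(h(v, P) == h(w, P) for P in all_partitions)
print(len(all_partitions), same)
```

Any other choice of singleton values summing to 4 could replace `(0, 2, 0, 2)` in the loop, which is the continuum asserted by Proposition 7.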

5.3 MLE of Partition Functions

Although the lattice \((\mathcal P^N,\wedge ,\vee )\) of partitions of a finite set N is fundamental in combinatorial theory [13], and although “partitions are of central importance in the study of symmetric functions, a class of functions that pervades mathematics in general” [22, p. 39], the polynomial multilinear extension has thus far been investigated only for set functions [1, 4]. (On symmetric function theory see [23, Chapt. 5], [24], [25, Chapt. 7].) Therefore, the purpose of this subsection is to briefly present the MLE of partition functions (and thus more generally of functions taking real values on a geometric lattice [13, p. 52]) by focusing on atoms of the partition lattice and on additive separability.

The rank function \(r:\mathcal P^N\rightarrow \mathbb Z_+\) of the partition lattice is \(r(P)=n-|P|\), with \(r(P_{\bot })=0\) for the bottom element. Atoms are immediately above, with rank 1, in the associated Hasse diagram. This latter is ordered by coarsening \(\geqslant \), with coarser partitions in upper levels [13, 21], where \(P\geqslant Q\) means that every block of Q is included in some block of P. Hence atoms are those partitions consisting of \(n-1\) blocks, namely \(n-2\) singletons and one pair. These \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \) unordered pairs \(\{i,j\}\in N_2\) are the same atoms as in subset lattice \((2^{N_2},\cap ,\cup )\), where \(N_2=\{\{i,j\}:1\le i<j\le n\}\) is the \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-set of 2-cardinal subsets of N. For notational convenience, let \([ij]\in \mathcal P^N\) denote the atom where the unique 2-cardinal block is (unordered) pair \(\{i,j\}\in [ij]\).

In order to replace subsets with partitions, let \(\mathcal P^N_{(1)}=\{[ij]:1\le i<j\le n\}\) be the \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-set of atoms of the partition lattice, i.e. \(\mathcal P^N_{(1)}\cong N_2\). The analog of characteristic function \(\chi _A,A\in 2^N\) is indicator function \(I_P:\mathcal P^N_{(1)}\rightarrow \{0,1\}\), with

$$\begin{aligned} I_P([ij])=\left\{ \begin{array}{c}1\text { if }P\geqslant [ij]\text{, }\\ 0\text { if }P\not \geqslant [ij]\text {,} \end{array}\right. \text {for all }P\in \mathcal P^N,[ij]\in \mathcal P^N_{(1)}\text{. } \end{aligned}$$

In words, if pair \(\{i,j\}\) is included in some block A of P (i.e. \(\{i,j\}\subseteq A\in P\)), then partition P is coarser than atom [ij], and the corresponding position \(I_P([ij])\) of indicator array \(I_P\) has entry 1. Otherwise, that position is 0. Hence \(I_P\) is a Boolean \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-vector just like \(\chi _A\) is a Boolean n-vector. As detailed in the previous sections, the MLE of set functions extends them from the \(2^n\)-set \(\{0,1\}^n\) of vertices of the unit n-dimensional hypercube \([0,1]^n\) to the whole of this latter. In order to do the same with partition functions, firstly it must be clear that those Boolean \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-vectors \(I_P,P\in \mathcal P^N\) corresponding to the indicator functions of partitions do not span the whole \(2^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\)-set \(\{0,1\}^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\) of vertices of the \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-dimensional unit hypercube \([0,1]^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\). Minimally, this is already observable when \(N=\{1,2,3\}\) as in the above examples, since there are 5 partitions: the finest \(\{\{1\},\{2\},\{3\}\}\) and coarsest \(\{1,2,3\}\) ones, together with the \(\left( {\begin{array}{c}3\\ 2\end{array}}\right) =3\) atoms \([12]=\{\{1,2\},\{3\}\}\), \([13]=\{\{1,3\},\{2\}\}\) and \([23]=\{\{2,3\},\{1\}\}\). 
This means that those three vertices of \([0,1]^3\) corresponding to \([12]\vee [23]\), \([12]\vee [13]\) and \([13]\vee [23]\) all collapse on vertex \(I_{P^{\top }}\) corresponding to the coarsest partition \([12]\vee [13]\vee [23]=\{1,2,3\}\) (here the top element among partitions is \(P^{\top }=\{N\}\), and \(I_{P^{\top }}\) denotes the \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-vector with all entries equal to 1; also, \(\vee \) is the finest-coarser-than operator or join, while \(\wedge \) is the coarsest-finer-than operator or meet).

The number \(|\mathcal P^N|\) of partitions of an n-set is the n-th Bell number \(\mathcal B_n<2^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\) (\(n>2\)) [13, pp. 70, 92] given by recursion \(\mathcal B_0=1\) and \(\mathcal B_n=\sum _{0\le k<n}\left( {\begin{array}{c}n-1\\ k\end{array}}\right) \mathcal B_k\). In view of the above argument, the MLE of partition functions extends them from the \(\mathcal B_n\)-set \(\{I_P:P\in \mathcal P^N\}\subset \{0,1\}^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\) of those vertices of the unit \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-dimensional hypercube \([0,1]^{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\) corresponding to the indicator functions of partitions to the whole convex hull \(co(\{I_P:P\in \mathcal P^N\})\) whose extreme points are precisely all \(\mathcal B_n\) Boolean \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \)-vectors given by indicator functions \(I_P,P\in \mathcal P^N\). For notational convenience, let \(\mathbb P_N=co(\{I_P:P\in \mathcal P^N\})\) be such a convex hull. When \(N=\{1,2,3\}\) as above, convex polytope \(\mathbb P_N\) is strictly included in \([0,1]^3\) and its five extreme points [16, 26] are (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1) and (1, 1, 1). Also, \(ex(\mathbb P_N)=\{I_P:P\in \mathcal P^N\}\) is the set of extreme points of \(\mathbb P_N\).
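As a small illustration of this counting argument, the sketch below computes Bell numbers by the stated recursion and lists the indicator vectors \(I_P\) for \(N=\{1,2,3\}\). The function names `bell`, `partitions` and `indicator` are illustrative assumptions, and pairs are ordered lexicographically as (1,2), (1,3), (2,3).

```python
from itertools import combinations
from math import comb

def bell(n):
    """n-th Bell number via B_0 = 1 and B_m = sum_{0<=k<m} C(m-1, k) B_k."""
    B = [1]
    for m in range(1, n + 1):
        B.append(sum(comb(m - 1, k) * B[k] for k in range(m)))
    return B[n]

def partitions(items):
    """Yield every partition of the list `items` as a list of frozenset blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        yield [frozenset([first])] + p
        for k in range(len(p)):
            yield p[:k] + [p[k] | {first}] + p[k + 1:]

def indicator(P, n):
    """Boolean C(n,2)-vector I_P: entry 1 iff pair {i,j} lies inside a block of P."""
    return tuple(int(any(i in A and j in A for A in P))
                 for i, j in combinations(range(1, n + 1), 2))

n = 3
vertices = sorted({indicator(P, n) for P in partitions(list(range(1, n + 1)))})
print(bell(n), 2 ** comb(n, 2))
print(vertices)
```

The three hypercube vertices with exactly two entries equal to 1 do not appear in `vertices`, in agreement with the collapse onto \(I_{P^{\top }}\) described above.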

Once this novel geometric perspective is clear, the MLE \(f^h:\mathbb P_N\rightarrow \mathbb R\) of partition functions \(h:\mathcal P^N\rightarrow \mathbb R\) readily obtains by means of the Möbius inversion \(\mu ^h:\mathcal P^N\rightarrow \mathbb R\) detailed in the previous subsection, just in the same way as for the MLE of set functions. In fact, consider \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \) variables \(y_{[ij]_1},\ldots ,y_{[ij]_{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }}\in [0,1]\) indexed by atoms \([ij]_1,\ldots ,[ij]_{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }\in \mathcal P^N_{(1)}\). These are the analog of the n variables \(x_1,\ldots ,x_n\in [0,1]\) indexed by atoms \(\{i\},i\in N\) of Boolean lattice \((2^N,\cap ,\cup )\) used for the MLE \(f^w:[0,1]^n\rightarrow \mathbb R\) of set functions \(w:2^N\rightarrow \mathbb R\) detailed in Sect. 2. For any partition function h, define its MLE \(f^h\) by

$$\begin{aligned} f^h\left( y_{[ij]_1},\ldots ,y_{[ij]_{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }}\right) =\sum _{P\in \mathcal P^N}\left( \prod _{[ij]\leqslant P}y_{[ij]}\right) \mu ^h(P)\text{, } \end{aligned}$$
(11)

with \(\mu ^h\) given by

$$\begin{aligned} \mu ^h(P)=\sum _{Q\in \mathcal P^N}\mu _{\mathcal P^N}(Q,P)h(Q)=\sum _{Q\leqslant P}\mu _{\mathcal P^N}(Q,P)h(Q)=h(P)-\sum _{Q<P}\mu ^h(Q)\text{. } \end{aligned}$$

That is, \(\mu ^h\) is the Möbius inversion of h in view of expression (10) defining the Möbius function of \(\mathcal P^N\) from the previous subsection. Evidently, \(f^h\) is multilinear. To see that (11) is indeed the sought extension, simply note that it coincides with h on \(ex(\mathbb P_N)\), i.e.

$$\begin{aligned} f^h(I_P)=\sum _{Q\in \mathcal P^N}\left( \prod _{[ij]\leqslant Q}I_P([ij])\right) \mu ^h(Q)=\sum _{Q\leqslant P}\mu ^h(Q)=h(P) \end{aligned}$$

for all \(P\in \mathcal P^N\), since

$$\begin{aligned} \prod _{[ij]\leqslant Q}I_P([ij])=\left\{ \begin{array}{c}1\text { if }Q\leqslant P\text{, }\\ 0\text { if }Q\not \leqslant P\text{. } \end{array}\right. \end{aligned}$$

This is because for any partition \(Q\in \mathcal P^N,Q\not \leqslant P\) there is some atom \([ij]^*\) such that \(Q\geqslant [ij]^*\not \leqslant P\), entailing \(I_P([ij]^*)=0\) and thus \(\mathop {{\prod }}\limits _{[ij]\leqslant Q} I_P([ij])=0\). Note that the analog of convention \(\mathop {{\prod }}\limits _{i\in \emptyset } x_i:=1\) applying to set functions [1, p. 157] now is \(\mathop {{\prod }}\limits _{[ij]\leqslant P_{\bot }} y_{[ij]}:=1\), or more generally \(\mathop {{\prod }}\limits _{[ij]\in \emptyset } y_{[ij]}:=1\), hence for any \(y=\left( y_{[ij]_1},\ldots ,y_{[ij]_{\left( {\begin{array}{c}n\\ 2\end{array}}\right) }}\right) \in \mathbb P_N\) it holds

$$\begin{aligned} f^h(y)=h(P_{\bot })+\sum _{P>P_{\bot }}\left( \prod _{[ij]\leqslant P}y_{[ij]}\right) \mu ^h(P)\text{. } \end{aligned}$$

This makes it possible to briefly check how the extension works for additively separable partition functions \(h_w\), in that the results obtained in the previous subsection yield

$$\begin{aligned} f^{h_w}(y)= & {} h_w(P_{\bot })+\underset{P\text { modular}}{\sum _{P>P_{\bot }}}\left( \prod _{[ij]\leqslant P}y_{[ij]}\right) \mu ^{h_w}(P)\\= & {} \sum _{i\in N}w(\{i\})+\underset{|A|>1}{\sum _{A\in 2^N}}\left( \prod _{\{i,j\}\subseteq A}y_{[ij]}\right) \mu ^w(A)\text{. } \end{aligned}$$
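Expression (11) and its additively separable special case can be checked numerically on a small ground set. The fragment below is a minimal sketch, assuming \(n=4\), the recursive form of \(\mu ^h\) given above, and variables \(y_{[ij]}\) stored in a dictionary keyed by pairs; the partition function `h` is an arbitrary hypothetical choice, and all helper names are illustrative.

```python
from itertools import combinations
from math import prod

def partitions(items):
    """Yield every partition of the list `items` as a list of frozenset blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        yield [frozenset([first])] + p
        for k in range(len(p)):
            yield p[:k] + [p[k] | {first}] + p[k + 1:]

def finer(Q, P):
    """True iff Q <= P, i.e. every block of Q is included in some block of P."""
    return all(any(B <= A for A in P) for B in Q)

def atoms_below(P):
    """Pairs (i, j), i < j, included in some block of P, i.e. atoms [ij] <= P."""
    return [pair for A in P for pair in combinations(sorted(A), 2)]

def mobius_inversion(h, lattice):
    """mu^h(P) = h(P) - sum_{Q<P} mu^h(Q), visiting finer partitions first."""
    mu = {}
    for P in sorted(lattice, key=len, reverse=True):   # more blocks = finer
        mu[P] = h(P) - sum(mu[Q] for Q in mu if finer(Q, P))
    return mu

def mle(mu, y):
    """f^h(y) as in (11); the empty product (bottom partition) equals 1."""
    return sum(prod(y[pair] for pair in atoms_below(P)) * m
               for P, m in mu.items())

def indicator(P, n):
    """I_P as a dict pair -> 0/1."""
    return {(i, j): int(any(i in A and j in A for A in P))
            for i, j in combinations(range(1, n + 1), 2)}

n = 4
lattice = [frozenset(P) for P in partitions(list(range(1, n + 1)))]

def h(P):
    """Arbitrary (hypothetical) partition function used only for the check."""
    return len(P) * max(len(A) for A in P)

mu_h = mobius_inversion(h, lattice)
agrees = all(mle(mu_h, indicator(P, n)) == h(P) for P in lattice)

def h_v(P):
    """Additively separable case with v(A) = |A|^2."""
    return sum(len(A) ** 2 for A in P)

mu_hv = mobius_inversion(h_v, lattice)
support = [P for P, m in mu_hv.items() if m != 0]
# Modular partitions have at most one non-singleton block.
modular_only = all(sum(len(A) > 1 for A in P) <= 1 for P in support)

# Midpoint between I_bottom and I_top, a point of the convex hull.
y_mid = {pair: 0.5 for pair in combinations(range(1, n + 1), 2)}
value_mid = mle(mu_hv, y_mid)
print(agrees, modular_only, len(support), value_mid)
```

For \(v(A)=|A|^2\) the support of \(\mu ^{h_v}\) reduces to the bottom partition and the six atoms, so at the midpoint the value is \(4+6\cdot 0.5\cdot 2=10\).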

Finally, for the sake of completeness, it can be mentioned that symmetric set functions w satisfy \(|A|=|B|\Rightarrow w(A)=w(B)\) (or \(|A|=|B|\Rightarrow \mu ^w(A)=\mu ^w(B)\)) for all \(A,B\in 2^N\). On the other hand, symmetric partition functions h satisfy \(c^P=c^Q\Rightarrow h(P)=h(Q)\) (or \(c^P=c^Q\Rightarrow \mu ^h(P)=\mu ^h(Q)\)) for all \(P,Q\in \mathcal P^N\), where \(c^P=(c^P_1,\ldots ,c^P_n)\in \mathbb Z^n_+\) is the class of P, i.e. \(c^P_k=|\{A:A\in P,|A|=k\}|\) is the number of k-cardinal blocks of P, for \(1\le k\le n\) [14]. Therefore, if \(h_w\) is additively separated by a symmetric set function w, then \(h_w\) is symmetric as well. Nevertheless, within the continuum of set functions additively separating \(h_w\), only w is symmetric, while all other additively separating set functions v (i.e. such that \(h_w=h_v\)) are not symmetric.

6 Near-Boolean Games

In view of the above definition of local maximizers relying on equilibrium conditions for strategic n-player games, and having mentioned additive separability of partition functions or global games [19], it now seems useful to regard variables as players in near-Boolean coalition formation games (for pseudo-Boolean functions and coalitional games see [17, Sect. 3]).

Definition 10

A near-Boolean n-player game is a triple \((N,F,\pi )\) such that \(N=\{1,\ldots ,n\}\) is the player set and F is a near-Boolean function taking real values on profiles \(\mathbf q =(q_1,\ldots ,q_n)\in ex(\varDelta _N)\) of strategies, while payoffs are efficiently assigned by \(\pi :ex(\varDelta _N)\rightarrow \mathbb R^n\) to players, i.e. \(\pi (\mathbf q )=(\pi _1(\mathbf q ),\ldots ,\pi _n(\mathbf q ))\) satisfies \(\sum _{i\in N}\pi _i(\mathbf q )=F(\mathbf q )\) at all \(\mathbf q \in ex(\varDelta _N)\).

Definition 11

A fuzzy near-Boolean n-player game is a triple \((N,\hat{F},\pi )\) such that N is the player set and \(\hat{F}\) is the MLE of a near-Boolean function taking values on profiles \(\mathbf q =(q_1,\ldots ,q_n)\in \varDelta _N\) of strategies, while \(\pi :\varDelta _N\rightarrow \mathbb R^n\) efficiently assigns payoffs to players, i.e. \(\sum _{i\in N}\pi _i(\mathbf q )=\hat{F}(\mathbf q )\) at all \(\mathbf q \in \varDelta _N\).

In both near-Boolean games and fuzzy ones the player set is finite. Given this, a main distinction is between games where players have either finite or else infinite sets of strategies, with near-Boolean games in the former class and fuzzy ones in the latter. In addition, players may play either deterministic (i.e. pure) or else random (i.e. mixed) strategies. In the latter case equilibrium conditions are stated in terms of expected payoffs, and by means of fixed point arguments for upper hemicontinuous correspondences such conditions are commonly fulfilled [15, p. 260]. The sets of deterministic strategies in fuzzy near-Boolean games are precisely the sets of random strategies in near-Boolean games. Nevertheless, the payoffs for the fuzzy setting are not, in general, expectations.

The framework where these games seem useful is coalition formation, which combines both strategic and cooperative games [27]. A generic strategy profile \(\mathbf q \in ex(\varDelta _N)\) of near-Boolean (non-fuzzy) games may well fail to be exact, but it is plain from Sect. 5 that the partition \(P(\mathbf q )\) of N with associated exact \(\mathbf p \in ex(\varDelta _N)\) satisfies \(F(\mathbf p )=F(\mathbf q )\). In this view, near-Boolean games model strategic coalition formation in a very handy manner, in that they totally by-pass the need to define a mechanism mapping strategy profiles into partitions of players or coalition structures. More precisely, a mechanism is a mapping \(M:ex(\varDelta _N)\rightarrow \mathcal P^N\) such that when each player \(i\in N\) specifies a coalition \(A_i\in 2^N_i\), then \(M(A_1,\ldots ,A_n)=P\) is a resulting coalition structure. If the n specified coalitions \(A_i,i\in N\) are such that for some partition P it holds \(A_i=A\) for all \(i\in A\) and all \(A\in P\), then \(M(A_1,\ldots ,A_n)=P\). Otherwise, the generated partition \(P'=M(A_1,\ldots ,A_n)\) depends on what mechanism is chosen, and generally may be a rather fine one, i.e. possibly consisting of many small blocks. In contrast, near-Boolean games do not need any mechanism, in that even if players’ strategies \((q_1,\ldots ,q_n)=\mathbf q \) are such that \(\mathbf q \) is not exact and thus does not yield a partition, still the global worth \(F(\mathbf q )=F(\mathbf p )\) is that attained at the partition \(P(\mathbf q )\) (with associated exact \(\mathbf p \)) whose blocks \(A\in P\) each include maximal subsets of players choosing the same superset \(A'\supseteq A\).
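The passage from a non-exact profile to \(P(\mathbf q )\) may be sketched as follows. This is only an illustration under assumptions: a pure strategy is encoded as the coalition chosen by each player, blocks are obtained by grouping players that chose the same coalition (as the last sentence above describes), and the names `induced_partition`, `worth` and the coalitional game `w` are hypothetical.

```python
from collections import defaultdict

def induced_partition(choices):
    """Group players by the coalition they chose.

    `choices` maps each player i to a coalition A_i containing i.
    Each block collects all players choosing the same coalition.
    """
    groups = defaultdict(set)
    for player, coalition in choices.items():
        if player not in coalition:
            raise ValueError(f"player {player} must belong to the chosen coalition")
        groups[frozenset(coalition)].add(player)
    # Sort blocks by smallest member only to get a deterministic listing.
    return sorted((frozenset(block) for block in groups.values()), key=min)

def worth(P, w):
    """Global worth of a partition: sum of w over its blocks."""
    return sum(w(A) for A in P)

def w(A):
    """Hypothetical coalitional game used for illustration."""
    return len(A) ** 2

# Non-exact profile: players 1 and 2 choose {1,2,3}, but player 3 chooses {3,4}.
q = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {3, 4}, 4: {3, 4}}
P_q = induced_partition(q)

# Associated exact profile: every player chooses the own block of P(q).
p = {i: set(A) for A in P_q for i in A}

print(P_q, worth(P_q, w), induced_partition(p) == P_q)
```

No mechanism is specified anywhere in this sketch: the partition is read directly off the chosen coalitions, and the exact profile `p` reproduces the same blocks.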

For a (non-fuzzy) near-Boolean game \((N,F,\pi )\), with underlying set function or coalitional game \(w:2^N\rightarrow \mathbb R_+,w(\emptyset )=0\), consider the case where payoffs \(\pi \) are defined as follows: for every block \(A\in P(\mathbf q )\) of the partition \(P(\mathbf q )\) associated with any strategy profile \(\mathbf q \in ex(\varDelta _N)\) and for every player \(i\in A\), let

$$\begin{aligned} \pi _i(\mathbf q )=\sum _{B\in 2^A\backslash 2^{A\backslash i}}\frac{\mu ^{w}(B)}{|B|}\text{. } \end{aligned}$$

This is in fact the near-Boolean version of a well-known coalition formation game, with payoffs given by the Shapley value [28].
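The payoff formula can be evaluated directly from the Möbius inversion. The sketch below assumes the coalitional game w of the example in Sect. 5.2 (singleton worths 0, 2, 0, 2 and \(\mu ^w(A)=2\) on pairs, zero on larger subsets), written in closed form; helper names are illustrative, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

singleton = {1: 0, 2: 2, 3: 0, 4: 2}

def w(A):
    """Assumed game: singleton worths plus 2 for each pair contained in A."""
    return sum(singleton[i] for i in A) + len(A) * (len(A) - 1)

def subsets(A):
    """All subsets of A as frozensets."""
    items = sorted(A)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def mobius(A):
    """Moebius inversion mu^w(A) = sum_{B subseteq A} (-1)^{|A|-|B|} w(B)."""
    return sum((-1) ** (len(A) - len(B)) * w(B) for B in subsets(A))

def payoff(i, block):
    """pi_i = sum of mu^w(B)/|B| over subsets B of the block that contain i."""
    return sum(Fraction(mobius(B)) / len(B) for B in subsets(block) if i in B)

P = [frozenset({1, 2}), frozenset({3, 4})]
payoffs = {i: payoff(i, A) for A in P for i in A}
efficient = all(sum(payoffs[i] for i in A) == w(A) for A in P)
print(payoffs, efficient)
```

Within each block the payoffs add up to the worth of that block, hence over all players they add up to the global worth, as required by Definition 10.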

Definition 12

For a near-Boolean function F, any \(\mathbf q \in ex(\varDelta _N)\) is a local maximizer if for all \(i\in N\) and all \(q'_i\in ex(\varDelta _i)\) inequality \(F(\mathbf q )\ge F(q_i'|\mathbf q _{-i})\) holds.

If payoffs are given by \(\pi _i(\mathbf q )=\frac{\omega _iF(\mathbf q )}{\sum _{j\in N}\omega _j}\) with \(\omega _j>0\) for all \(j\in N\), then near-Boolean games are (pure) common interest potential games [29, 30], meaning that the set of equilibria of \((N,F,\pi )\) coincides with the set of local maximizers of F, and players’ preferences all agree on the set of strategy profiles.
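The coincidence between equilibria and local maximizers can be verified by brute force on a toy instance. The sketch below rests on several assumptions made only for illustration: three players, the hypothetical coalitional game \(w(A)=|A|^2\), hypothetical weights \(\omega =(1,2,3)\), and a global worth `F` computed by grouping players that chose the same coalition.

```python
from fractions import Fraction
from itertools import combinations, product

players = (1, 2, 3)
omega = {1: 1, 2: 2, 3: 3}            # hypothetical positive weights

def w(A):
    """Hypothetical coalitional game."""
    return len(A) ** 2

def strategies(i):
    """Pure strategies of player i: all coalitions containing i."""
    others = [j for j in players if j != i]
    return [frozenset((i,) + c) for k in range(len(others) + 1)
            for c in combinations(others, k)]

def F(profile):
    """Global worth: players choosing the same coalition form a block."""
    blocks = {}
    for i, A in zip(players, profile):
        blocks.setdefault(A, set()).add(i)
    return sum(w(b) for b in blocks.values())

def payoff(i, profile):
    """pi_i(q) = omega_i F(q) / sum_j omega_j."""
    return Fraction(omega[i] * F(profile), sum(omega.values()))

def deviations(profile, idx):
    """All profiles obtained when the player at position idx deviates."""
    for s in strategies(players[idx]):
        yield profile[:idx] + (s,) + profile[idx + 1:]

def is_local_maximizer(profile):
    return all(F(profile) >= F(d)
               for idx in range(len(players)) for d in deviations(profile, idx))

def is_equilibrium(profile):
    return all(payoff(players[idx], profile) >= payoff(players[idx], d)
               for idx in range(len(players)) for d in deviations(profile, idx))

profiles = list(product(*(strategies(i) for i in players)))
local_max = {q for q in profiles if is_local_maximizer(q)}
equilibria = {q for q in profiles if is_equilibrium(q)}
print(len(profiles), equilibria == local_max)
```

Since every \(\omega _i\) is positive, a unilateral deviation raises a player's payoff exactly when it raises F, which is why the two sets coincide in this instance.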

7 Conclusions

Near-Boolean optimization is conceived to tackle problems where the instance includes a set function w taking real values on a family of subsets of a finite set, and with the objective function assigning to any subfamily of pair-wise disjoint subsets the sum of their values quantified by w. Typically, such problems are set partitioning and set packing. As in pseudo-Boolean optimization [1,2,3], the proposed method employs the polynomial multilinear extension of set functions to turn the discrete setting where the given problems are originally formulated into a continuous one, while extremizers are found where each variable is at an extreme point of its domain.

A main difference between pseudo-Boolean functions and near-Boolean ones is that the former coincide, in fact, with set functions, while for the latter the given w may be replaced with a continuum of \(w'\ne w\), and the associated polynomials all have common degree but different coefficients. This feature only characterizes additively separable partition functions [19, 20], while for generic ones a novel MLE has been developed, with the corresponding derivatives and approximation issues (approached as in [17]) left for future work.

The near-Boolean setting also provides a novel type of coalition formation games, where the need to map strategy profiles into partitions of players by means of a mechanism is totally by-passed. In artificial intelligence, these games may model coalition structure generation in multiagent systems [31, 32].

Finally, near-Boolean optimization seems generally interesting for objective function-based clustering [33], and for graph clustering in particular [34]. Specifically, for a given weighted graph, edge weights and vertices’ weighted degrees may be used (in alternative ways) to obtain a quadratic MLE, i.e. a polynomial with degree 2. Then, optimization with respect to the resulting objective function provides a method for partitioning (i.e. clustering) the vertex set.