Efficient Counting of the Number of Independent Sets on Polygonal Trees

De Ita, Guillermo; Bello, Pedro; Contreras, Meliza; Catana-Salazar, Juan C.

doi:10.1007/978-3-319-39393-3_17

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9703)

Included in the following conference series:

Mexican Conference on Pattern Recognition

Abstract

We present a method to compute the number of independent sets for two basic graph patterns: simple cycles and trees. We show how to extend this initial method to process more complex graph topologies efficiently.

We consider polygonal array graphs that are a graphical representation of molecular compounds. We show how our method processes polygonal trees based on a 2-treewidth decomposition of the input graph.

A pair of macros is associated with each basic graph pattern in this decomposition, allowing us to represent and perform a series of repetitive operations while the same graph pattern is found. This results in a linear-time algorithm for counting the number of independent sets of a polygonal tree.

1 Introduction

Counting problems are not only mathematically interesting; they also arise in many applications. Among hard counting problems, the computation of the number of independent sets of a graph G, denoted NI(G), has been key in determining the frontier between efficient and intractable counting.

The recognition of structural patterns in graphs has been helpful in designing efficient algorithms for counting independent sets. For example, the linear-time algorithm of Okamoto et al. [2] computes NI(G) when G is a chordal graph; there, the decomposition of G into its clique tree makes it possible to apply dynamic programming efficiently. Another case is Zhao's algorithm for computing NI(G) on regular graphs [9]. However, for some kinds of graph patterns no efficient method for computing NI(G) is currently known.

Polygonal array graphs have been widely investigated and are a relevant area of interest in mathematical chemistry, because they have been used to study intrinsic properties of molecular graphs. It is also important to recognize substructures of those compounds and to extract information from the graph model by elucidating their structures and properties [4]. Several works analyze extremal values of the number of independent sets (known in mathematical chemistry as the Merrifield-Simmons index) on hexagonal chain graphs, but none of them presents the methods applied to compute that index [4, 5].

Macros (common structural patterns) are a basic tool in the planning community and have been widely used in AI (Artificial Intelligence) to represent accumulated series of basic operations within a plan [6]. A main property of macros is that they can represent accumulated preconditions and effects, which makes them indistinguishable from individual operators. This allows a series of repetitive operations to be performed efficiently while the same graph pattern is found, as occurs in polygonal tree graphs.

Decompositions of graphs such as clique separators, treewidth decompositions and clique decompositions are often used to design efficient graph algorithms. There are even general results stating that a variety of NP-complete graph problems can be solved in linear time on graphs of bounded treewidth and of bounded clique-width, respectively [7]. To obtain an efficient algorithm with this approach, the input graphs have to be restricted to a class that admits a treewidth decomposition of small width.

We show here that a polygonal tree G has a 2-treewidth decomposition that allows the application of macros for computing NI(G) efficiently. Our algorithm could be adapted as a computational tool for mathematical chemistry researchers, to help in the analysis of intrinsic properties of those molecular graphs. In fact, it can also be adapted to compute other such properties; for example, to count the number of matchings of a chemical compound, known as the Hosoya index.

2 Notation

Let $G=(V,E)$ be an undirected graph with vertex set (or node set) V and set of edges E. The neighborhood for $x \in V$ is $N(x) = \{y \in V : \{x,y\} \in E\}$, and its closed neighborhood is $N(x) \cup \{x\}$ which is denoted by N[x]. We denote the cardinality of a set A, by |A|. The degree of a vertex x, denoted by $\delta (x)$, is |N(x)|, and the degree of G is $\varDelta (G) = \max \{ \delta (x) : x \in V\}$. The size of the neighborhood of x, $\delta (N(x))$, is $\delta (N(x))=\sum _{y \in N(x)}\delta (y)$.

A path from v to w is a sequence of edges: $v_0 v_1, v_1 v_2, \ldots ,v_{n-1} v_n$ such that $v=v_0$ and $v_n=w$ and $v_k$ is adjacent to $v_{k+1}$, for $0 \le k < n$. The length of the path is n. A simple path is a path where $v_0, v_1, \ldots , v_{n-1}, v_n$ are all distinct. A cycle is a non-empty path such that the first and last vertices are identical, and a simple cycle is a cycle in which no vertex is repeated, except that the first and last vertices are identical. A graph G is acyclic if it has no cycles. $P_n$, $C_n$ and $K_n$ denote, respectively, a path graph, a simple cycle and the complete graph on n vertices.

Given a graph $G=(V,E)$, $G'=(V',E')$ is a subgraph of G if $V' \subseteq V$ and $E'$ contains edges $\{v,w\} \in E$ such that $v \in V'$ and $w \in V'$. If $E'$ contains every edge $\{v,w\} \in E$ where $v \in V'$ and $w \in V'$, then $G'$ is called the induced subgraph of G on $V'$. A connected component of G is a maximal connected induced subgraph of G, that is, a connected component is not a proper subgraph of any other connected subgraph of G. If an acyclic graph is also connected, then it is called a free tree.

Given a graph $G=(V,E)$, $S \subseteq V$ is an independent set in G if for every two vertices $v_1$, $v_2$ in S, $\{v_1,v_2\} \notin E$. Let I(G) denote the set of all independent sets of G. An independent set $S \in I(G)$ is maximal if it is not a subset of any larger independent set, and it is maximum if it has the largest size among all independent sets in I(G). Finding a maximum independent set has received much attention, since its decision version is an NP-complete problem.

The corresponding counting problem on independent sets, denoted by NI(G), consists of counting the number of independent sets of a graph G. NI(G) is a #P-complete problem for graphs G where $\varDelta (G) \ge 3$.

3 Counting Independent Sets on Basic Topology Graphs

In this section we present two algorithms to compute NI(G) for two basic graph patterns: cycles and trees.

Since $NI(G) = \prod _{i=1}^k NI(G_i)$ where $G_i, i=1,\ldots ,k$, are the connected components of G [3], the total time complexity for computing NI(G), denoted as T(NI(G)), is given by the maximum rule as $T(NI(G)) = \max \{ T(NI(G_i)) : G_i \text{ is a connected component of } G \}$. Thus, a first helpful decomposition of the graph is done via its connected components, and from here on, we consider as an input graph only one connected component.

It turns out that the combinatorial meaning of the Fibonacci numbers is closely related to the number of independent sets of some basic graph patterns. In fact, it is shown in [8] that $F_{n+2}$ is equal to the number of subsets (including the empty set) of $\{1,2,...,n\}$ such that no two elements are adjacent, i.e. there are no two consecutive integers in any subset. If we think of $\{1,2,...,n\}$ as the vertex set of a path graph, say $P_n$, where there exists an edge $e_i=\{i,i+1\}, \; i=1, \ldots ,n-1$, for each pair of sequential nodes, then $F_{n+2}$ is equal to $NI(P_n)$.

On the other hand, if we consider that the n-th element in $\{1,2,...,n\}$ is adjacent to the first vertex, then $P_n$ turns into a simple cycle $C_n$, and the number of subsets (including the empty set) with no two adjacent elements is characterized by the n-th Lucas number. Therefore, $L_n$ is equal to $NI(C_n)$ [8].
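Both closed forms can be checked on small instances by exhaustive enumeration. The following is a minimal sketch, not part of the original method: the function names are hypothetical, the indexing assumes $F_0 = 0$, $F_1 = F_2 = 1$ and $L_1 = 1$, $L_2 = 3$, and the brute-force counter is exponential, so it is only meant for small n.

```python
def count_independent_sets(n, edges):
    """Brute-force NI(G) for a graph on vertices 0..n-1 (small n only)."""
    count = 0
    for mask in range(1 << n):
        # mask encodes a vertex subset; reject it if an edge has both ends inside
        if all(not (((mask >> u) & 1) and ((mask >> v) & 1)) for u, v in edges):
            count += 1
    return count


def fibonacci(n):
    """Fibonacci numbers with F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def lucas(n):
    """Lucas numbers with L_0 = 2, L_1 = 1, L_2 = 3."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def path_edges(n):
    return [(i, i + 1) for i in range(n - 1)]


def cycle_edges(n):
    return path_edges(n) + [(n - 1, 0)]


for n in range(3, 11):
    assert count_independent_sets(n, path_edges(n)) == fibonacci(n + 2)
    assert count_independent_sets(n, cycle_edges(n)) == lucas(n)
```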

3.1 Cycles

Let $C_n=(V,E)$ be a simple cycle graph, and $|V|=n=|E|=m$, i.e. every node in V has degree two.

Theorem 1

The number of independent sets in $C_n$ is equal to $F_{n+1} + F_{n} - F_{n-2}$.

Proof

We decompose the cycle $C_n$ as: $G \cup \{c_m\}$, where $G = (V,E')$, $E'=\{c_1, ..., c_{m-1}\}$. G is a path of n nodes, and $c_m = \{v_m, v_1\}$ is called the back edge of the cycle.

We build the family $\mathcal {F}_i=\{G_{i}\}, i=1,\ldots ,n$ where $G_i=(V_i,E_i)$ is the induced graph of G formed by just the first i nodes of V.

We associate to each node $v_{i} \in V$ a pair $(\alpha _{i},\beta _{i})$, where $\alpha _{i}$ represents the number of sets in $I(G_{i})$ where $v_{i}$ does not appear, while $\beta _i$ conveys the number of sets in $I(G_{i})$ where $v_{i}$ appears, thus $NI(G_{i}) = \alpha _{i} + \beta _{i}$. The first pair $(\alpha _{1}, \beta _{1})$ is (1, 1) since for the induced subgraph $G_{1} =\{v_{1}\}$, $I(G_{1})=\{\emptyset ,\{v_{1}\}\}$. It is not hard to see that a new pair $(\alpha _{i+1},\beta _{i+1})$ is built from the previous one by a Fibonacci sequence, as shown in Eq. 1.

$$\begin{aligned} ({\alpha _{i+1}},{\beta _{i+1}}) = ( {\alpha _{i}} + {\beta _{i}},{\alpha _{i}}) \end{aligned}$$

(1)

Note that every independent set in G is an independent set in $C_n$, except for the sets $S \in I(G)$ where $v_1 \in S$ and $v_n \in S$. In order to eliminate those conflicting sets, we use two computing threads of $\alpha\beta$-pairs to compute $NI(C_n)$: one for computing NI(G), and the other, $(\alpha '_i, \beta '_i)$, for computing $|\{S \in I(G): v_1 \in S \wedge v_n \in S\}|$.

The second thread starts with $(\alpha '_1,\beta '_1) = (0,1)$, that is, $(0, \beta _1)$, so that only the independent sets of I(G) in which $v_1$ appears are considered.

Expressing the computation of $NI(C_n)$ in terms of Fibonacci numbers (with $F_0 = 0$ and $F_1 = F_2 = 1$), the main thread gives $(\alpha _1,\beta _1) = (1,1) = (F_2,F_1), \ldots , (\alpha _n,\beta _n) = (F_{n+1},F_n)$, and the second thread gives $(\alpha '_1,\beta '_1) = (0,1)= (F_0,F_1) \rightarrow (\alpha '_2,\beta '_2)=(1,0)= (F_1,F_0) \rightarrow (\alpha '_3,\beta '_3)= (1,1)= (F_2,F_1) \rightarrow \cdots \rightarrow (\alpha '_n,\beta '_n)= (F_{n-1},F_{n-2})$. Only the sets containing $v_n$ are in conflict, so the value kept from the final pair is $(0, F_{n-2})$, and $|\{S \in I(G) : v_1 \in S \wedge v_n \in S \}| = 0 + \beta '_n = F_{n-2}$. Then, the last pair associated with the computation of $NI(C_n)$ is $(F_{n+1}, F_{n} - F_{n-2}) = (F_{n+1}, F_{n-1})$, and $NI(C_n) = F_{n+1} + F_{n-1}$, a well-known identity for the n-th Lucas number. $\square $
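The two-thread computation used in the proof can be sketched in a few lines. This is an illustration based on the proof only, since the listing of Algorithm 1 is not reproduced in the text; the function name `ni_cycle` is hypothetical.

```python
def ni_cycle(n):
    """NI(C_n) via two threads of (alpha, beta) pairs, following the proof of Theorem 1."""
    if n < 3:
        raise ValueError("a simple cycle needs at least 3 nodes")
    alpha, beta = 1, 1        # main thread: all independent sets of the path G
    alpha_c, beta_c = 0, 1    # second thread: only the sets that contain v_1
    for _ in range(n - 1):    # one Fibonacci step (Eq. 1) per path edge
        alpha, beta = alpha + beta, alpha
        alpha_c, beta_c = alpha_c + beta_c, alpha_c
    # beta_c counts the sets containing both v_1 and v_n, which the back edge forbids
    return alpha + (beta - beta_c)


print([ni_cycle(n) for n in range(3, 9)])
```

For n = 3, ..., 8 the values are the Lucas numbers 4, 7, 11, 18, 29, 47.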

3.2 Trees

Let $T=(V,E)$ be a tree rooted at a vertex $v_r \in V$. We denote by $(\alpha _{v}, \beta _{v})$ the pair associated with the node v ($v \in V(T)$). We compute NI(T) while traversing T in post-order.

This algorithm returns the number of independent sets of T in time $O(n+m)$, which is the time needed for traversing T in post-order.
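The algorithm listing is not reproduced in the text, so the following is a minimal sketch of a post-order computation under an assumed recurrence: a leaf gets the pair (1, 1), and an internal node v gets $\alpha_v = \prod_c (\alpha_c + \beta_c)$ and $\beta_v = \prod_c \alpha_c$ over its children c. This is the standard recurrence for trees and may differ in detail from the authors' listing; the function name and the adjacency-dictionary input are hypothetical.

```python
def ni_tree(adj, root):
    """Count independent sets of a tree given as {vertex: [neighbors]}."""
    parent = {root: None}
    order = []                      # pre-order; reversed, children precede parents
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                stack.append(w)
    alpha = {v: 1 for v in adj}     # sets of the subtree of v that exclude v
    beta = {v: 1 for v in adj}      # sets of the subtree of v that include v
    for v in reversed(order):
        p = parent[v]
        if p is not None:
            alpha[p] *= alpha[v] + beta[v]   # p excluded: child is unconstrained
            beta[p] *= alpha[v]              # p included: child must be excluded
    return alpha[root] + beta[root]


path4 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(ni_tree(path4, 1), ni_tree(star, 0))
```

The path on four nodes gives $F_6 = 8$, and the star with three leaves gives $2^3 + 1 = 9$.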

We call Linear_NI the algorithm that computes NI(G) when G is a simple cycle or a tree. Linear_NI will be applied to process any acyclic graph or simple cycle that we find as part of a more complex graph.

4 Polygonal System Graphs

Let $G=(V,E)$ be a molecular graph. Denote by n(G, k) the number of ways in which k mutually independent vertices can be selected in G. By definition, $n(G, 0) = 1$ for all graphs, and $n(G, 1) = |V(G)|$. Then $\sigma (G) = \sum _{k\ge 0} n(G,k)$ will be the Merrifield-Simmons index of G, that is, exactly the number of independent sets of G. Merrifield and Simmons showed the correlation between NI(G) and boiling points on polygonal chain graphs representing chemical molecules [4, 5]. The Merrifield-Simmons index is a typical example of a graph invariant used in mathematical chemistry for quantifying relevant details of molecular structure.
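As a small worked instance of this definition, the values n(G, k) can be enumerated directly for a single hexagon. The sketch below is illustrative only; the function name is hypothetical and the enumeration is exponential in the number of vertices.

```python
from itertools import combinations


def independent_k_counts(n, edges):
    """Return [n(G, 0), ..., n(G, n)] for a graph on vertices 0..n-1."""
    edge_set = {frozenset(e) for e in edges}
    counts = []
    for k in range(n + 1):
        n_k = sum(
            1
            for subset in combinations(range(n), k)
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2))
        )
        counts.append(n_k)
    return counts


hexagon = [(i, (i + 1) % 6) for i in range(6)]
counts = independent_k_counts(6, hexagon)
print(counts, sum(counts))
```

For the hexagon the counts are 1, 6, 9, 2, 0, 0, 0, so $\sigma (C_6) = 18$, which is the Lucas number $L_6$.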

A polygonal chain is a graph $P_{k,t}$ obtained by identifying a finite number t of congruent regular polygons, such that each basic polygon, except the first and the last one, is adjacent to exactly two basic polygons. When each polygon in $P_{k,t}$ has the same number k of nodes, $P_{k,t}$ is a linear array of t k-gons.

The way that two adjacent k-gons are joined, via a common vertex or via a common edge, defines different classes of polygonal chemical compounds. A special class of polygonal chains is the class of hexagonal chains, chains formed by n 6-gons. Hexagonal systems play an important role in mathematical chemistry as natural representations of catacondensed benzenoid hydrocarbons.

Let $T_P = H_1H_2 \cdots H_n$ be a hexagonal chain with n hexagons, where each $H_i$ and $H_{i+1}$ have a common edge for each $i = 1, 2, \ldots , n - 1$. A hexagonal chain with at least two hexagons has two end-hexagons: $H_1$ and $H_{n}$, while $H_2, \ldots , H_{n-1}$ are the internal hexagons of the chain. If the array of hexagons follows the structure of a tree where instead of nodes we have hexagons, and any two consecutive hexagons share exactly one edge, then we call that graph a hexagonal tree (see Fig. 1). When the nodes in the tree are represented by any k-gon, we call that graph a polygonal tree.

The propensity of carbon atoms to form compounds made of hexagonal arrays fused along their edges makes the study of the chemical properties of benzenoid hydrocarbons particularly relevant. This study has been paralleled by the study of the corresponding graphs, the so-called polygonal tree graphs. Those graphs have been widely investigated and represent a relevant area of interest in mathematical chemistry, since they are used for quantifying relevant details of the molecular structure of the benzenoid hydrocarbons [1, 4]. We now show how to count the number of independent sets of polygonal trees based on the use of macros.

4.1 Counting Independent Sets on Polygonal Trees

A simple cycle graph of length n, $C_n$, represents a polygon of n sides. Algorithm 1 allows us to compute $NI(C_n)$ for any polygon $C_n$. Moreover, if we apply Algorithm 1 using symbolic variables $(\alpha , \beta )$ as the pair associated with each node in the polygon, then we can compute the Merrifield-Simmons index for any chain of polygons while visiting each edge of the array once.

Let us show how to use symbolic variables during the computation of $NI(P_4)$, where $P_4$ is a polygon of 4 nodes. The two computing threads and their associated pairs are expressed by the symbolic variables $\alpha $ and $\beta $, as shown in Eq. 2.

(2)

Thus we obtain Eq. 3:

$$\begin{aligned} NI(P_4) = 3 \alpha + 2 \beta + 2 \alpha = 5 \alpha + 2 \beta \end{aligned}$$

(3)
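The two threads behind this computation can be written out step by step. What follows is a reconstruction rather than the original display: it assumes that both threads follow the recurrence of Eq. 1, that the main thread starts at $(\alpha , \beta )$, and that the secondary thread starts at $(0, \beta )$.

$$\begin{aligned} \text{main thread: } & (\alpha , \beta ) \rightarrow (\alpha + \beta , \alpha ) \rightarrow (2\alpha + \beta , \alpha + \beta ) \rightarrow (3\alpha + 2\beta , 2\alpha + \beta ) \\ \text{secondary thread: } & (0, \beta ) \rightarrow (\beta , 0) \rightarrow (\beta , \beta ) \rightarrow (2\beta , \beta ) \\ \text{last pair: } & (3\alpha + 2\beta , (2\alpha + \beta ) - \beta ) = (3\alpha + 2\beta , 2\alpha ) \end{aligned}$$

Adding both components of the last pair gives $5\alpha + 2\beta $, which matches the general last pair $(F_k \alpha + F_{k-1} \beta , F_{k-1} \alpha )$ for $k = 4$. With $\alpha = \beta = 1$ the value is 7, the Lucas number $L_4$.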

In fact, applying Algorithm 1 with symbolic variables for computing $NI(P_k)$, where $P_k$ is a polygon of k nodes, we obtain as last pair $(F_k \alpha + F_{k-1} \beta , F_{k-1} \alpha )$, which is a system of linear equations, where $F_k$ is the k-th Fibonacci number. Thus, using symbolic variables $\alpha $ and $\beta $, we obtain Eq. 4.

$$\begin{aligned} NI(P_k) = (F_k + F_{k-1}) \alpha + F_{k-1} \beta \end{aligned}$$

(4)

We can codify the last pair $(F_k \alpha + F_{k-1} \beta , F_{k-1} \alpha )$ obtained from the computation of $NI(P_k)$ as a macro, that in terms of the initial symbolic variables $\alpha $, $\beta $ can be written as shown in Eq. 5.

$$\begin{aligned} \alpha ' = F_k \alpha + F_{k-1} \beta , \;\beta ' = F_{k-1} \alpha \end{aligned}$$

(5)
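The formation of this macro can be reproduced by running both threads on linear forms instead of numbers. The sketch below is an illustration under assumptions: each linear form $x\alpha + y\beta $ is stored as the coefficient pair (x, y), the indexing is $F_1 = F_2 = 1$, and the names `polygon_macro` and `fibonacci` are hypothetical.

```python
def fibonacci(n):
    """Fibonacci numbers with F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def polygon_macro(k):
    """Return ((a, b), (c, d)) with alpha' = a*alpha + b*beta and beta' = c*alpha + d*beta."""
    def add(p, q):
        return (p[0] + q[0], p[1] + q[1])

    def sub(p, q):
        return (p[0] - q[0], p[1] - q[1])

    main = ((1, 0), (0, 1))      # symbolic start pair (alpha, beta)
    second = ((0, 0), (0, 1))    # symbolic start pair (0, beta)
    for _ in range(k - 1):       # Fibonacci step of Eq. 1 on both threads
        main = (add(main[0], main[1]), main[0])
        second = (add(second[0], second[1]), second[0])
    # remove the sets containing both end-points of the back edge
    return (main[0], sub(main[1], second[1]))


for k in range(3, 13):
    expected = ((fibonacci(k), fibonacci(k - 1)), (fibonacci(k - 1), 0))
    assert polygon_macro(k) == expected
```

For a hexagon, `polygon_macro(6)` gives ((8, 5), (5, 0)), that is, $\alpha ' = 8\alpha + 5\beta $ and $\beta ' = 5\alpha $.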

Forming this pair of linear equations is called the formation of the macro. Each symbolic variable in the macro $(\alpha ', \beta ')$ represents a linear equation on the two initial symbolic variables $\alpha $ and $\beta $. This macro is associated with a common edge between any two polygons.

For example, let $P_1 \circ P_2$ be two contiguous hexagons with common edge $\{x,y\}$. Algorithm 1 computes $NI(P_1)$ beginning at the common node x, and at the end of the algorithm the pair $(F_k \alpha + F_{k-1} \beta , F_{k-1} \alpha )$ is obtained after visiting node y. This pair of linear equations is associated with the common edge $\{x,y\}$. The initial values of $\alpha $ and $\beta $ do not matter: they can be substituted later by a current pair of values in order to obtain a current final pair of linear equations, in which the value $NI(P_1)$ has already been taken into account as part of the accumulated operations.

The expansion of the macro $(\alpha ', \beta ')$ consists of taking the linear equations that it represents and substituting the symbolic variables $\alpha $ and $\beta $ by their current values, represented by new variables. This expansion is well defined, since no macro appears in its own expansion.

When the computation of $NI(P_2)$ is started by Algorithm 1, it begins at the node $v_1$ of $P_2$. When the node x is visited, a current pair of symbolic variables $(\chi , \delta )$ has been computed in the main thread $L_P$ and another pair $(\varphi , \gamma )$ has been obtained in the secondary thread $L_{P_2}$. Visiting and processing the common edge $\{x,y\}$ for $P_2$ means substituting the variables $\alpha $ and $\beta $ in the macro $(F_k \alpha + F_{k-1} \beta , F_{k-1} \alpha )$ by $\chi $ and $\delta $ in the main thread $L_P$, and by $\varphi $ and $\gamma $ in the secondary thread $L_{P_2}$. Algorithm 1 continues with those new pairs of linear equations in both threads, updating them by the Fibonacci recurrence until it arrives at the last edge of $P_2$, of which $v_1$ is an end-point.

At the end of processing a new polygon, we again obtain a pair of linear equations $(a \cdot \chi + b \cdot \delta , c \cdot \chi + d \cdot \delta )$ as a result for $NI(P_1 \circ P_2)$. Because the last pair obtained while processing the polygons is expressed with symbolic variables, it can in turn be substituted by the current variables $\alpha $ and $\beta $, which updates the macro associated with the common edge. In this way we can process a polygonal chain with any number of polygons in time linear in the number of edges of the array.
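The expansion step can be illustrated for two polygons that share one edge. The sketch below is one reading of the description above, not the authors' listing: it assumes that, while the second polygon is traversed, the macro of the first polygon replaces the Fibonacci step exactly on the common edge, and that every other edge uses Eq. 1. The function names and the `shared` parameter (position of the common edge along the second polygon) are hypothetical.

```python
def fibonacci(n):
    """Fibonacci numbers with F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def expand_macro(k, pair):
    """Substitute the current pair into the macro of a k-gon (Eq. 5)."""
    alpha, beta = pair
    return (fibonacci(k) * alpha + fibonacci(k - 1) * beta,
            fibonacci(k - 1) * alpha)


def ni_two_fused_polygons(k1, k2, shared=1):
    """NI of a k1-gon and a k2-gon with one common edge.

    The common edge joins the nodes in positions `shared` and `shared + 1`
    of the second polygon, with 1 <= shared < k2.
    """
    if not 1 <= shared < k2:
        raise ValueError("shared must satisfy 1 <= shared < k2")
    main, second = (1, 1), (0, 1)
    for edge in range(1, k2):
        if edge == shared:
            # common edge: expand the macro formed on the first polygon
            main = expand_macro(k1, main)
            second = expand_macro(k1, second)
        else:
            # ordinary edge: Fibonacci step of Eq. 1
            main = (main[0] + main[1], main[0])
            second = (second[0] + second[1], second[0])
    # close the second polygon with its back edge
    return main[0] + main[1] - second[1]


print(ni_two_fused_polygons(6, 6), ni_two_fused_polygons(4, 4))
```

Under these assumptions, two hexagons fused along an edge give 114 and two squares give 17, and the result does not depend on where the common edge sits along the second polygon. Both values agree with a direct case analysis on the two end-points of the common edge, which yields $F_{k_1} F_{k_2} + 2 F_{k_1 - 1} F_{k_2 - 1}$.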

5 A 2-Treewidth Decomposition for Polygonal Trees

Treewidth is one of the most basic parameters in graph algorithms. There is a well established theory on the design of polynomial (or even linear) time algorithms for many intractable problems when the input is restricted to graphs of bounded treewidth. What is more important, there are many problems on graphs with n vertices and treewidth at most k that can be solved in time $O(c^k \cdot n^{O(1)})$, where c is some problem dependent constant [7].

For example, a maximum independent set (a MIS) of a graph can be found in time $O(2^k \cdot n)$ given a tree decomposition of width at most k. So, a quite natural approach to solve the NI(G) problem would be to find a tree decomposition $T_G$ of G and to determine how to join the partial results on the nodes of $T_G$. However, for a general graph G, computing its treewidth is NP-hard.

The treewidth decomposition of a polygonal tree graph $T_P$ is based on the well-known 2-treewidth decomposition of a simple cycle (a polygon). Since any two adjacent polygons have just one common edge, it is enough to join the 2-treewidth decompositions of the two contiguous polygons at the nodes containing the common edge, as shown in Fig. 1.

Let $T_G=(T,X)$ be the 2-tree decomposition of $T_P=(V,E)$, where $T=(I,F)$ is a tree, I is an index set and X is a function, $X : I \rightarrow 2^{V}$, satisfying the tree constraints of any k-tree decomposition [7]. We refer to $x \in V$ as a vertex of $T_P$ and to each $X(i)$, $i \in I$, as a node of T. A vertex x is associated with a node $i \in I$, or vice versa, whenever $x \in X(i)$. Our treewidth decomposition of $T_P$ keeps the structure of a tree, and then we can apply Algorithms 2 and 3 for computing $NI(T_P)$.
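One standard way to obtain a width-2 decomposition of a single polygon is to place the bags $\{v_1, v_i, v_{i+1}\}$, $i = 2, \ldots , n-1$, along a path. The sketch below builds these bags and checks the usual tree-decomposition conditions. It is an illustration only: the arrangement in Fig. 1 may differ, and the function names are hypothetical.

```python
def cycle_tree_decomposition(n):
    """Bags {1, i, i+1}, i = 2..n-1, arranged as a path, for the cycle on vertices 1..n."""
    if n < 3:
        raise ValueError("a simple cycle needs at least 3 nodes")
    return [frozenset({1, i, i + 1}) for i in range(2, n)]


def is_valid_path_decomposition(n, bags):
    """Check vertex coverage, edge coverage and connectivity of the bags of each vertex."""
    edges = [(i, i + 1) for i in range(1, n)] + [(n, 1)]
    covers_vertices = set().union(*bags) == set(range(1, n + 1))
    covers_edges = all(any({u, v} <= bag for bag in bags) for u, v in edges)

    def contiguous(v):
        # on a path of bags, connectivity means the bags holding v are consecutive
        idx = [i for i, bag in enumerate(bags) if v in bag]
        return bool(idx) and idx == list(range(idx[0], idx[-1] + 1))

    connected = all(contiguous(v) for v in range(1, n + 1))
    return covers_vertices and covers_edges and connected


bags = cycle_tree_decomposition(6)
width = max(len(bag) for bag in bags) - 1
print(len(bags), width, is_valid_path_decomposition(6, bags))
```

For a hexagon this yields four bags of three vertices each, so the width is 2.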

5.1 Processing Polygonal Trees

In Figs. 1 and 2, we show the 2-treewidth decomposition of a polygonal tree graph, as well as the application of macros for counting independent sets on the nodes of the tree. Notice that every common edge is visited twice: on the first visit a macro is formed, and on the second visit the macro is expanded. This provides a linear-time algorithm that traverses every node and edge of the treewidth decomposition.

6 Conclusions

We have proposed a 2-treewidth decomposition for any polygonal tree graph $T_P$, in which every common edge between two adjacent polygons appears in exactly two consecutive nodes of the decomposition. This structure allows the efficient application of macros for solving counting problems on $T_P$.

We have presented a novel linear-time algorithm for counting independent sets on $T_P$. Our method exploits macros to perform the repetitive operations that appear on the basic common patterns (polygons) forming $T_P$. Our algorithm can be adapted to compute other intrinsic properties of polygonal tree graphs, with a direct impact on the time complexity of the algorithms for those problems.

References

1. Došlić, T., Måløy, F.: Chain hexagonal cacti: matchings and independent sets. Discrete Math. 310, 1676–1690 (2010)
2. Okamoto, Y., Uno, T., Uehara, R.: Linear-time counting algorithms for independent sets in chordal graphs. In: Kratsch, D. (ed.) WG 2005. LNCS, vol. 3787, pp. 433–444. Springer, Heidelberg (2005)
3. Roth, D.: On the hardness of approximate reasoning. Artif. Intell. 82, 273–302 (1996)
4. Wagner, S., Gutman, I.: Maxima and minima of the Hosoya index and the Merrifield-Simmons index. Acta Applicandae Math. 112(3), 323–346 (2010)
5. Deng, H.: Catacondensed benzenoids and phenylenes with the extremal third-order Randic Index. Comm. Math. Comp. Chem. 64, 471–496 (2010)
6. Bäckström, C., Jonsson, A., Jonsson, P.: Automaton plans. J. Artif. Intell. Res. 51, 255–291 (2014)
7. Fomin, F.V., Gaspers, S., Saurabh, S., Stepanov, A.A.: On two techniques of combining branching and treewidth. Algorithmica 54, 181–207 (2009)
8. Prodinger, H., Tichy, R.F.: Fibonacci numbers of graphs. Fibonacci Q. 20(1), 16–21 (1982)
9. Zhao, Y.: The number of independent sets in a regular graph. Comb. Probab. Comput. 19(02), 315–320 (2010)

Author information

Authors and Affiliations

Facultad de Ciencias de la Computación, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico
Guillermo De Ita, Pedro Bello & Meliza Contreras
Posgrado en Ciencia e Ingeniería de la Computación, Universidad Nacional Autónoma de México, Mexico City, Mexico
Juan C. Catana-Salazar

Corresponding author

Correspondence to Guillermo De Ita.

Editor information

Editors and Affiliations

INAOE, Sta. Maria Tonantzintla, Mexico
José Francisco Martínez-Trinidad
INAOE, Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco-Ochoa
University of Guanajuato, Salamanca, Mexico
Victor Ayala Ramirez
Autonomous University of Puebla, Puebla, Mexico
José Arturo Olvera-López
University of Münster, Münster, Nordrhein-Westfalen, Germany
Xiaoyi Jiang

Cite this paper

De Ita, G., Bello, P., Contreras, M., Catana-Salazar, J.C. (2016). Efficient Counting of the Number of Independent Sets on Polygonal Trees. In: Martínez-Trinidad, J., Carrasco-Ochoa, J., Ayala Ramirez, V., Olvera-López, J., Jiang, X. (eds) Pattern Recognition. MCPR 2016. Lecture Notes in Computer Science(), vol 9703. Springer, Cham. https://doi.org/10.1007/978-3-319-39393-3_17

DOI: https://doi.org/10.1007/978-3-319-39393-3_17
Published: 21 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39392-6
Online ISBN: 978-3-319-39393-3
eBook Packages: Computer Science, Computer Science (R0)
