1 Introduction

Search is a problem-solving mechanism in AI, and the choice of search procedure prescribes the order in which the nodes of a problem are generated and examined. Blind search techniques look for a goal without using any information about the problem domain. Heuristic search techniques, on the other hand, use partial information about the problem domain to guide the search from a start node towards a goal node. In either case, the algorithm terminates when a solution is found or when the search space is exhausted without finding one.

Each node in a search tree is a state of a search problem; the state space [12, 13] is a set of states together with operators that map states to states. Search problems can be represented with a graph \( G(N,V) \), where N is a set of nodes such that \( N = \{ 1, 2, 3, 4, \ldots, n\} \), and V is a set of edges connecting nodes such that \( V = \{ a_{1} , a_{2} , a_{3} , \ldots , a_{m} \} \). An edge is \( a_{t} = ( i, j ) \in V \). A path P is a sequence of nodes in which each consecutive pair is connected by an edge, e.g. \( P = ( n_{1} , n_{2} , n_{3} , \ldots , n_{k} ) \) [12]. The cost of travel from node i to node j is \( c(i, j) \). The purpose of search algorithms is to find a desirable path that leads from a start node to a goal node by applying suitable operators to each state of a search tree.
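The graph notation above can be illustrated with a minimal sketch. The toy graph, its costs, and the helper name `path_cost` are hypothetical and only serve to show how \( c(i, j) \) accumulates along a path.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

# Hypothetical toy graph: each key is an edge (i, j), each value is c(i, j).
COSTS: Dict[Edge, float] = {
    (1, 2): 1.0,
    (2, 3): 2.5,
    (1, 3): 5.0,
    (3, 4): 1.0,
}


def path_cost(path: List[int], costs: Dict[Edge, float]) -> float:
    """Sum c(i, j) over consecutive node pairs of the path.

    Raises KeyError if two consecutive nodes are not joined by an edge.
    """
    return sum(costs[(i, j)] for i, j in zip(path, path[1:]))


if __name__ == "__main__":
    print(path_cost([1, 2, 3, 4], COSTS))  # path through node 2
    print(path_cost([1, 3, 4], COSTS))     # direct edge from 1 to 3
```

In this toy graph the longer path through node 2 is the cheaper one, which is the kind of "desirable path" a cost-aware search is meant to find.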

There are many search methods available in the literature, including newer methods by Kose 2018 [1], Sturtevant et al. 2018 [2], and Chen et al. 2017 [3]. For older search methods, see any AI textbook [12, 13]. For the purpose of this research, the following sub-sections briefly describe the Breadth First Search (BFS) and A* search algorithms. After that, we introduce a new approach to search algorithms, which we refer to as Indexed Search. Finally, we compare implementations of the Fifteen Puzzle with BFS and A* against implementations with the newer techniques, Indexed Breadth First Search (IBFS) and Indexed A* Search (IA*).

1.1 Breadth First Search (BFS)

The most common form of Breadth First Search (BFS) maintains two lists, called the Open List and the Closed List. The Open List stores the frontier of the search tree, and the Closed List stores the nodes that have already been explored. There are two reasons to maintain the Closed List. First, it prevents duplication: while the search proceeds, cycles may occur, and most search algorithms employ a Closed List to prevent cycles and duplicate nodes. Secondly, the Closed List is used to build the solution path after a goal node has been reached. Because BFS explores nodes in the order of their distance from the root, it generates nodes level by level from top to bottom until a solution has been found. As such, all the nodes at level j are expanded before the nodes at level j + 1 are expanded. BFS is guaranteed to find a solution if any solution is available [11]. When all moves have the same cost, it is also guaranteed to find a shortest path from the root to a goal. See any AI textbook for further information about BFS [4, 12, 13].
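As an illustration of the two-list scheme described above, here is a minimal sketch of BFS with an Open List and a Closed List. The function name `bfs`, the `successors` callback, and the toy graph are assumptions made for this example; the Closed List is stored as a dictionary from explored state to parent so that it can serve both purposes named in the text.

```python
from collections import deque
from typing import Callable, Dict, Hashable, Iterable, List, Optional

State = Hashable


def bfs(start: State, goal: State,
        successors: Callable[[State], Iterable[State]]) -> Optional[List[State]]:
    """Breadth First Search with an Open List and a Closed List."""
    open_list = deque([start])                       # frontier, FIFO order
    parent: Dict[State, Optional[State]] = {start: None}  # every generated state
    closed: Dict[State, Optional[State]] = {}        # explored state -> parent
    while open_list:
        node = open_list.popleft()
        closed[node] = parent[node]
        if node == goal:
            # Second purpose of the Closed List: rebuild the solution path.
            path = []
            while node is not None:
                path.append(node)
                node = closed[node]
            return path[::-1]
        for child in successors(node):
            # First purpose: skip states already in the Open or Closed List.
            if child not in parent:
                parent[child] = node
                open_list.append(child)
    return None


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}
    print(bfs("A", "D", lambda s: graph.get(s, [])))
```

The edge from D back to A forms a cycle in the toy graph; the membership test on generated states is what keeps the search from revisiting A.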

1.2 A* Search Algorithm

The A* search algorithm is a well-known form of Best First Search. A* minimizes the total estimated solution cost by using an evaluation function f(n):

$$ f(n)\, = \,g(n)\, + \,h(n) $$
(1)
  • g(n) is the actual cost of reaching a node n from a start node

  • h(n) is the heuristic estimation for getting from the node n to the goal

A* maintains two lists, called the Open List and the Closed List. The Open List is a priority list that contains the frontier nodes that are candidates for expansion in the next iteration. Before a node is inserted into the Open List, its weight is computed with the evaluation function f(n) shown in Eq. 1. The difference between the estimated value h(n) and the real cost of reaching the goal from n is the error of h(n). The nodes in the Closed List have already been expanded and removed from the Open List.

If the first element in the Open List is not a goal node, the algorithm generates its descendants. If a generated node is already in the Open or Closed List, the algorithm retains the copy that lies on the shorter path. The A* algorithm stops searching when it reaches a goal node or when the Open List is empty. In the first case, the algorithm returns the solution path leading to the goal; in the second case, it reports that there is no solution path.
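The behaviour described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name `a_star`, the `successors` and `h` callbacks, and the toy graph are assumptions of the example, and the Open List is realised with Python's `heapq` module, with outdated heap entries skipped when they are popped.

```python
import heapq
from itertools import count
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Tuple

State = Hashable


def a_star(start: State, goal: State,
           successors: Callable[[State], Iterable[Tuple[State, float]]],
           h: Callable[[State], float]) -> Optional[Tuple[List[State], float]]:
    """A* with f(n) = g(n) + h(n); returns (path, cost) or None."""
    tie = count()                                   # tie-breaker for equal f values
    open_heap = [(h(start), next(tie), start)]      # Open List ordered by f
    g: Dict[State, float] = {start: 0.0}
    parent: Dict[State, Optional[State]] = {start: None}
    closed = set()                                  # Closed List
    while open_heap:
        _, _, node = heapq.heappop(open_heap)
        if node in closed:
            continue                                # outdated entry, already expanded
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g[goal]
        closed.add(node)
        for child, cost in successors(node):
            new_g = g[node] + cost
            # Keep only the copy that lies on the shorter path.
            if child not in g or new_g < g[child]:
                g[child] = new_g
                parent[child] = node
                closed.discard(child)               # reopen if it was closed
                heapq.heappush(open_heap, (new_g + h(child), next(tie), child))
    return None


if __name__ == "__main__":
    edges = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)]}
    estimate = {"S": 3, "A": 2, "B": 1, "G": 0}     # hypothetical admissible h
    print(a_star("S", "G", lambda s: edges.get(s, []), estimate.get))
```

In the toy graph the direct edge from S to B is more expensive than the detour through A, so the entry for B is replaced by the cheaper one before B is expanded.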

If the A* algorithm employs an admissible heuristic function (see Footnote 1), then it generates optimal solutions [8]. A* can solve problems that have a small state-space search tree, but for problems with a large search domain it runs into memory limits. There are many newer forms of the A* and BFS search algorithms and their applications, contributed by Geethu [7], Ariel et al. [5, 6], and Zhou [15]. However, for the sake of our research, we only investigate and implement the basic form of A* search. For further information about A* search, see Kopec [8], Zhang [14], Russell [13], and Korf [9, 10].

2 Indexed Search

The contribution of this research is a new type of search method that eliminates the overhead of saving the explored nodes and constructs a solution path more easily than regular AI search algorithms. We refer to it as Indexed Search. In the following sections, we develop two forms of Indexed Search, but the approach is applicable to other AI search algorithms as well.

Most AI search algorithms maintain a list structure to save the explored nodes; this list is usually called the Closed List. There are two reasons to maintain the Closed List: first, to prevent duplications and possible cycles; secondly, to build the solution path after a goal node has been reached.

The Indexed Search algorithm eliminates the Closed List and provides alternative methods that address both reasons for saving explored nodes. Instead of using a Closed List, our new approach uses indices (see Sect. 2.2) to generate a solution path after a goal node has been reached. It also uses some heuristics to avoid duplications and cycles (see Sect. 2.3). There are two main advantages of eliminating the Closed List:

  1. Save space.

  2. Simplify the process of generating a solution path.

If the possible node generation operators are well defined for a problem domain, then we can build a strategy that prevents duplication. For example, for sliding puzzle problems such as the Fifteen Puzzle, a node may be generated as a result of a Left, Right, Up or Down move. If we track this, we can prevent possible duplications and cycles that loop back from deeper levels to ancestors. For example, as shown in Fig. 1, if a Left move creates node n, then a Right move from node n cycles back to its parent node m. This is a simple cycle that can be prevented by any regular search algorithm. However, loops that include more than three nodes cannot be avoided without searching the Open List and Closed List for possible duplicate nodes, as shown in Fig. 2.

Fig. 1. Prevent looping back to parent node.

Fig. 2. Loops may occur at deeper levels.

The plain forms of search methods such as BFS and Depth First Search do not track how nodes are generated; they do not label node generation. The Indexed Search method tracks node moves by employing a parameter (see Sect. 2.1).

As described above, generating nodes without keeping track of how they are generated may create cycles. For example, if a Left move is followed by a Right move, this loops back to the parent, as shown in Fig. 1. The search spaces for the basic forms of BFS and A* are undirected graphs. However, both methods eliminate cycles and duplications by searching through the Closed List for each newly generated node. If the newly generated node is in the Closed List, it is discarded without being added to the Open List. Otherwise, it is added to the Open List. Some forms of these methods also check the Open List for duplicate nodes before adding a node to it. This mechanism adds considerable search overhead as the search tree grows. The second purpose of the Closed List is to build the solution path after a solution has been found. The algorithm does that by following the ancestors of the examined node back from the goal towards the start node. Once the start node is reached, the method returns the solution path.

The Indexed Search algorithm employs two main mechanisms to prevent duplications and build a solution path. First, it tracks how each new node is generated, and second, it assigns an index to each newly created node. These two parameters guide the search to avoid cycles and to build the solution path easily. The following two subsections explore these two mechanisms.

2.1 New Node Generation

Indexed Search employs a new parameter p for tracking how a node is generated from its parent node. For example, in Fig. 3, node a is the result of a “1” move, so for node a the parameter p is “1”. Node b is the result of a “0” move, so p for node b is “0”, and so on.

Fig. 3. Label nodes with a parameter p.

The basic structure of a node N for Indexed Search is characterized by the 3-tuple {S, p, Index}, where:

  • S is the state of the node.

  • p is the parameter for tracking how the node was generated.

  • Index is the location of the node in the Frontier List.

We do not keep track of the parent nodes. The parameter p of a node must be a legal move, and p must be an integer. For a problem with maximum branching factor b, p satisfies \( 0 \le p < b \). For example, the Fifteen Puzzle has four possible moves, and we can make the following assignment: {left = 0, right = 1, down = 2, up = 3}, so the possible p values for the Fifteen Puzzle are {0, 1, 2, 3}. If the branching factor of a problem X is six, then the possible values of p are {0, 1, 2, 3, 4, 5}, and each generated state of X must be labeled with one of these values.
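The node structure and the move assignment above can be sketched as a small data type. The class names `Move` and `Node` and the validation in `__post_init__` are assumptions of this example; the move numbering follows the assignment given in the text for the Fifteen Puzzle.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Tuple


class Move(IntEnum):
    """Move labels for the Fifteen Puzzle, numbered as in the text."""
    LEFT = 0
    RIGHT = 1
    DOWN = 2
    UP = 3


B = len(Move)  # branching factor b = 4


@dataclass(frozen=True)
class Node:
    """The 3-tuple {S, p, Index}; note that no parent pointer is stored."""
    state: Tuple[int, ...]  # S: the puzzle configuration
    p: int                  # how the node was generated, 0 <= p < b
    index: int              # position assigned in the Frontier List

    def __post_init__(self) -> None:
        if not 0 <= self.p < B:
            raise ValueError(f"p must satisfy 0 <= p < {B}, got {self.p}")


if __name__ == "__main__":
    node = Node(state=(1, 2, 3, 0), p=Move.UP, index=7)
    print(node.p, node.index)
```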

2.2 Creating Indices

The Frontier List is used to save the unexplored nodes. The index of a node is the location of the node in the Frontier List, as illustrated in Fig. 4. For example, the first node in the list has index 0, the second node in the list has index 1, and so on. We can calculate the index of a node N with the following formula (Eq. 2).

Fig. 4. Indices of nodes in the Frontier List.

$$ {\text{i}} = {\text{b}} \times ( {\text{j}} + 1 )+ {\text{p}} $$
(2)
  • i is the index of N (the newly generated node)

  • j is the index of parent of N

  • p is the parameter that we will assign to N

  • b is the branching factor of the problem.

For example, as shown in Fig. 3, if the index of node a is 2, the branching factor of the problem is 4, and node c is generated from node a with p = 1, then the index of node c is:

i = 4 × (2 + 1) + 1

i = 13.
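Eq. 2 and the worked example above can be checked with a short sketch. The function names are hypothetical. The second function is not stated in the text; it follows algebraically from Eq. 2 because \( 0 \le p < b \), so p is the remainder of i divided by b and j + 1 is the quotient.

```python
from typing import Tuple


def child_index(j: int, p: int, b: int) -> int:
    """Eq. 2: index of a node generated from parent index j by move p."""
    if not 0 <= p < b:
        raise ValueError("p must satisfy 0 <= p < b")
    return b * (j + 1) + p


def parent_and_move(i: int, b: int) -> Tuple[int, int]:
    """Algebraic inverse of Eq. 2: recover (j, p) from a child index i."""
    return i // b - 1, i % b


if __name__ == "__main__":
    i = child_index(j=2, p=1, b=4)
    print(i)                      # the worked example: 4 * (2 + 1) + 1
    print(parent_and_move(i, 4))  # recovers the parent index and the move
```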

We use indices to produce the solution path. Once the index of the goal node is known, the next step is to find the solution path. The solution path is generated with the following steps:

  1. Convert the index of the goal node to the base of the maximum branching factor.

  2. Switch each parameter p (described in Sect. 2.1) to its dual move.

  3. Reverse the final number to produce the solution path.

For example, for the Fifteen Puzzle, if the index of a goal node is \( (98645766)_{10} \), then the solution path from the start node to the goal node is generated by following the above steps:

  1. Convert the index from decimal to base 4, which is the branching factor of the Fifteen Puzzle: \( (98645766)_{10} = (11320103130012)_{4} \)

  2. Switch each parameter p to its dual move: 00231012021103

  3. Reverse the final number: 30112021013200

Hence, the solution path for this problem is: up, left, right, right, down, left, down, right, left, right, up, down, left, left.
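The three steps can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, the move names follow the assignment of Sect. 2.1, and the dual table pairs left with right and down with up, which is an assumption of this sketch because the contents of Table 1 are not reproduced here.

```python
from typing import List

NAMES = {0: "left", 1: "right", 2: "down", 3: "up"}  # assignment from Sect. 2.1
DUAL = {0: 1, 1: 0, 2: 3, 3: 2}                      # assumed dual moves


def to_base(n: int, b: int) -> List[int]:
    """Digits of n in base b, most significant digit first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, r = divmod(n, b)
        digits.append(r)
    return digits[::-1]


def solution_path(goal_index: int, b: int = 4) -> List[str]:
    digits = to_base(goal_index, b)           # step 1: convert to base b
    duals = [DUAL[d] for d in digits]         # step 2: switch to dual moves
    return [NAMES[d] for d in reversed(duals)]  # step 3: reverse


if __name__ == "__main__":
    print("".join(map(str, to_base(98645766, 4))))
    print(", ".join(solution_path(98645766)))
```

Each base-4 digit yields exactly one move, so the length of the decoded path equals the number of digits produced in step 1.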

How to define the dual moves (see Footnote 2) depends on the implementation and on the problem domain. The parameters p and their duals for the Fifteen Puzzle are shown in Table 1.

Table 1. Parameter p and their dual for fifteen puzzle

If the solution domain of a problem is a map from states to states, then each state in the state-space graph should have a coordinate, or index, in the map. We claim that if we know the index of a goal node, then we can generate the solution path as described above. With Indexed Search, we have redefined the task of search algorithms: the new task is to find the index of a goal state on the map. Once we know the location of the goal state in the map, the problem is solved by converting the index of the goal state to the base of the branching factor of the search tree. This policy takes care of building the solution path after a goal node is reached.

2.3 Prevent Duplications and Cycles

One of the purposes of retaining the parameter p (see Sect. 2.1) is to prevent duplications at shallow levels and the simple cycles described above. However, it does not prevent cycles that might occur at deeper levels. In the Fifteen Puzzle example, loops may occur at depths 1, 6, 7, 8, and so on. To prevent duplications, we employ the following steps:

  1. Before adding a newly generated node to the Frontier List, search the Frontier List for a duplicate of it.

  2. If a duplicate is in the Frontier List, then:

    a. Label the duplicate node with a new parameter.

    b. Discard the newly generated node.

  3. If no duplicate of the new node is in the Frontier List, then add the new node to the Frontier List.

This policy prevents duplications. However, it does not prevent cycles. To avoid cycles, we employ the following strategy: if a node N is labeled with the duplication parameter (see the steps above), then in the next iteration, when N is expanded, it will not generate its duplicate node. This heuristic prevents possible cycles. For example, if a node N is already in the Frontier List and its duplicate N’ is generated, N’ is not added to the Frontier List, because N, which has the same state, is already there (see Fig. 5).

Fig. 5. Check duplicate nodes and prevent cycles. N is a newly generated node, N’ is N’s duplicate, and M is a child of N and the parent of N’.
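The three duplicate-handling steps of this section can be sketched as a single insertion routine. The names `FrontierNode`, `dup_flag`, and `add_to_frontier` are hypothetical, and the sketch covers only the insertion steps, not the cycle-avoidance rule applied when a flagged node is later expanded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FrontierNode:
    state: Tuple[int, ...]
    p: int
    index: int
    dup_flag: bool = False  # set when a duplicate of this node was discarded


def add_to_frontier(frontier: List[FrontierNode], new: FrontierNode) -> bool:
    """Return True if the node was added, False if it was discarded."""
    for node in frontier:              # step 1: search for a duplicate state
        if node.state == new.state:
            node.dup_flag = True       # step 2a: label the existing node
            return False               # step 2b: discard the new node
    frontier.append(new)               # step 3: no duplicate, so add it
    return True


if __name__ == "__main__":
    frontier: List[FrontierNode] = []
    print(add_to_frontier(frontier, FrontierNode((1, 2, 0, 3), p=0, index=4)))
    print(add_to_frontier(frontier, FrontierNode((1, 2, 0, 3), p=3, index=23)))
    print(frontier[0].dup_flag)
```

The linear scan mirrors a plain list; a hash-based lookup keyed on the state would avoid the scan at the cost of extra memory.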

3 Algorithm Components

Indexed Search uses only one list to track the frontier nodes and the parameters described in the previous section. We call this list the Frontier List or Open List. We do not save explored nodes: any node generated at depth 1, depth 2, …, depth d-1 is discarded, and only nodes created at depth d are saved for further investigation, as shown in Fig. 6.

Fig. 6. Frontier List.

In addition to these parameters, other parameters can be added depending on the implementation. For example, in one of our implementations we kept track of the depth of nodes, and to prevent cycles we kept track of another parameter (see Sects. 2.1, 2.2).

For the purpose of our experiments, we have developed and implemented two forms of Indexed Search: Indexed Breadth First Search (IBFS) and Indexed A* Search (IA*). The following sections briefly describe the steps of the IBFS and IA* algorithms.

4 Indexed Breadth First Search (IBFS)

The IBFS algorithm works as follows. First, the start node is expanded, and the parameters for each child node are calculated. Before new nodes are added to the Frontier List, we check for any duplication. If there is no duplication and the new node’s state is not a goal state, then we add the new node to the end of the Frontier List. In the next iteration, the first element of the Frontier List is examined. The procedure continues until a goal node is reached or the Frontier List is empty. The explored nodes are discarded. See Fig. 7 for the pseudocode of the IBFS algorithm.

Fig. 7. Pseudocode for IBFS. S is the start state and G is the goal state. The Expand method creates new nodes and checks for a solution node.
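The pseudocode of Fig. 7 is not reproduced in the text, so the following is only a hedged sketch of the steps described above, written for a small sliding puzzle. Several choices are assumptions of this sketch: moves are taken to slide the blank (written 0), the start node is given index 0, the duplicate check is a plain scan of the Frontier List without flagging, the `max_expansions` guard is added so that an unsolvable instance cannot loop forever, and `moves_from_index` recovers the moves by inverting Eq. 2 step by step rather than by the base conversion of Sect. 2.2.

```python
from collections import deque
from typing import List, Optional, Tuple

State = Tuple[int, ...]
B = 4                                   # branching factor
LEFT, RIGHT, DOWN, UP = range(4)        # numbering from Sect. 2.1
INVERSE = {LEFT: RIGHT, RIGHT: LEFT, DOWN: UP, UP: DOWN}


def apply_move(state: State, p: int, w: int) -> Optional[State]:
    """Slide the blank (0) one cell in direction p on a w-by-w board."""
    z = state.index(0)
    r, c = divmod(z, w)
    if p == LEFT:
        c -= 1
    elif p == RIGHT:
        c += 1
    elif p == DOWN:
        r += 1
    else:
        r -= 1
    if not (0 <= r < w and 0 <= c < w):
        return None                     # move would leave the board
    t = r * w + c
    s = list(state)
    s[z], s[t] = s[t], s[z]
    return tuple(s)


def ibfs(start: State, goal: State, w: int,
         max_expansions: int = 100_000) -> Optional[int]:
    """Return the index of the goal node, or None if it was not found."""
    if start == goal:
        return 0
    frontier = deque([(start, None, 0)])    # entries are (state, p, index)
    expansions = 0
    while frontier and expansions < max_expansions:
        state, p, j = frontier.popleft()    # the explored node is discarded
        expansions += 1
        for move in range(B):
            if p is not None and move == INVERSE[p]:
                continue                    # would regenerate the parent
            child = apply_move(state, move, w)
            if child is None:
                continue
            i = B * (j + 1) + move          # Eq. 2
            if child == goal:
                return i
            if any(s == child for s, _, _ in frontier):
                continue                    # duplicate already in the frontier
            frontier.append((child, move, i))
    return None


def moves_from_index(i: int, b: int = B) -> List[int]:
    """Invert Eq. 2 repeatedly until the start index 0 is reached."""
    moves = []
    while i != 0:
        moves.append(i % b)
        i = i // b - 1
    return moves[::-1]


if __name__ == "__main__":
    start = (1, 2, 3, 4, 0, 6, 7, 5, 8)
    goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
    idx = ibfs(start, goal, 3)
    print(idx, moves_from_index(idx))
```

No parent pointers and no Closed List are kept; the only per-node bookkeeping is the pair (p, index), which is what allows the path to be recovered from the goal index alone.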

In this section, we have shown that IBFS differs from BFS in that it uses two parameters to eliminate the Closed List employed by BFS. The IBFS algorithm does not track the Closed List as the BFS does. This policy saves memory and time. By tracking the index of frontiers, the IBFS algorithm eliminates two reasons for using the Closed List.

Now we will theoretically analyze the benefits of eliminating the Closed List.

Lemma 3.1:

The IBFS algorithm uses less space and time than BFS.

Proof 3.1:

BFS has to save all the generated nodes: the explored nodes in the Closed List, and the nodes that have been generated but not yet explored in the Open List. When the algorithm generates nodes at depth d, all the nodes up to depth d-1 must already have been explored and saved in the Closed List, so the size of the Closed List is:

$$ b^{0} + b^{1} + b^{2} + \ldots + b^{d - 2} + b^{d - 1} $$
(3)

The nodes that are generated at depth d but have not been explored yet must be stored in the Open List for future investigation, so the size of the Open List is \( b^{d} \). Hence, the total number of nodes generated is:

$$ b^{0} + b^{1} + b^{2} + \ldots + b^{d - 2} + b^{d - 1} + b^{d} $$
(4)

Let \( x = b^{0} + b^{1} + b^{2} + \ldots + b^{d - 2} + b^{d - 1} \).

Then

\( b^{0} + b^{1} + b^{2} + \ldots + b^{d - 2} + b^{d - 1} + b^{d} = x + b^{d} \) (by Eqs. 3 and 4).

Therefore, the number of nodes stored by BFS is \( x + b^{d} \).

IBFS stores only the frontier nodes in the Frontier List, which is the same as the Open List employed by BFS. Therefore, the number of nodes stored by IBFS is \( b^{d} \). Since \( x > 0 \),

$$ b^{d} < x + b^{d} $$
(5)

Hence, BFS uses more space than IBFS (by Eq. 5).

Suppose BFS stores x explored nodes, and it takes t units of CPU time to store each node. Then BFS spends x × t more time than IBFS on storage alone, because IBFS does not store the explored nodes. In addition, IBFS does not search through a Closed List for duplicates; instead, it uses the heuristics of Sect. 2.3 to eliminate duplications and cycles. Hence, it takes less time to solve problems.
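The counts used in the proof can be evaluated numerically. The sketch below assumes, as the lemma does, a full tree with uniform branching factor b and no duplicate pruning; the function name is hypothetical.

```python
from typing import Tuple


def stored_nodes(b: int, d: int) -> Tuple[int, int, int]:
    """Return (x, open_size, bfs_total) for branching factor b and depth d.

    x         = b^0 + b^1 + ... + b^(d-1)  (Closed List, Eq. 3)
    open_size = b^d                        (Open List, which IBFS also keeps)
    bfs_total = x + b^d                    (everything BFS stores, Eq. 4)
    """
    x = sum(b ** k for k in range(d))
    open_size = b ** d
    return x, open_size, x + open_size


if __name__ == "__main__":
    x, frontier, total = stored_nodes(b=4, d=10)
    print(x, frontier, total)
```

For b > 1 the sum x has the closed form \( (b^{d} - 1)/(b - 1) \), so under these assumptions the Closed List adds roughly a fraction \( 1/(b - 1) \) of the frontier size to what BFS stores.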

4.1 Indexed A* Search

The regular form of the A* search algorithm is a kind of heuristic best-first search, which employs an evaluation function to rank the nodes before inserting them into the Open List. The evaluation function of the A* algorithm is defined in Eq. 1. At each iteration, the node with the minimum evaluation value is expanded. The nodes that have been expanded and are not goal nodes move to a list called the Closed List, and the newly generated nodes move to a list called the Open List. The purpose of the Closed List is to avoid duplicated nodes and to build the solution path after a goal node is reached. If the employed heuristic function is admissible, then the A* algorithm finds an optimal path.

To build a regular form of the Indexed A* search algorithm, we adopted the policy of IBFS instead of BFS. The resulting form of the A* algorithm is called the Indexed A* search algorithm. The difference between IBFS and IA* is that with IBFS the nodes are examined in a First In First Out (FIFO) manner, whereas with IA* we save and expand nodes according to their heuristic values, always examining first the node that is estimated to be closest to a goal node. See Fig. 8 for the pseudocode of the IA* search method.

Fig. 8. Pseudocode for IA*. S is the start state and G is the goal state.
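Since the pseudocode of Fig. 8 is not reproduced in the text, the following is a hedged sketch of how the description above could be realised for a small sliding puzzle. It rests on several assumptions: moves slide the blank (written 0), the start node has index 0, the heuristic is the Manhattan distance (the text does not name the heuristic used), each move costs 1, the goal test is made when a node is removed from the frontier, and a `max_expansions` guard bounds the search. As in the experiments reported later, duplicates are allowed apart from the parent check based on p.

```python
import heapq
from itertools import count
from typing import Optional, Tuple

State = Tuple[int, ...]
B = 4                                   # branching factor
LEFT, RIGHT, DOWN, UP = range(4)        # numbering from Sect. 2.1
INVERSE = {LEFT: RIGHT, RIGHT: LEFT, DOWN: UP, UP: DOWN}


def apply_move(state: State, p: int, w: int) -> Optional[State]:
    """Slide the blank (0) one cell in direction p on a w-by-w board."""
    z = state.index(0)
    r, c = divmod(z, w)
    if p == LEFT:
        c -= 1
    elif p == RIGHT:
        c += 1
    elif p == DOWN:
        r += 1
    else:
        r -= 1
    if not (0 <= r < w and 0 <= c < w):
        return None
    t = r * w + c
    s = list(state)
    s[z], s[t] = s[t], s[z]
    return tuple(s)


def manhattan(state: State, goal: State, w: int) -> int:
    """Sum of tile distances to their goal cells (blank excluded)."""
    total = 0
    for pos, tile in enumerate(state):
        if tile == 0:
            continue
        r, c = divmod(pos, w)
        gr, gc = divmod(goal.index(tile), w)
        total += abs(r - gr) + abs(c - gc)
    return total


def ia_star(start: State, goal: State, w: int,
            max_expansions: int = 100_000) -> Optional[int]:
    """Return the index of the goal node, or None if it was not found."""
    tie = count()                       # keeps heap tuples comparable
    # Frontier entries are (f, tie, g, state, p, index); there is no Closed List.
    frontier = [(manhattan(start, goal, w), next(tie), 0, start, None, 0)]
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, g, state, p, j = heapq.heappop(frontier)
        if state == goal:
            return j
        expansions += 1
        for move in range(B):
            if p is not None and move == INVERSE[p]:
                continue                # would regenerate the parent
            child = apply_move(state, move, w)
            if child is None:
                continue
            i = B * (j + 1) + move      # Eq. 2
            f = g + 1 + manhattan(child, goal, w)
            heapq.heappush(frontier, (f, next(tie), g + 1, child, move, i))
    return None


if __name__ == "__main__":
    start = (1, 2, 3, 4, 0, 6, 7, 5, 8)
    goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
    print(ia_star(start, goal, 3))
```

The only structural change from a breadth-first frontier is the ordering of the list by f; the index computation is unchanged.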

5 Experimental Results and Implementations

We have implemented the Fifteen Puzzle with the Breadth First Search (BFS), Indexed Breadth First Search (IBFS), A*, and Indexed A* (IA*) search algorithms and compared the results. The following sections present the results and implementation details.

6 Comparing IBFS with BFS

When we implement the Fifteen Puzzle with IBFS, we represent a node with a 5-tuple (see Fig. 9). The node for IBFS includes the node depth, its state, the parameter p, the duplication parameter (“DupFlag”), and the node index, as shown in Fig. 9. The parameter p is used for preventing first-level duplication and for creating the child node index. The “DupFlag” variable is used for preventing deeper duplication and for pruning paths that have already been generated. The index is used for creating the solution path from the goal node to the start node.

Fig. 9. IBFS node representation of the Fifteen Puzzle.

BFS Implementation:

The node structure for BFS includes the node depth, its parent node, and the state of the node, as shown in Fig. 10. For both implementations, the list structures employed were linked lists. For BFS we employed two linked lists, called the Closed List and the Open List: the Closed List keeps explored nodes, and the Open List keeps nodes that have been created but not explored yet. For IBFS we employed one linked list, called the Frontier List, which is similar to the Open List used for BFS; it keeps nodes that have been created but not explored yet.

Fig. 10. BFS node representation of the Fifteen Puzzle.

For the first implementation of BFS, we checked for duplicates by comparing each newly created node with the nodes in the Closed List. If there was a duplicate, we discarded the newly created node; otherwise, we added the new node to the Open List. In the second implementation of BFS, each new node was compared against both lists (Open and Closed) to check for duplicates. If an identical state was found, the less efficient path to this state was deleted and the more efficient path was kept in the appropriate list.

IBFS Implementation:

When we implemented the IBFS algorithm, we employed two policies to prevent duplicate nodes. First, by using the move parameter p, we prevented a node from generating its parent. For example, if a Right move is followed by a Left move, this loops back to the parent of the node created by the Right move. Whenever a repeated state is found, we keep the node which is closer to the goal. Secondly, we used the “DupFlag” parameter to prevent loops (see Sect. 2.3). The “Index” variable is used to generate the solution path. The aim of the IBFS algorithm is to find the index of a goal node in the Frontier List. Once the index of the goal node is found, we can generate a solution path by converting the index from decimal to the base of the branching factor of the problem.

Table 2 shows the results of IBFS implementation with the feature of checking duplicate nodes. If a duplicate node is found, the best node is added to the Frontier List and the “DupFlag” variable is set for pruning the path to the duplicate node.

Table 2. IBFS’s performance and the size of frontier list by the solution depth. This version of IBFS employs a mechanism to check duplicate nodes.

Experimental results show that searching for duplicate nodes makes BFS and IBFS slower than their respective versions that do not check for repeated nodes. The size of the Frontier List depends on the depth of the goal node. The time needed to generate the solution is directly correlated with the time spent searching for duplicate nodes and the time spent creating and saving nodes.

Table 3 shows the results of IBFS on the Fifteen Puzzle when the duplication check policy is excluded. As Table 3 shows, generating nodes becomes faster and the size of the Frontier List increases dramatically, but the list then contains many duplicates.

Table 3. IBFS’s performance and the size of frontier list by the solution depth. This version of IBFS does not employ any policy to check duplicate nodes.

Tables 4 and 5 show the results of BFS on the Fifteen Puzzle. There are two main differences between our IBFS and regular BFS. First, the node representation of BFS is different from that of IBFS. Secondly, BFS employs a Closed List to maintain the nodes that have already been explored and do not lead to a goal node, whereas IBFS employs only one list for frontier nodes and discards the explored nodes.

Table 4. BFS’s performance and the size of open and closed lists by the solution depth. This version of BFS employs a mechanism to check duplicate nodes in open and closed list.
Table 5. BFS’s performance and the size of Open and Closed Lists by the solution depth. This version of BFS employs a mechanism to check duplicate nodes only in the Closed List.

The results from Tables 2, 3, 4, 5, and 6 show that IBFS is faster at solving the same puzzle states and that it uses less memory by omitting the Closed List. BFS cannot explore as deep into the search tree, whereas IBFS can search deeper levels. If the duplicate-checking policy is omitted, IBFS can generate frontiers up to depth 21.

Table 6. Compare the space used for BFS and IBFS.

Table 6 and Chart 1 show that BFS used more memory than IBFS to keep track of the nodes. If the goal node is in deeper layers, BFS is unable to find a solution path because it runs out of memory.

Chart 1. Comparison of the total number of nodes stored by IBFS and BFS. The horizontal axis shows the depth of the goal node and the vertical axis shows the number of nodes generated.

When we apply IBFS and BFS to the same instances of the Fifteen Puzzle, both algorithms find a solution very quickly for puzzles that have a goal state at shallow levels. However, for puzzles that have a goal state at deeper levels, IBFS finds a solution faster than BFS, and for some instances BFS was unable to find a solution at all (see Chart 2).

Chart 2. Performance of IBFS and BFS on the same instances of the Fifteen Puzzle. The horizontal axis shows the puzzle instances, and the vertical axis shows the time to solve them in seconds.

6.1 Comparing IA* with A*

Implementation of IA* is similar to the IBFS. The only difference is that IA* tracks heuristic evaluation of a node and the frontier list is ordered by the heuristic evaluation values. The best node is added to the front of the list.

We only present experimental results for the IA* and A* algorithms that allow duplications. If we add a duplication check mechanism when solving some examples of the Fifteen Puzzle, the algorithms become extremely slow. This is to be expected, because the A* algorithm is not usually employed to solve the Fifteen Puzzle.

Results from Chart 3 show that IA* and A* solve the puzzles that have a goal state at shallow levels in approximately the same amount of time. If the goal state is at deeper levels, however, A* becomes extremely slow or cannot reach the goal state at all, whereas IA* is able to solve some instances of these puzzles.

Chart 3. Comparison of the performance of the IA* and A* algorithms. The horizontal axis shows the puzzle instances, and the vertical axis shows the time to solve them in seconds.

Results from Chart 4 show that IA* uses less memory than A* by completely eliminating the Closed List. In addition, A* is unable to solve some instances of the Fifteen Puzzle that IA* can solve.

Chart 4. Comparison of the storage used by the IA* and A* algorithms. The horizontal axis shows the puzzle instances, and the vertical axis shows the number of nodes generated.

In our experiments, the A* algorithm could not solve Fifteen Puzzle problems whose goal node is deeper than 50 moves. However, IA* was able to solve some instances whose goal state is deeper than 50 moves.

7 Conclusions and Future Works

We have introduced a new type of search algorithm that assigns an index to each node in its Frontier List. After a goal node is reached, the algorithm converts the goal’s index from decimal to the base of the branching factor of the problem to find the solution path. We have defined and implemented two types of indexed search algorithms: the Indexed Breadth First Search (IBFS) and Indexed A* (IA*) search methods. These newer techniques make it possible to eliminate the Closed List employed by the standard BFS and A* algorithms. Our experimental results show that the IBFS algorithm generates a solution faster than standard BFS and needs less memory. Additionally, the IA* algorithm generates a solution slightly faster than the A* algorithm and needs less space.

The main approach of traditional AI search methods is to build a search space by creating new nodes and storing these nodes in data structures for further investigation. The search proceeds according to the heuristic function and the steps of the algorithm. The search terminates once the goal node is reached or the time assigned to find a solution has expired. Indexed Search redefines the task of search algorithms: the new task is to find the index of a goal state. Each state in the state space is assigned an index, and the search terminates once the index of the goal node is found. Indexed Search also does not keep track of explored nodes.

In the future, we plan to develop more versions of the Indexed Search algorithm, such as Indexed Iterative Deepening Search, and to compare the results with their respective standard algorithms. We also plan to apply Indexed Search to some industrial problems.