State Space Searching


Touretzky Reading: None


Almost all of this material is lifted from Kurt Eiselt's notes on this subject.
At the end of the last lecture, we introduced a new type of search -- the state-space search. The state-space search is used in a lot of ways by lots of folks.

For example, a compiler has a component called a parser which decomposes a high-level instruction into its component parts. But these instructions can be ambiguous, so the parser must make decisions about how various symbols (known as "tokens" in compiler world) are being used. How that decision is made depends on what the parser has already seen; in other words, the next possible state of the parsing process depends on the history of the previous states.

The parser reads the input from left to right, making "guesses" as it goes. If the sequence of guesses leads to a structure for an instruction that's not legal, the parser will backtrack and systematically try new guesses, just like a depth-first search algorithm. If no combination of guesses works for the parser, you'll get a "syntax error" message. These things are sometimes called "recursive descent parsers."

Parsers are a generally useful tool in a variety of contexts, so let's look at them in a little more detail. Consider a parser for a Pascal-like language which gets a string of tokens to parse like this:

<a> <+> <b> <*> <c> <-> <1>

(Note: I put each token in angle-bracket pairs to show the boundaries more clearly.) From a mathematics standpoint, this clearly should result in the multiplication of b and c, followed by the addition of a and the subtraction of one. Since the tokens are scanned left to right, the parser is going to have to "backtrack" when it sees the <*>, since it will have to rethink its original guess about the addition of a and b. Further, when it reaches the <1> at the end, it might have to backtrack based on what its original guess about the minus sign was.

Note that one could (incorrectly?) interpret this set of tokens as the multiplication of b and c, the addition of a, and then an error, since there would be a constant (the -1) just stuck on at the end. You can't just put 3.14 where the -1 is and have it work!

Now suppose you had a set of rules which defined what the legal arithmetic expressions were (in order of precedence) and you just tried them, in order. Whenever your input token stream didn't match up to what you expected, you'd go back to the point where you made the previous guess and try a different rule. Real compilers don't use this approach (there are superior ones for parsing programming languages), but other programs do.
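To make the rule-trying-and-backtracking idea concrete, here's a little sketch in Lisp for a toy grammar with just +, -, and *. (The grammar, the function names, and the return-value convention are all inventions for this sketch; this is not how a production parser is organized.)

  ;; Each parse function takes a list of tokens and returns a one-element
  ;; list holding the unconsumed tokens on success, or NIL on failure.
  ;; Wrapping the leftovers in a list lets OR try the rules in order and
  ;; fall through to the next rule when one fails -- that's the backtracking.

  (defun parse-expr (tokens)
    (or (parse-binary #'parse-term '+ #'parse-expr tokens)
        (parse-binary #'parse-term '- #'parse-expr tokens)
        (parse-term tokens)))

  (defun parse-term (tokens)
    (or (parse-binary #'parse-factor '* #'parse-term tokens)
        (parse-factor tokens)))

  (defun parse-factor (tokens)
    ;; a factor is a single variable or number -- not an operator
    ;; (the operators are symbols too, hence the MEMBER check)
    (when (and tokens
               (not (member (first tokens) '(+ - *)))
               (or (symbolp (first tokens)) (numberp (first tokens))))
      (list (rest tokens))))

  (defun parse-binary (left op right tokens)
    ;; try LEFT, then the operator OP, then RIGHT; fail if any piece fails
    (let ((after-left (funcall left tokens)))
      (when (and after-left (eq (first (first after-left)) op))
        (funcall right (rest (first after-left))))))

  (defun legal-expression-p (tokens)
    ;; legal means some sequence of guesses parses the whole token stream
    (let ((result (parse-expr tokens)))
      (and result (null (first result)) t)))

  ;; (legal-expression-p '(a + b * c - 1))   => T
  ;; (legal-expression-p '(a + b * c 3.14))  => NIL ("syntax error")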

Now consider the following diagram. What if you were interested in parsing it? Suppose you had a rule that said two objects with the same center point move together? What about rules for scaling and rotation?

The same sorts of ideas are used to get computers to understand English and other natural languages. In fact, an entire company was founded on this idea. A guy named Gary Hendrix at the University of Texas wrote a PhD thesis on parsing English back in the late 60's or early 70's. He later took some of those same ideas and built an interface to a simple database system -- an interface that could accept database queries in English (or at least a subset of English). He called the whole thing "Q&A"; it runs on PC compatibles, and it sells off the shelf at computer stores for about $300 a copy. This product was one of the first, if not the first, offered by the company Hendrix co-founded, which is called "Symantec" -- a company which most of you Mac or PC owners know about, since it's swallowing up all sorts of other software vendors. Hendrix is now a zillionaire, and the moral to this story is that state-space search can make you rich.

As we mentioned in class, evolutionary biologists think of all of us (and I mean *all* of us) as the bottom layers of nodes on a very big state space. Those of us who don't have any children are the leaves on a very very big tree (well, it's not exactly a tree, but you get the idea). Some of us will generate new states (our kids) and others of us won't. Each state presumably brings humanity slightly closer to some lifeform that is perfectly adapted to the environment. (If only we could get the environment to stop changing....)

Finally, a state-space search is a nice little metaphor for how we lead our lives: every decision we make is based on the chain of decisions leading up to that point. However, in life, unlike in your computer, there's often no backtracking possible when you make a bad decision.

A state-space search algorithm (depth-first)

Here's a very sketchy, high-level depth-first state-space search algorithm:

state-space (initial-states, goal-state, operators)
  1. if there are no initial-states left to try, then return failure
  2. look at the first (leftmost) initial-state
  3. if that state is the goal-state, then return success
  4. if that state isn't the goal-state, then generate all possible new states from that state by applying the set of operators to that state
  5. call state-space again with those new states placed in front of the remaining initial-states (putting them in front is what makes the search depth-first), and if that succeeds then return success else...
  6. return failure
In step 4, you'd like to check all the new states to see if you've explored them before. You do that by keeping track of the sequence of states that was generated in going from the very first state to where you are now, and then comparing that list to the set of new states you just generated. If there are any duplicates, be sure to eliminate them from the set of new states.
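Here's roughly what that sketch looks like as real Lisp, with the duplicate check folded in. (The representation is an assumption made just for this example: a state is anything comparable with EQUAL, and each operator is a function from a state to a list of successor states. Also, this version checks new states against every state ever explored, which is a bit stronger than checking just the current path.)

  (defun state-space (states goal-state operators &optional visited)
    ;; STATES is the list of states still to try, most recently
    ;; generated first.
    (cond ((null states) 'failure)                      ; step 1: out of states
          ((equal (first states) goal-state) 'success)  ; steps 2 and 3
          (t (let* ((state (first states))
                    ;; step 4: apply every operator to the current state
                    (generated (loop for op in operators
                                     append (funcall op state)))
                    ;; the duplicate check described above: drop any
                    ;; state we've already explored
                    (new (remove-if (lambda (s)
                                      (member s visited :test #'equal))
                                    generated)))
               ;; step 5: the new states go on the FRONT of the list --
               ;; that's what makes this depth-first
               (state-space (append new (rest states)) goal-state operators
                            (cons state visited))))))

  ;; Toy usage: get from 1 to 11 with "add one" and "double" as the
  ;; operators.  (On a space with no path to the goal this would run
  ;; forever; a real implementation would add a depth bound.)
  ;;
  ;; (state-space '(1) 11
  ;;              (list (lambda (n) (list (1+ n)))
  ;;                    (lambda (n) (list (* 2 n)))))
  ;; => SUCCESS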

Making your search smarter

Searches like the ones we've seen so far are, in a word, dumb. They don't know whether any one next state might be better than any other. These searches can be methodical (e.g., look at the first state on the list) or random (e.g., Calvin's decision: "Arbitrarily, I choose left."). These searches settle for finding the goal state, but they don't care about how many steps it takes to get from the initial state to the goal state.

Usually, however, we don't have time to burn. We'd like to strive to find the goal state in as few steps as we can. That is, we'd like to try to find the "optimal path" from the initial state to the goal state, and we can help ourselves out here if we can put a little more "intelligence" into our search.

One time-honored way of doing this is to find a method to measure the "goodness" of a state -- that is, to determine how close a given state is to the goal state. If we could make that evaluation consistently and correctly, then when we look at a list of states in trying to decide which to use next to generate new states, we could pick the state closest to the goal, instead of just picking the first one we see or picking one at random.

Most of the time, though, such measurements of a state's goodness are just estimates. If the estimate is wrong, you could spend a lot of time and effort searching paths that will never get you to the goal, or at least paths that will give you less-than-optimal solutions. The better your ability to estimate goodness, the better your chance of finding an optimal path. But unless the estimate is always right, there's no guarantee of success. These measures of goodness are one example of something called "heuristics": techniques that aid in the discovery of solutions to problems most of the time, but don't guarantee that they'll lead to solutions all of the time.

A note to be aware of about the term "heuristics": in general, heuristics have the property that they do not make guarantees of success. If they did, they would be algorithms. Heuristics are approaches that trade certainty for certain optimizations, generally optimizations of speed or space. Consider a compiler that used heuristics to parse. Would you want to use a compiler that claimed "most of the time, we compile a lot faster than our competitors"? Or consider various "precompilation" techniques. The heuristic here is "people compile the same things lots of times, so let's give up some space to keep an already-compiled version lying around." Note that if you end up not really compiling the same file often, you lose! All the space dedicated to the "precompiled" parts of the system goes to waste.
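That space-for-speed bet is easy to see in miniature. Below is a tiny Lisp sketch of the same gamble the precompiler makes: remember every answer you compute, on the chance that you'll be asked the same question again. (MEMOIZE is a name made up for this sketch.)

  (defun memoize (fn)
    ;; Return a version of FN that caches every answer it computes.
    ;; The bet: callers repeat arguments often enough to justify the space.
    (let ((cache (make-hash-table :test #'equal)))
      (lambda (&rest args)
        (multiple-value-bind (value found) (gethash args cache)
          (if found
              value                                       ; the bet paid off
              (setf (gethash args cache) (apply fn args)))))))

  ;; If the same arguments never recur, the cache is pure wasted space --
  ;; exactly the losing case described above.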

Heuristics and the 8-tile puzzle

Let's look again at the 8-tile puzzle from the last lecture. There we described a dumb, exhaustive, brute-force, depth-first search for finding the goal state. Could you do better? Probably yes. If you could come up with a way to estimate how close any given arrangement of tiles was to the goal, you could always choose to explore the state that was nearest the goal. To do this, you'd have to figure out a way to codify the metrics for this evaluation in such a way that a computer could use them. One heuristic might be to just count the number of tiles that are in the place they belong. So if your goal state looks like this:

                  1 2 3
                  8   4
                  7 6 5

and your start state followed by the next possible states looks like this:


                  2 8 3
                  1 6 4
                  7   5
                   /|\
                  / | \
                 /  |  \
                /   |   \
               /    |    \
              /     |     \
             /      |      \
            /       |       \
          2 8 3   2 8 3   2 8 3
          1 6 4   1   4   1 6 4
            7 5   7 6 5   7 5

          score   score   score
            3       5       3

Which of these next states is closest to the goal according to our heuristic? The middle state has five tiles in the right place, while the other two states have only three tiles in the right place. So for our next step in the search, we'd choose to generate all the states possible from that middle state. Then we'd apply our evaluation heuristic again, and so on.

Of course, we could get more sophisticated with our heuristic measures. For example, we could try to estimate how many moves it would take to get all the tiles in their appropriate places instead of just counting how many were already in the right place. That might give us a better measure of goodness, or it might just cause us to spend extra time computing the goodness without any real return on the investment, or it might completely mislead the search. We'd have to play with it for a while to see if it would help us.
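In Lisp, the tiles-in-the-right-place heuristic might look like the sketch below, with a board represented as a list of three rows and NIL standing for the blank (that representation is just an assumption made for this example):

  (defun tiles-in-place (board goal)
    ;; Count the tiles (the blank doesn't count) sitting where GOAL
    ;; says they belong.
    (loop for row in board
          for goal-row in goal
          sum (loop for tile in row
                    for goal-tile in goal-row
                    count (and tile (eql tile goal-tile)))))

  (defun best-next-state (states goal)
    ;; The greedy choice: expand whichever state scores highest.
    (first (sort (copy-list states) #'>
                 :key (lambda (s) (tiles-in-place s goal)))))

  ;; With the goal board and the three successor states pictured above:
  ;;
  ;; (defparameter *goal* '((1 2 3) (8 nil 4) (7 6 5)))
  ;; (tiles-in-place '((2 8 3) (1 nil 4) (7 6 5)) *goal*)  => 5
  ;; (best-next-state '(((2 8 3) (1 6 4) (nil 7 5))
  ;;                    ((2 8 3) (1 nil 4) (7 6 5))
  ;;                    ((2 8 3) (1 6 4) (7 5 nil)))
  ;;                  *goal*)
  ;; => ((2 8 3) (1 NIL 4) (7 6 5))    ; the middle state, score 5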

Adversarial or game search -- the sneak preview

For a single agent in a relatively non-hostile world, the search for the path from some start state to some goal state is not especially difficult, and in fact, it's sort of dull. But the world isn't always peaceful -- sometimes there are other agents out there trying to keep you from reaching your goal, and at the same time those other agents are trying to achieve some goal of their own. We see this kind of stuff everywhere: in the executive boardroom, on the athletic field, sometimes even in the classroom. And when we try to model this kind of competitive behavior on a computer, we have to keep in mind that while our computerized "good guy" is going to try to move toward the goal in as optimal a fashion as possible, the computerized "bad guy" is going to do everything it can to keep us from getting there. Thus, the question in a competitive or adversarial situation is no longer "what's my optimal path to the goal?", but is instead "what's my path to the goal when someone else is trying to stop me?"

The fundamental change in the nature of that question results in a fundamental change to the way we do state-space search in adversarial situations, thus giving rise to something called "adversarial search". And since we frequently use this kind of search when we build intelligent game-playing programs, this kind of search is frequently called "game search".

The principle of game search is to first generate the state space (or "game tree") some number of levels deep, where each level corresponds to one player's move (or, more accurately, the set of all moves that the player could possibly make at that point). After generating the state space for that number of levels, the nodes at the bottom level are evaluated for goodness. (In the context of game playing, those nodes are often called "boards", each one representing one possible legal arrangement of game pieces on the game board.)

The estimate of the goodness of a board is a little bit different than before, but not much. Since we have to worry about the opponent, we set up our estimation function so that it returns a spectrum of values, just like before, but now the two extremes are boards that are great for us (i.e., we win) and boards that are great for our opponent (i.e., we lose). We apply our estimation function to those lowest level boards, and propagate the numerical values upward to help us determine which is the best move to make.
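In Lisp, that propagate-the-values-upward idea comes out looking something like the sketch below. (LEGAL-MOVES and ESTIMATE-GOODNESS are hypothetical game-specific helpers you'd have to supply; the next lecture works through the real thing in detail.)

  (defun board-value (board depth our-move-p)
    ;; Look DEPTH moves ahead.  At the bottom of the tree (or when no
    ;; moves remain) fall back on the goodness estimate: big numbers
    ;; are good for us, small numbers are good for the opponent.
    ;; LEGAL-MOVES and ESTIMATE-GOODNESS are game-specific stand-ins.
    (let ((successors (and (plusp depth) (legal-moves board))))
      (if (null successors)
          (estimate-goodness board)
          (let ((values (mapcar (lambda (b)
                                  (board-value b (1- depth) (not our-move-p)))
                                successors)))
            ;; on our move we take the best value; on the opponent's
            ;; move, assume they stick us with the worst
            (if our-move-p
                (reduce #'max values)
                (reduce #'min values))))))

  ;; To pick a move, generate the current board's successors and choose
  ;; the one whose BOARD-VALUE is largest (with OUR-MOVE-P false, since
  ;; the opponent moves next).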

But this is perhaps a bit sketchy. In the next lecture, we'll go into game search in great detail, and we'll introduce a simple game that is sure to drive you bonkers over the next couple of weeks.



Last Modified 8 Feb 95 by Ian Smith (iansmith@cc.gatech.edu)