I worked from Kurt Eiselt's lecture notes on this subject.

Now that we have all this new knowledge about representation in trees, hierarchical structures, networks, and the like, we need some means for exploring these knowledge structures to get at the information we want at the time we want it. How do we do this? The answer is a bunch of techniques which collectively fall under the heading of "search". Search is a concept which permeates computer science. We'll only touch on a couple of kinds of search in this course, but they'll be sufficient to demonstrate the basic difference between brute-force, exhaustive, or "dumb" search and heuristic or "intelligent" search.
We can impose a separate indexing scheme on our file structure, so that we can cut down on some search time. For example, we could apply a binary search mechanism to look for an employee record in a file. If the employee's name starts with a letter in the range A-M, we could start the search at the beginning of the file, but if the name starts with a letter in the range N-Z, we would start the search at approximately the midway point in the file. We could continue to divide the big groups into smaller groups, until eventually the time to find a single record is governed not by the behavior of linear search but by the behavior of binary search. There are other indexing mechanisms that we could use, such as hashing functions, that would give us different kinds of advantages.
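The halving idea above can be sketched in a few lines of Python. (The employee list here is hypothetical, and a real file index would be more elaborate; this just shows the repeated division into smaller groups.)

```python
# A minimal sketch of binary search over an alphabetically
# sorted list of employee names (hypothetical data).
def binary_search(names, target):
    """Return the index of target in the sorted list names, or -1."""
    lo, hi = 0, len(names) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # probe the midway point
        if names[mid] == target:
            return mid
        elif names[mid] < target:
            lo = mid + 1           # target is in the later half
        else:
            hi = mid - 1           # target is in the earlier half
    return -1

employees = ["Barney", "Betty", "Fred", "Pebbles", "Wilma"]
print(binary_search(employees, "Fred"))   # → 2
print(binary_search(employees, "Dino"))   # → -1
```

Each probe halves the remaining group, so finding one record among n takes about log2(n) comparisons rather than n.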
                   Rocky
          has-mom /     \ has-dad
                 v       v
           Pebbles       Bam-Bam
  has-mom /   \ has-dad   has-mom /   \ has-dad
         v     v                 v     v
     Wilma     Fred          Betty     Barney
In structures like this, as before, we may want to search for
useful information. But structures like this, unlike linear
file structures, make it easier to search for the answers to
questions like "What's the relationship of Barney to Rocky?"
or "Who is Rocky's grandfather on his mother's side?"
df-search
preorder
If you can adapt this procedure to a-lists, and you can figure out how to avoid searching infinitely through cycles in a network, then you've pretty much got the associated homework assignment wired.
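The df-search procedure itself isn't reproduced here, but the idea can be sketched in Python. (The course used Lisp a-lists; a dict of adjacency lists stands in for them below, and the family tree above supplies hypothetical data.) Note the visited set, which is exactly the cycle-avoidance trick the homework asks for:

```python
# A sketch of preorder depth-first search over a graph given as
# adjacency lists, with a visited set to avoid looping on cycles.
def df_search(graph, start, goal, visited=None):
    """Return a preorder path from start to goal, or None."""
    if visited is None:
        visited = set()
    if start in visited:       # already expanded: don't cycle
        return None
    visited.add(start)
    if start == goal:
        return [start]
    for neighbor in graph.get(start, []):
        path = df_search(graph, neighbor, goal, visited)
        if path is not None:
            return [start] + path
    return None

family = {
    "Rocky":   ["Pebbles", "Bam-Bam"],
    "Pebbles": ["Wilma", "Fred"],
    "Bam-Bam": ["Betty", "Barney"],
}
print(df_search(family, "Rocky", "Barney"))
# → ['Rocky', 'Bam-Bam', 'Barney']
```

The search goes as deep as it can down the first branch (Pebbles' side) before backing up and trying the other branch, which is what makes it depth-first.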
At a very low level, the state of a process is described by the values of the arguments being passed, the instruction being executed, and if you're programming with side effects, the bindings of variables to values. (Obviously, it's easier to describe the state of a process if you don't have to worry about side effects, as there's just that much less to keep track of.) However, thinking about state at this low level becomes very tedious very quickly. So, we might be better off using a higher-level abstraction in thinking about the state of a process. Consider, for example, a program to solve the 8-tile puzzle. Instead of thinking in terms of which instruction is being executed, the values bound to arguments, and so on, we can look at the process in terms of the state of the puzzle itself. Thus, the initial state of the process would be the initial state of the puzzle. Say the initial state looks like this:
2 8 3
1 6 4
7   5
We could move any of three tiles, the 7, the 6, or the 5, to
generate the three possible next states from this one:
              2 8 3
              1 6 4
              7   5
             /  |  \
            /   |   \
           /    |    \
      2 8 3   2 8 3   2 8 3
      1 6 4   1   4   1 6 4
        7 5   7 6 5   7 5
If we then choose, say, the lower leftmost state of those
three new states, and generate the two possible next states
from that one, we get this:
              2 8 3
              1 6 4
              7   5
             /  |  \
            /   |   \
           /    |    \
      2 8 3   2 8 3   2 8 3
      1 6 4   1   4   1 6 4
        7 5   7 6 5   7 5
       /  \
      /    \
 2 8 3      2 8 3
   6 4      1 6 4
 1 7 5      7   5
Note that one of these new states is just a repeat of the
initial state. We wouldn't want to explore that direction
any further, because we'd just be doing work we've already
done. Now, if we think of the movement of tiles as the significant operations in this process, we can describe the history of the process in terms of puzzle boards and the operations necessary to get from one board to the next. And since the nature of the operations in this case is such that at most four new boards can be generated from any given board, we can safely say that the current behavior of the process depends on its history--the process couldn't have been in the current state without having just been in one of a very few previous states.
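To make the operations concrete, here is a minimal Python sketch of successor generation. I'm assuming a board representation the notes don't specify: a 9-tuple read left to right, top to bottom, with 0 standing for the blank.

```python
# A sketch of successor generation for the 8-tile puzzle.
# A board is a 9-tuple (row by row), with 0 for the blank.
def successors(board):
    """Return the boards reachable by sliding one tile into the blank."""
    result = []
    i = board.index(0)                   # locate the blank
    row, col = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:    # stay on the board
            j = 3 * r + c
            new = list(board)
            new[i], new[j] = new[j], new[i]   # slide the tile
            result.append(tuple(new))
    return result

start = (2, 8, 3,
         1, 6, 4,
         7, 0, 5)
print(len(successors(start)))   # → 3 (the 7, the 6, or the 5 can move)
```

Because the blank has at most four neighbors, no board ever has more than four successors, which is why so few previous states could have led to any given state.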
If we keep applying operations (i.e., moving tiles) to the leftmost board in the tree, we're going to get a depth-first search. But we're not searching some pre-existing data structure; instead we're searching something that's being "built" as the program executes. This something is called a "state space" (or a "problem space"), and our hypothetical 8-tile program is performing a "state-space search" by following a depth-first search algorithm.
A state-space is defined as the set of all possible states generated by the repeated application of a finite set of operations to some initial state. In performing a state-space search, the intention is usually to find a sequence of operations that gets one from the initial state to some goal state. In the case of the 8-tile puzzle, that goal state might be:
1 2 3
8   4
7 6 5
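Under the same assumed board representation (a 9-tuple with 0 for the blank), a depth-first state-space search might look like the sketch below. An explicit stack stands in for recursion, since a depth-first path through this space can get far deeper than Python's recursion limit, and the parent map doubles as the visited set that keeps us from re-exploring repeated states:

```python
# A sketch of depth-first state-space search for the 8-tile
# puzzle: boards are generated on the fly, never pre-built.
def successors(board):
    """Boards reachable by sliding one tile into the blank (0)."""
    result = []
    i = board.index(0)
    row, col = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            j = 3 * r + c
            new = list(board)
            new[i], new[j] = new[j], new[i]
            result.append(tuple(new))
    return result

def dfs(start, goal):
    """Return a sequence of boards from start to goal, or None."""
    parent = {start: None}      # doubles as the visited set
    stack = [start]
    while stack:
        board = stack.pop()
        if board == goal:
            path = []           # walk parent links back to start
            while board is not None:
                path.append(board)
                board = parent[board]
            return path[::-1]
        for nxt in successors(board):
            if nxt not in parent:   # skip repeated states
                parent[nxt] = board
                stack.append(nxt)
    return None

start = (2, 8, 3, 1, 6, 4, 7, 0, 5)
goal = (1, 2, 3, 8, 0, 4, 7, 6, 5)
path = dfs(start, goal)
print(path is not None)   # → True (a path to the goal exists)
```

Depth-first search will find some sequence of moves, though typically a very long and roundabout one; finding short solutions is what the heuristic searches later in the course are for.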
Why generate the state space at run-time, and not just have
it all built in advance? For some applications, that might
not be much of a problem. For example, in the 8-tile puzzle,
the number of different ways to arrange the tiles isn't
overwhelming. On the other hand, if you were working on a
program that could play a decent game of chess, and you
wanted to pre-build a data structure comprising
all possible boards, you'd want to make sure that you set
aside a little disk space to store the approximately 10^120
(i.e., 1,000,000,000,000,000,000,000,000,000,000,000,000,000,
000,000,000,000,000,000,000,000,000,000,000,000,000,000,
000,000,000,000,000,000,000,000,000,000,000,000,000)
boards that are possible. Or maybe you'd be better off
writing your program to generate just those boards that were
relevant to the specific chess game it was playing at that
particular time, and not worry about the rest of them.