WebSnatcher: WWW Prefetching and Caching
Students: Maria Gullickson, Catherine Eiccholz
"Using Experience to Guide Web Server Selection," Maria L. Gullickson,
Catherine E. Eiccholz, Ann L. Chervenak, and Ellen W. Zegura, to
appear in Multimedia Computing and Networking, January 1999.
One obvious use for the massive storage provided by the Personal
Terabyte is to prefetch and cache data that the user is likely to
need, so that later accesses avoid network delays.
To experiment with prefetching data on the World Wide Web, we have
written an application called WebSnatcher. A WebSnatcher user creates
a profile reflecting his or her interests. This profile is composed
of web site locations, bookmark files from Netscape, and keywords.
WebSnatcher initiates searches based on the user's specified keywords
on up to six commercial search engines, including Yahoo and Alta
Vista. After the searches are complete, WebSnatcher prefetches the
pages that best match the keywords of the query. WebSnatcher also
prefetches any other pages specified in the user profile. The
prefetching results are stored in a directory on the user's local file
system, providing fast display of the data without incurring network
delays and allowing indexing of the search results.
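The prefetching flow described above can be illustrated with a minimal Python sketch. It is not WebSnatcher's actual code: the names Profile, keyword_score, select_pages, cache_path, and prefetch are hypothetical, the scoring rule (counting keyword occurrences in a result summary) is an assumed stand-in for "best match", and the search engines and network fetch are replaced by plain inputs so the example runs offline.

```python
import hashlib
from dataclasses import dataclass, field
from pathlib import Path
from urllib.parse import urlparse


@dataclass
class Profile:
    """Hypothetical user profile: explicit site URLs plus search keywords."""
    sites: list = field(default_factory=list)
    keywords: list = field(default_factory=list)


def keyword_score(text, keywords):
    """Count occurrences of the profile keywords in the text (case-insensitive)."""
    lowered = text.lower()
    return sum(lowered.count(k.lower()) for k in keywords)


def select_pages(profile, search_results, limit=10):
    """Choose URLs to prefetch: best keyword matches first, then the profile's sites.

    search_results maps URL -> summary text, as a search engine might return it.
    """
    scored = [(keyword_score(summary, profile.keywords), url)
              for url, summary in search_results.items()]
    # Sort by descending score, breaking ties by URL; drop pages with no match.
    ranked = [url for score, url in sorted(scored, key=lambda p: (-p[0], p[1]))
              if score > 0]
    chosen = ranked[:limit]
    for url in profile.sites:
        if url not in chosen:
            chosen.append(url)
    return chosen


def cache_path(cache_dir, url):
    """Map a URL to a stable file name inside the local cache directory."""
    host = urlparse(url).netloc or "unknown-host"
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    return Path(cache_dir) / host / f"{digest}.html"


def prefetch(urls, cache_dir, fetch):
    """Store each page under cache_dir; fetch is any callable mapping URL -> bytes."""
    stored = {}
    for url in urls:
        target = cache_path(cache_dir, url)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(url))
        stored[url] = target
    return stored
```

In real use, the fetch argument could wrap an HTTP client; passing it in keeps the selection and storage logic independent of the network, which is also what makes the local copies easy to index afterward.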
WebSnatcher software and a Georgia Tech technical report describing
its design (GIT-CC-98-01) are available on the WebSnatcher Home
Page.
The latest version of WebSnatcher includes anycasting networking
technology. Ellen Zegura's anycasting work involves choosing one of a
set of equivalent servers on the network to satisfy a particular
request, with the choice of server made based on past performance. We
incorporate a variation of anycasting in the WebSnatcher application,
and study more than a dozen different algorithms for choosing a server
from a set of equivalent servers to handle a request from WebSnatcher.
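As a rough illustration of choosing a server based on past performance, the sketch below keeps a smoothed response-time estimate per server and picks the lowest one. This is only one possible policy and is not claimed to be among the algorithms studied in the project: the ServerSelector class, the exponentially weighted average, and the alpha parameter are all assumptions made for this example.

```python
class ServerSelector:
    """Pick among equivalent servers using smoothed past response times.

    Hypothetical policy: an exponentially weighted moving average per server.
    """

    def __init__(self, servers, alpha=0.3):
        self.servers = list(servers)
        self.alpha = alpha      # weight given to the newest measurement
        self.estimate = {}      # server -> smoothed response time in seconds

    def record(self, server, seconds):
        """Fold a newly observed response time into the server's estimate."""
        old = self.estimate.get(server)
        if old is None:
            self.estimate[server] = seconds
        else:
            self.estimate[server] = self.alpha * seconds + (1 - self.alpha) * old

    def choose(self):
        """Return an untried server if any remain, else the lowest estimate."""
        for server in self.servers:
            if server not in self.estimate:
                return server
        return min(self.servers, key=lambda s: self.estimate[s])
```

A caller would invoke choose() before each request and record() after it completes. Trying each server once before comparing estimates is a simple way to seed the history; a policy that never revisits slow servers can miss later improvements, which is one reason comparing several selection algorithms is worthwhile.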
Using a mechanism like anycasting in the context of WebSnatcher is
important for two reasons. First, it allows users to get better
interactive performance when fetching data that will be accessed soon
by picking a server that has historically provided good performance.
Second, prefetching by large numbers of people could generate large
amounts of network traffic. A mechanism like anycasting would allow
individuals to do such prefetching more responsibly, avoiding
heavily loaded network paths and servers.