DRAFT
The Information Mural
Dean F. Jerding and John T. Stasko
Graphics, Visualization, and Usability Center
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332-0280
{dfj,stasko}@cc.gatech.edu
Overview article
March 1996
Abstract
Information visualizations must allow users to browse information
spaces and focus quickly on items of interest. Being able to see some
representation of the entire information space provides an initial gestalt
overview and gives context to support browsing and search tasks. However,
the limited number of pixels on the screen makes it difficult to completely
display large information spaces. The Information Mural is a
two-dimensional, reduced representation of an entire information space that
fits entirely within a display window or screen. The mural creates a
miniature version of the information space using visual attributes such
as grayscale shading, intensity, color, and pixel size, along with
anti-aliased compression techniques. Information Murals can be used as
stand-alone visualizations or in global navigational views.
Keywords: information visualization, software visualization, data visualization
1 Information Murals
Although large quantities of information are becoming available
on-line, the information itself is useless without effective display
and access mechanisms. Effective presentations of the information
must be created using only the limited number of pixels on the screen.
The Information Mural technique allows 2D visual representations of
large information spaces to be created even when the number of informational
elements greatly outnumbers the available pixels. Current methods
for depicting such large information spaces typically utilize abstraction,
overplotting, or sampling to create a view of the entire space. However,
all of these techniques result in a loss of information that might be useful
to the observer.
The goals of our technique can be summarized as follows:
- Create a representation of an entire (large) information space that fits
completely within a display window or screen.
- Mimic what the original visual representation of the information would
look like if it could be viewed in its entirety, i.e., containing the same
visual patterns.
- Minimize the loss of information in the compressed view.
Information Murals allow global views of large information spaces to be
constructed. Such contextual information directly supports analytical and
navigational tasks that a user performs while interacting with informational
displays. There are several different types of information spaces
which could be represented using information murals:
- A text file or document usually does not fit entirely on the screen,
because its vertical dimension far exceeds its horizontal dimension.
Typically, a text editor displays only a portion of the file being edited.
- Graphs of data often require some compression technique to fit
on the screen. Scaling and rounding of data values is often necessary to
draw the entire graph. Other alternatives are to display an average
of the data values, or only a subset of the data.
- Program visualizations often span many computer screens if laid
out completely. This is especially true for those views where one
dimension corresponds to time.
- Images might be represented using Information Murals. Although
an image usually fits on a screen, it is often desirable to change the
size of the image. As an image is shrunk, information in the image is
inevitably lost.
1.1 Technique Overview
Imagine some visual representation of
a large information space, made up of distinct elements each with
their own representation. An Information Mural of this information is to
fit in some area of i x j pixels; assume there is a ``bin''
associated with each pixel. The position of each information element is
first scaled to fit into the available space. As each element is ``drawn''
in the mural using an imaginary pen, different amounts of ``ink'' fall into
different bins, in a manner similar to anti-aliasing strategies in computer
graphics. As each subsequent element is drawn, the amount of ink will build
up in different bins, depending on the amount of overlap of the elements.
The resulting Information Mural is created by mapping the amount of
ink in each bin (the information density) to some visual attribute. In a
grayscale mural, the shade of each pixel corresponds proportionally to the
amount of ink in each bin. Instead of using grayscale variation, an
equalized intensity variation over the entire color scale can also be used.
With the ``raindrop'' mural, the amount of ink in each bin makes a ``puddle''
centered around that pixel, so pixels with more ink will appear
larger. Color can then be added to the mural to convey other
attributes of the informational elements, while still preserving the
density mapping.
Information Murals of certain information spaces may be inappropriate.
The distribution of information in the original image may be such that
a useful information mural cannot be created. For example, a
grayscale mural showing a graph of a symmetric function with a short
period will be a dark bar with a thickness equal to the amplitude of
the data.
1.2 Algorithm
The basic algorithm for creating an Information Mural is listed below.
The algorithm takes an image of M x N elements and scales it into a
mural of I x J pixels. In addition to the data structures which
store the information, the algorithm requires an I x J array of
floats. The algorithm listed below does not handle attribute colors.
1) for each i,j set mural_array[i][j] to zero
2) for each element m,n of information
   a) compute x = m / M * I, y = n / N * J
   b) determine the proportion of this point that lies in each of
      the four surrounding mural_array entries (totals to 1.0):
        mural_array[floor(x)][floor(y)]
        mural_array[floor(x)][ceil(y)]
        mural_array[ceil(x)][floor(y)]
        mural_array[ceil(x)][ceil(y)]
   c) add each of the proportions determined in the previous step to the
      existing values of each corresponding mural_array entry
      i) update max_mural_array_value to keep track of the
         maximum mural_array[][] value
3) for each i,j in the mural_array
   a) map the value mural_array[i][j] / max_mural_array_value
      to a grayscale or color intensity varying scale, or to pixel size,
      depending on the type of mural being created
   b) color and draw the pixel at i,j of the mural based on mapping
      computed in the previous step
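As a rough illustration, steps 1 through 3a of the listing above might be sketched in Python as follows. The function names accumulate_ink and normalize, the list-of-lists representation, and the clamping of the upper neighbor at the mural edge are assumptions made for this sketch and are not taken from the original implementation. The maximum is computed after accumulation rather than tracked incrementally, which yields the same value.

```python
import math


def accumulate_ink(points, M, N, I, J):
    """Spread each (m, n) point over its four surrounding bins.

    Assumes 0 <= m < M and 0 <= n < N. Returns an I x J list of floats.
    """
    mural = [[0.0] * J for _ in range(I)]              # step 1
    for m, n in points:                                # step 2
        x, y = m / M * I, n / N * J                    # step 2a
        x0, y0 = math.floor(x), math.floor(y)
        fx, fy = x - x0, y - y0                        # fractional offsets
        # Assumption: clamp the upper neighbor so edge points stay in range.
        x1, y1 = min(x0 + 1, I - 1), min(y0 + 1, J - 1)
        # Steps 2b and 2c: the four weights total 1.0 for each point.
        mural[x0][y0] += (1 - fx) * (1 - fy)
        mural[x0][y1] += (1 - fx) * fy
        mural[x1][y0] += fx * (1 - fy)
        mural[x1][y1] += fx * fy
    return mural


def normalize(ink):
    """Step 3a: divide every bin by the maximum bin value."""
    peak = max(max(row) for row in ink)
    if peak == 0:
        return [[0.0 for _ in row] for row in ink]
    return [[v / peak for v in row] for row in ink]
```

The normalized values in [0, 1] can then be mapped to a shade, a color, or a pixel size, depending on the type of mural.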
For improved efficiency, steps 2b and 2c can be replaced by a
single step which adds 1.0 to mural_array[floor(x)][floor(y)] and
updates the max_mural_array_value. This avoids having
to compute several floors and ceilings and the proportions lying in each
surrounding pixel, but it eliminates the anti-aliasing
aspect of the mural. For many applications, the gain in performance from
using the aliased mural outweighs any slight changes in appearance.
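The aliased shortcut might look like the following sketch. The name accumulate_ink_aliased is hypothetical, and coordinates are assumed to satisfy 0 <= m < M and 0 <= n < N, so that truncation with int() behaves like floor.

```python
def accumulate_ink_aliased(points, M, N, I, J):
    """Add 1.0 to the single bin containing each point (no anti-aliasing)."""
    mural = [[0.0] * J for _ in range(I)]
    max_value = 0.0
    for m, n in points:
        i = int(m / M * I)      # floor(x) for non-negative x
        j = int(n / N * J)      # floor(y) for non-negative y
        mural[i][j] += 1.0
        max_value = max(max_value, mural[i][j])
    return mural, max_value
```

Each point costs one addition and one comparison, at the price of dropping the sub-pixel weighting.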
We considered two alternative ways attribute colors could be added to
an Information Mural. Before
discussing the positives and negatives of each approach, it should be
recognized that bandwidth limitations imposed by each pixel mean that
the mural may not be able to show attribute colors for every piece of
data. For example, if the mural compresses 50 points into the same
pixel, 5 of which are blue, 13 red, 6 yellow, and so on, how should
that pixel be rendered? It does not make sense to mix RGB values,
because an observer might be confused if equal parts of red
and green data values make a yellow pixel. Thus, we choose to color each
pixel according to the attribute color that occurs most frequently at
that point in the mural.
One way to compute this would be to keep track of the intensity for
each color separately, requiring a mural_array of floats for each
different attribute color. Note that just keeping a red, green, and
blue array would not work, because colors should not be mixed for the reason
mentioned above. Besides the large space requirements, another
problem is which maximum intensity value should be used to
compute the resulting shade. The maximum for the resulting pixel
color? The maximum of all colors? The only way that really makes
sense is to treat the intensity at each pixel uniformly (cumulatively
over all colors), and compute the mapping with respect to the maximum
of those intensities as is done in the basic algorithm.
This leads to the alternative for computing attribute colors that we
have chosen to implement. To reduce space requirements, a single
mural_array of floats is used to keep track of overall
information density at each pixel. A list of shorts, one for each
possible attribute color, is kept with each mural_array entry to
record how many points of each attribute color have been drawn. The
tradeoff here is that in keeping a single intensity value and a count
of colors, we could end up with an inaccurate reflection of exactly
how much of the intensity is due to each color. For example, five blue
points of 0.1 intensity and one red point of 1.0 intensity would
result in a blue pixel of 1.5 intensity, even though red contributes
two-thirds of that intensity. This problem only arises
in building an anti-aliased mural, because when anti-aliasing
is not done all of the points are drawn with the same intensity.
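The tradeoff can be made concrete with a small sketch of one mural bin that keeps a single intensity value plus a count per attribute color. The class name MuralBin and the use of collections.Counter in place of the list of shorts are assumptions for illustration only. How ties between colors should be broken is not stated in the text, so this sketch leaves it to Counter.

```python
from collections import Counter


class MuralBin:
    """One mural_array entry: total ink plus a count per attribute color."""

    def __init__(self):
        self.ink = 0.0
        self.color_counts = Counter()

    def add(self, weight, color):
        # Intensity accumulates over all colors; counts ignore the weight.
        self.ink += weight
        self.color_counts[color] += 1

    def dominant_color(self):
        """Return the most frequent attribute color, or None if empty."""
        if not self.color_counts:
            return None
        return self.color_counts.most_common(1)[0][0]


# The example from the text: five blue points at 0.1 and one red point at 1.0.
pixel = MuralBin()
for _ in range(5):
    pixel.add(0.1, "blue")
pixel.add(1.0, "red")
```

Here pixel.dominant_color() is "blue" and pixel.ink is about 1.5, which reproduces the inaccuracy described above.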
2 Applications
Information Murals can be used as global views of information spaces.
Without a good visual representation, a global view cannot serve
as an effective navigation tool. Furthermore, the usefulness of a
visualization tool often depends on the effectiveness of its navigation
capabilities: Can the user navigate quickly to locate an area of particular
interest? Used as a background in a navigational widget, murals provide
informational context to support panning and zooming of more detailed focus
views. By adding panning and zooming within the global view itself, an
Information Mural can be used as a stand-alone visualization.
Below are some snapshots from visualization applications we have built
using Information Murals. These applications contain many different forms of
information, from software to data to text documents, some of which
were mentioned in [JS95b].
2.1 Software Visualization
The Information Mural technique originated in our software
visualization research into visualization of object-oriented (OO) program
executions[JS95a]. Murals are currently being used in a suite of
views to support program understanding during design recovery,
validation, and reengineering tasks.
2.1.1 Object-Oriented Message Traces
Imagine an event trace diagram for object-oriented message sequences
turned on its side, such that classes are assigned rows on the
vertical axis and a message from one class to another is drawn as a
vertical line connecting the source and destination classes. The
horizontal axis then represents time, or the sequence of messages.
Now imagine that you could see an event trace diagram of an entire
program execution, which might contain hundreds of thousands of
messages. Figure 1a is a grayscale, aliased
Information Mural of a message trace from a bubble sort algorithm
animation built using the Polka toolkit[SK93], containing
around 20 classes on the vertical axis and over 50,000 messages on the
horizontal. Drawing this image in a window 500 pixels wide results in
a horizontal information compression ratio of over 100:1. For
comparison, the same representation without the mural technique is
shown in Figure 1b.
(a)
(b)
Figure 1: (a) Mural of object-oriented message trace of over 50,000
messages, drawn in an area 500 pixels wide. (b) Same diagram drawn by just
overplotting (without the mural technique).
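To illustrate the difference between the two panels, the sketch below accumulates message density per pixel column and, for comparison, derives what plain overplotting would retain. It assumes one pixel row per class, messages given as hypothetical (source_row, destination_row) pairs in sequence order, and aliased binning. None of these names come from the original tool.

```python
def message_trace_mural(messages, n_classes, width):
    """Accumulate ink for vertical message lines compressed into `width` columns."""
    total = len(messages)
    bins = [[0.0] * n_classes for _ in range(width)]
    for k, (src, dst) in enumerate(messages):
        col = k * width // total                 # scale sequence index to a column
        for row in range(min(src, dst), max(src, dst) + 1):
            bins[col][row] += 1.0                # the line covers every row it spans
    return bins


def overplot(bins):
    """What overplotting keeps: whether a pixel was touched, not how often."""
    return [[1.0 if v > 0 else 0.0 for v in col] for col in bins]
```

With many messages per column, overplot() saturates quickly, while the accumulated values still distinguish heavily used class pairs from rarely used ones.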
One of the views from our prototype OO program visualization suite is
called the Execution Mural (Figure 2). This view
is used to examine message traces from object-oriented
programs[JS95a]. The upper portion of the view is the focus area
where a subset of the messages can be examined in detail. The bottom
portion of the view is a navigational area that includes a mural of
the entire message trace, and a navigation rectangle indicating where
the focus area fits in the entire execution. Notice that the color of
several different messages has been set in the focus area. The
Information Mural technique allows the coloring of information
attributes using shaded color scales, as is evident from the colored
areas in the mural. The mural gives a quick insight into various
phases in the execution, including very repetitive patterns. The
coloring allows the location of particular messages in the execution
to be identified.
Figure 2: Execution Mural view of bubble sort algorithm animation
built using the Polka animation toolkit.
2.1.2 Parallel Processor Message Passing
Visualizations of the message passing during executions of programs on
parallel architectures become very unclear when long durations of time
are shown. The aliased mural of Figure 3a shows the kernel
integer sort benchmark executing on 16 processors. Each processor is
assigned a row on the vertical axis, and a message is drawn as a line
from one processor to another at the appropriate time coordinates.
This particular view uses wall clock timestamps. As is evident from
the traditional representation shown in Figure 3b, the
mural reveals message density that plain overplotting hides.
(a)
(b)
Figure 3: (a) Mural of parallel program message trace. (b) Same
diagram drawn by just overplotting (without the mural technique).
As was done in the Execution Mural view, a mural can be used in the
background of a global overview to allow more detailed examination of the
message passing. Figure 4 shows the same message trace, this
time with messages colored according to message type. The global
overview provided by the mural gives an immediate indication of the
phases and sub-phases of the algorithm, as well as showing anomalies
such as network blockage or processors waiting for others to
complete.
Figure 4: View of message passing in kernel integer sort parallel
processor benchmark, with focus area and global overview created using
the Information Mural technique.
2.2 Data Visualization
The Information Mural technique is useful for revealing the underlying
density of data while viewing very large data sets. Traditional
plotting techniques typically overplot points that happen to lie in
the same pixel. Our technique shows the actual density of the
information. Incorporated into a data visualization, murals can
support one- or two-dimensional navigation through large data spaces.
2.2.1 Sun Spots
Astronomers have been recording the number of sun spots since the
1700s. Because this is such a large dataset, it is typically plotted
by showing the monthly averages. Figure 5 is a plot
of the average number of sun spots per month recorded from 1850-1993.
This data was obtained from the StatLib
server at Carnegie Mellon University.
Figure 5: Plot of average number of sun spots recorded per month, 1850-1993.
Using the Information Mural technique, we do not have to worry about
the size of the dataset. Figure 6 shows an anti-aliased mural of the
number of sun spots recorded daily from 1850-1993, over 52,000
readings. Instead of using grayscale to depict density, a color scale
which goes from dark blue (lowest data density) to bright white
(highest data density) is used because it is easier to see outliers
using color.
Figure 6: Mural of the number of sun spots recorded daily, 1850-1993.
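A density-to-color mapping of this kind could be sketched as a linear blend between two endpoint colors. The exact scale used for the figure is not given in the text, so the RGB endpoints below and the function name density_to_rgb are assumptions chosen only to illustrate a dark blue to bright white ramp.

```python
def density_to_rgb(t, low=(0, 0, 96), high=(255, 255, 255)):
    """Map a normalized density t in [0, 1] to an (r, g, b) tuple.

    low and high are hypothetical endpoint colors: dark blue for the
    lowest density and bright white for the highest.
    """
    t = min(max(t, 0.0), 1.0)                      # clamp out-of-range input
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))
```

The input t is the bin value divided by the maximum bin value, as in step 3a of the algorithm.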
Plotting statistics such as averages is commonly done to analyze large
amounts of data. However, in the monthly view we do not see the band
of "missing" values between zero and about 10, nor do we notice that a
large number of zero values were recorded (bright spots at bottom of
Figure 6).
With the stand-alone Information Mural views, it is also possible to
incrementally zoom in on sections of the mural or to sweep out a
rectangle to zoom. Figure 7 shows the sun spot
mural zoomed in on a small area. Figure 8 shows
how the mural of the entire data set can be placed in the background
of a slider, giving context to a more detailed view of the data.
Figure 7: Mural of the number of sun spots recorded daily, 1850-1993,
zoomed in on a small area.
Figure 8: View of sun spots showing focus area and mural of entire
data set at the bottom.
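One plausible way to implement such a zoom is to rebuild the mural from only those points that fall inside the swept rectangle, mapping the rectangle onto the full window. The sketch below assumes aliased binning and half-open ranges. The name zoom_mural and its parameters are hypothetical.

```python
def zoom_mural(points, x_range, y_range, I, J):
    """Re-bin the points inside a data-space rectangle into an I x J mural."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    mural = [[0.0] * J for _ in range(I)]
    for m, n in points:
        if x_lo <= m < x_hi and y_lo <= n < y_hi:
            # Scale relative to the rectangle, not the whole data space.
            i = int((m - x_lo) / (x_hi - x_lo) * I)
            j = int((n - y_lo) / (y_hi - y_lo) * J)
            mural[i][j] += 1.0
    return mural
```

Because the density is recomputed for the new extent, points that shared a pixel in the overview can separate into distinct pixels after zooming.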
2.2.2 River Flow Data
Another interesting large data set is the mean daily Saugeen river
flows, from Jan 1, 1915 to Dec 31, 1979. The anti-aliased mural of
this data shows a periodic pattern, with concentrations at the lower
values. Some bright spots occur seemingly randomly in the lower portion of the
mural shown in Figure 9a. Zooming in on a small area at
the bottom, we find that the bright spots in the mural are due to
single values that occur repetitively (Figure 9b). We
hypothesize that these might be weeks or months in the data where a
single value was extrapolated across the entire period to create the
daily values.
(a)
(b)
Figure 9: (a) Mural of the mean daily river flow rates of the Saugeen
river, 1915-1979. (b) Part (a) zoomed on small area at the bottom of
the mural.
2.2.3 Automobile Data
The Information Mural technique can be used to create parallel
coordinate data displays. A data set from the Committee on
Statistical Graphics of the American Statistical Association (ASA)
Second (1983) Exposition of Statistical Graphics Technology contains
406 observations on the following 8 variables: MPG (miles
per gallon), number of cylinders, engine displacement (cu. inches),
horsepower, vehicle weight (lbs.), time to accelerate from 0 to 60 mph
(sec.), model year (modulo 100), and origin of car (1. American,
2. European, 3. Japanese). Figure 10a shows a parallel
coordinate mural of a subset of the data, including MPG, displacement,
horsepower, weight, acceleration, and model year. Part (b) of
Figure 10 shows the standard parallel coordinate view without
the mural. In Figure 10c, color has been overlaid on the
mural according to the number of cylinders attribute. Notice how the
data tuples with fewer cylinders tend to have higher MPG, smaller
displacement, less horsepower, and longer acceleration times.
(a)
(b)
(c)
Figure 10: (a) Mural of a parallel coordinate view of automobile data
showing MPG, engine displacement, horsepower, weight, acceleration, and
model year (1970-1982). (b) Standard parallel coordinate view of the
data. (c) Color overlaid for number of cylinders (3 = red, 4 =
orange, 5 = yellow, 6 = green, 8 = cyan).
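A parallel coordinate mural can be approximated by sampling each tuple's polyline once per pixel column and accumulating the samples. This is only a sketch under several assumptions: values are already normalized to [0, 1] per axis, binning is aliased, and the name parallel_coord_mural is hypothetical. Steep segments leave vertical gaps with this sampling, which a fuller implementation would fill.

```python
def parallel_coord_mural(rows, width, height):
    """Accumulate polyline density; bins[x][y] counts lines crossing that pixel.

    rows: tuples of per-axis values in [0, 1], each with at least two axes.
    width must be at least 2.
    """
    bins = [[0.0] * height for _ in range(width)]
    for row in rows:
        n_axes = len(row)
        for x in range(width):
            pos = x / (width - 1) * (n_axes - 1)   # position measured in axes
            a = min(int(pos), n_axes - 2)          # index of the left-hand axis
            frac = pos - a
            v = row[a] + (row[a + 1] - row[a]) * frac
            y = min(int(v * height), height - 1)   # keep v == 1.0 in range
            bins[x][y] += 1.0
    return bins
```

Regions where many tuples follow the same path accumulate high counts, which is what distinguishes the mural from the standard overplotted view.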
2.3 Information Visualization
Many other forms of information can be displayed using Information
Murals. Two such applications, geographic data and text documents,
are described below.
2.3.1 Geographic Information
The U.S. Census Bureau creates maps of various census statistics such
as population distributions. While their techniques work well for
wall-sized maps, the overwhelming scale reduction to display the
information on a computer screen causes their algorithm to produce
inaccurate results. The Information Mural technique computes
information density automatically, making the display of a population
density map on a computer screen almost trivial (Figure 11). The data was
obtained from the Tiger Mapping Service U.S. Places File, created from
the Census file STF-1A.
Figure 11: Mural of population density distribution, using data from
the 1990 census.
2.3.2 Text Documents
While SeeSoft[ESJ92] from AT&T's Bell Laboratories introduced a
revolutionary miniature representation for text documents, it did have
a limit. One row of pixels (or part of a row in later versions) was
required for every line in the file. The Information Mural technique
can go beyond this limit, allowing many lines in a file to map to a
single row of pixels in the miniature representation. On top of a
grayscale mural representation of a document, color can be used to
indicate attributes of the text, such as comments, sections, or
keywords.
Figure 12 is a sample text editor with a mural in the
background of the scrollbar. Color is used to indicate sections in
the LaTeX document being browsed. The mural is constructed by
examining the position of each character in the file, scaling that
position into the scrollbar, and mapping the resulting density of
characters to the intensity scale.
Figure 12: Text editor containing LaTeX document. Mural of the entire
file is shown in the background of the scrollbar, with text shaded
according to section.
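The construction of the scrollbar mural can be sketched as follows. Each character's (line, column) position is scaled into a width x height pixel area and counted. Skipping whitespace characters, using integer (aliased) scaling, and the name text_mural are assumptions of this sketch.

```python
def text_mural(text, width, height):
    """Count characters per pixel; bins[y][x] holds the density at row y, column x."""
    bins = [[0] * width for _ in range(height)]
    lines = text.splitlines()
    if not lines:
        return bins
    n_lines = len(lines)
    max_cols = max(len(line) for line in lines) or 1   # avoid dividing by zero
    for row, line in enumerate(lines):
        y = row * height // n_lines                    # many lines share one pixel row
        for col, ch in enumerate(line):
            if ch.isspace():
                continue                               # assumption: blanks add no ink
            x = col * width // max_cols
            bins[y][x] += 1
    return bins
```

Dividing each count by the largest count gives the value that is mapped to the intensity scale.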
Information Murals can also be used to visualize the distribution of
keywords in a set of documents retrieved from a search.
Figures 13a-c show the distribution of keywords in three
papers after a search for visualization (yellow), object-oriented (green), and OO (cyan) was performed.
Figure 13: Murals showing keyword distribution for search on
``visualization'' (yellow), ``object-oriented'' (green), and ``OO''
(cyan) in three documents.
The document in Figure 13a appears to be mainly about
visualization, with some discussion of object-oriented topics near the
beginning. The document in Figure 13b discusses both visualization and
object-oriented topics throughout, and the document in Figure 13c
discusses object-oriented topics and visualization at the beginning and at
the end.
3 Conclusion
An Information Mural is a 2D, graphical representation of a large
information space which fits entirely within a display window or
screen. The miniature representation is drawn using anti-aliasing
techniques and intensity shading or varying pixel size, and is useful for
visualizing trends and patterns in the overall distribution of information.
With panning and zooming capabilities added, Information Murals can
be used as stand-alone visualizations or as global views along with more
detailed informational displays.
The techniques used to create Information Murals can be integrated into
various information visualization applications to help display large
information spaces. In browsing information or examining a large data set,
it is always useful to start with a global overview of the information.
Information Murals convey more information about large data sets than
traditional techniques, and allow overviews of certain types of
information spaces to be created where previously they could not be.
Another advantage of the
Information Mural technique is that the application need not concern itself
with how much space is available to render the information--the density
mapping is computed automatically based on the available space.
For More Information
Please see the Information Interfaces research group page about the
Information Mural for subsequent papers and more information about the
technique.
References
- [ESJ92]
- Stephen G. Eick, Joseph L. Steffen, and Eric E. Sumner Jr.
SeeSoft---A tool for visualizing line oriented software statistics.
IEEE Transactions on Software Engineering, 18(11):957--968, November 1992.
- [JS95a]
- Dean F. Jerding and John T. Stasko.
The Information Mural: A technique for displaying and navigating large information spaces.
In Proceedings of the IEEE Visualization '95 Symposium on Information Visualization, pages 43-50, Atlanta, GA, October 1995.
- [JS95b]
- Dean F. Jerding and John T. Stasko.
Using Information Murals in visualization applications.
In Proceedings of the 1995 Symposium on User Interface Software and Technology (Demonstration), pages 73-74, Pittsburgh, PA, November 1995.
- [JSB97]
- Dean F. Jerding, John Stasko, and Thomas Ball.
Visualizing Interactions in Program Executions.
In Proceedings of the 1997 International Conference on Software Engineering (ICSE-97), pages 360-370, Boston, MA, May 1997.
- [JS98]
- Dean F. Jerding and John Stasko.
The Information Mural: A Technique for Displaying and Navigating Large Information Spaces.
IEEE Transactions on Visualization and Computer Graphics, 4(3):257-271, July-Sept. 1998.
- [SK93]
- John T. Stasko and Eileen Kraemer.
A methodology for building application-specific visualizations of parallel programs.
Journal of Parallel and Distributed Computing, 18(2):258--264, June 1993.
For More Info contact Dean Jerding (email: dfj@cc.gatech.edu).