An Overview of the Information Mural
Dean Jerding, John Stasko

Abstract

Information visualizations must allow users to browse information spaces and focus quickly on items of interest. Being able to see some representation of the entire information space provides an initial gestalt overview and gives context to support browsing and search tasks. However, the limited number of pixels on the screen makes it difficult to completely display large information spaces. The Information Mural is a two-dimensional, reduced representation of an entire information space that fits entirely within a display window or screen. The mural creates a miniature version of the information space using visual attributes such as grayscale shading, intensity, color, and pixel size, along with anti-aliased compression techniques. Information Murals can be used as stand-alone visualizations or in global navigational views.

Keywords: information visualization, software visualization, data visualization

1 Information Murals

Although large quantities of information are becoming available on-line, the information itself is useless without effective display and access mechanisms. Effective presentations of the information must be created using only the limited number of pixels on the screen. The Information Mural technique allows 2D visual representations of large information spaces to be created even when the number of informational elements greatly outnumbers the available pixels. Current methods for depicting such large information spaces typically utilize abstraction, overplotting, or sampling to create a view of the entire space. However, all of these techniques result in a loss of information that might be useful to the observer.

The goals of our technique can be summarized as follows:

  • Create a representation of an entire (large) information space that fits completely within a display window or screen.
  • Mimic what the original visual representation of the information would look like if it could be viewed in its entirety, i.e., containing the same visual patterns.
  • Minimize the loss of information in the compressed view.

Information Murals allow global views of large information spaces to be constructed. Such contextual information directly supports analytical and navigational tasks that a user performs while interacting with informational displays. There are several different types of information spaces which could be represented using information murals:

  • A text file or document usually does not fit entirely on the screen, because its vertical dimension far exceeds its horizontal dimension. Typically, a text editor displays only a portion of the file being edited.
  • Graphs of data often require some compression technique to fit on the screen. Scaling and rounding of data values is often necessary to draw the entire graph. Other alternatives are to display an average of the data values, or only a subset of the data.
  • Program visualizations often span many computer screens if laid out completely. This is especially true for those views where one dimension corresponds to time.
  • Images might be represented using Information Murals. Although an image usually fits on a screen, it is often desirable to change the size of the image. As an image is shrunk, information in the image is inevitably lost.

1.1 Technique Overview

Imagine some visual representation of a large information space, made up of distinct elements each with their own representation. An Information Mural of this information is to fit in some area of i x j pixels; assume there is a "bin" associated with each pixel. The position of each information element is first scaled to fit into the available space. As each element is "drawn" in the mural using an imaginary pen, different amounts of "ink" fall into different bins, in a manner similar to anti-aliasing strategies in computer graphics. As each subsequent element is drawn, the amount of ink will build up in different bins, depending on the amount of overlap of the elements.

The resulting Information Mural is created by mapping the amount of ink in each bin (the information density) to some visual attribute. In a grayscale mural, the shade of each pixel corresponds proportionally to the amount of ink in each bin. Instead of using grayscale variation, an equalized intensity variation over the entire color scale can also be used. With the "raindrop" mural, the amount of ink in each bin makes a "puddle" centered around that pixel, so pixels with more ink will appear larger. Color can then be added to the mural to convey other attributes of the informational elements, while still preserving the density mapping.
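As a rough illustration of the density mapping described above, the following Python sketch turns the ink in one bin into a grayscale shade or a raindrop radius. The function names, the choice that more ink yields a darker shade, and the square-root scaling of the radius (so that puddle area grows in proportion to ink) are assumptions made for this example, not details stated in the text.

```python
import math


def normalized_density(ink, max_ink):
    """Return the ink in a bin as a fraction of the largest bin (0.0 to 1.0)."""
    return ink / max_ink if max_ink > 0 else 0.0


def gray_shade(ink, max_ink):
    """Map ink to an 8-bit gray level.

    Assumption: 255 is an empty (white) bin and 0 is the densest (black) bin.
    """
    return round(255 * (1.0 - normalized_density(ink, max_ink)))


def raindrop_radius(ink, max_ink, max_radius):
    """Map ink to the radius of a 'puddle' centered on the pixel.

    Assumption: the puddle area, not its radius, is proportional to the ink,
    hence the square root.
    """
    return max_radius * math.sqrt(normalized_density(ink, max_ink))
```

Either mapping leaves the density array itself untouched, so the same accumulated bins could be rendered as a grayscale or a raindrop mural.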

Information Murals of certain information spaces may be inappropriate. The distribution of information in the original image may be such that a useful information mural cannot be created. For example, a grayscale mural showing a graph of a symmetric function with a short period will be a dark bar with a thickness equal to the amplitude of the data.

1.2 Algorithm

The basic algorithm for creating an Information Mural is listed below. The algorithm takes an image of M x N elements and scales it into a mural of I x J pixels. In addition to the data structures which store the information, the algorithm requires an I x J array of floats. The algorithm listed below does not handle attribute colors.

  1) for each i,j set mural_array[i][j] to zero
  2) for each element m,n of information
     a) compute x = m / M * I, y = n / N * J
     b) determine the proportion of this point that lies in each of the four surrounding mural_array entries (totals to 1.0):
          mural_array[floor(x)][floor(y)]
          mural_array[floor(x)][ceil(y)]
          mural_array[ceil(x)][floor(y)]
          mural_array[ceil(x)][ceil(y)]
     c) add each of the proportions determined in the previous step to the existing values of each corresponding mural_array entry
        i) update max_mural_array_value to keep track of the maximum mural_array[][] value
  3) for each i,j in the mural_array
     a) map the value mural_array[i][j] / max_mural_array_value to a grayscale or color intensity varying scale, or to pixel size, depending on the type of mural being created
     b) color and draw the pixel at i,j of the mural based on mapping computed in the previous step
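The steps above can be sketched in Python roughly as follows. This is a minimal illustration rather than the original implementation: the function names are hypothetical, the proportions in step 2b are computed with bilinear weights (one plausible reading of "proportion of this point"), the upper neighbor index is clamped so it stays inside the array, and the maximum is found after accumulation instead of being updated inside the loop.

```python
import math


def build_mural(elements, M, N, I, J):
    """Accumulate ink from (m, n) elements into an I x J array of floats."""
    mural = [[0.0] * J for _ in range(I)]            # step 1
    for m, n in elements:                            # step 2
        x = m / M * I                                # step 2a
        y = n / N * J
        x0, y0 = math.floor(x), math.floor(y)
        fx, fy = x - x0, y - y0                      # fractional offsets
        # Assumption: clamp the upper neighbors to the last row/column.
        x1 = min(x0 + 1, I - 1)
        y1 = min(y0 + 1, J - 1)
        # Steps 2b and 2c: bilinear weights, which always total 1.0.
        mural[x0][y0] += (1 - fx) * (1 - fy)
        mural[x0][y1] += (1 - fx) * fy
        mural[x1][y0] += fx * (1 - fy)
        mural[x1][y1] += fx * fy
    max_value = max(max(row) for row in mural)
    return mural, max_value


def normalize(mural, max_value):
    """Step 3a: scale every bin to the range 0.0 to 1.0 for later mapping."""
    if max_value == 0:
        return [[0.0 for _ in row] for row in mural]
    return [[v / max_value for v in row] for row in mural]
```

The normalized values would then be mapped to a gray level, a color scale, or a pixel size in step 3, depending on the kind of mural wanted.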

For improved efficiency, steps 2b and 2c can be replaced by a single step which adds 1.0 to mural_array[floor(x)][floor(y)] and updates the max_mural_array_value. This avoids having to compute a number of floor and ceil operations and the proportions lying in each surrounding pixel, effectively eliminating the anti-aliasing aspect from the mural. For many applications, the gain in performance from using the aliased mural outweighs any slight changes in appearance.
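A sketch of this aliased variant in Python is shown below. The function name is hypothetical, and the code assumes non-negative element coordinates with 0 <= m < M and 0 <= n < N, so that truncation with int() behaves like floor and every index stays in range.

```python
def build_aliased_mural(elements, M, N, I, J):
    """Aliased mural: each element adds 1.0 to exactly one bin."""
    mural = [[0.0] * J for _ in range(I)]
    max_value = 0.0
    for m, n in elements:
        i = int(m / M * I)   # same as floor for non-negative values
        j = int(n / N * J)
        mural[i][j] += 1.0
        if mural[i][j] > max_value:
            max_value = mural[i][j]
    return mural, max_value
```

Because every element contributes the same amount of ink, each bin value here is simply a count of the elements that fall into that pixel.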

We considered two alternative ways attribute colors could be added to an Information Mural. Before discussing the positives and negatives of each approach, it should be recognized that bandwidth limitations imposed by each pixel mean that the mural may not be able to show attribute colors for every piece of data. For example, if the mural compresses 50 points into the same pixel, 5 of which are blue, 13 red, 6 yellow, and so on, how should that pixel be rendered? It does not make sense to mix RGB values, because an observer might be confused if equal parts of red and green data values make a yellow pixel. Thus, we choose to color each pixel according to the attribute color that occurs most frequently at that point in the mural.

One way to compute this would be to keep track of the intensity for each color separately, requiring a mural_array of floats for each different attribute color. Note that just keeping a red, green, and blue array would not work, because colors should not be mixed for the reason mentioned above. Besides the large space requirements, another problem is which maximum intensity value should be used to compute the resulting shade. The maximum for the resulting pixel color? The maximum of all colors? The only way that really makes sense is to treat the intensity at each pixel uniformly (cumulatively over all colors), and compute the mapping with respect to the maximum of those intensities as is done in the basic algorithm.

This leads to the alternative for computing attribute colors that we have chosen to implement. To reduce space requirements, a single mural_array of floats is used to keep track of overall information density at each pixel. A list of shorts, one for each possible attribute color, is kept with each mural_array entry to record how many points of each attribute color have been drawn. The tradeoff here is that in keeping a single intensity value and a count of colors, we could end up with an inaccurate reflection of exactly how much of the intensity is due to each color. For example, five blue points of 0.1 intensity and one red point of 1.0 intensity would result in a blue pixel of 1.5 intensity. This problem only arises in building an anti-aliased mural, because when anti-aliasing is not done all of the points are drawn with the same intensity.
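The tradeoff can be made concrete with a small Python sketch of one mural_array entry. The class and method names are hypothetical, and a Counter stands in for the list of shorts described above; the logic is otherwise the same: one cumulative intensity plus a count of points per attribute color, with the most frequent color winning.

```python
from collections import Counter


class MuralEntry:
    """One mural_array entry: total ink plus a per-color count of points."""

    def __init__(self):
        self.ink = 0.0
        self.color_counts = Counter()

    def add_point(self, amount, color):
        """Add a point's share of ink and record its attribute color."""
        self.ink += amount
        self.color_counts[color] += 1

    def dominant_color(self):
        """Return the attribute color drawn most often, or None if empty."""
        if not self.color_counts:
            return None
        return self.color_counts.most_common(1)[0][0]


entry = MuralEntry()
for _ in range(5):
    entry.add_point(0.1, "blue")
entry.add_point(1.0, "red")
```

After these calls the entry holds about 1.5 units of ink and reports blue as its dominant color, even though red contributed two thirds of the intensity, which is the inaccuracy noted above.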

2 Applications

Information Murals can be used as global views of information spaces. Without a good visual representation, a global view cannot serve as an effective navigation tool. Furthermore, the usefulness of a visualization tool often depends on the effectiveness of its navigation capabilities: Can the user navigate quickly to locate an area of particular interest? Used as a background in a navigational widget, murals provide informational context to support panning and zooming of more detailed focus views. By adding panning and zooming within the global view itself, an Information Mural can be used as a stand-alone visualization.

Below are some snapshots from visualization applications we have built using Information Murals. These applications contain many different forms of information, from software to data to text documents, some of which were mentioned in [JS95b].

2.1 Software Visualization

The Information Mural technique originated in our software visualization research into visualization of object-oriented (OO) program executions[JS95a]. Murals are currently being used in a suite of views to support program understanding during design recovery, validation, and reengineering tasks.

2.1.1 Object-Oriented Message Traces

Imagine an event trace diagram for object-oriented message sequences turned on its side, such that classes are assigned rows on the vertical axis and a message from one class to another is drawn as a vertical line connecting the source and destination classes. The horizontal axis then represents time, or the sequence of messages. Now imagine that you could see an event trace diagram of an entire program execution, which might contain hundreds of thousands of messages. Figure 1a is a grayscale, aliased Information Mural of a message trace from a bubble sort algorithm animation built using the Polka toolkit[SK93], containing around 20 classes on the vertical axis and over 50,000 messages on the horizontal. Drawing this image in a window 500 pixels wide results in a horizontal information compression ratio of over 100:1. For comparison, the same representation without the mural technique is shown in Figure 1b.

Figure 1: (a) Mural of object-oriented message trace of over 50,000 messages, drawn in an area 500 pixels wide. (b) Same diagram drawn by just overplotting (without the mural technique).
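The sideways event trace described above could be accumulated into an aliased mural along the following lines. This Python sketch is only an illustration under stated assumptions: the function name and the (source row, destination row) message format are hypothetical, each class keeps its own pixel row (only the time axis is compressed), and every message adds one unit of ink to each row its vertical line crosses.

```python
def message_trace_mural(messages, n_classes, width):
    """Build an aliased mural from messages given in time order.

    messages: list of (source_row, destination_row) pairs.
    Returns an n_classes x width array of ink values.
    """
    mural = [[0.0] * width for _ in range(n_classes)]
    total = len(messages)
    for t, (src, dst) in enumerate(messages):
        col = t * width // total          # scale message index to a column
        top, bottom = min(src, dst), max(src, dst)
        for row in range(top, bottom + 1):
            mural[row][col] += 1.0        # ink along the vertical line
    return mural
```

With over 50,000 messages and a width of 500, each column would collect ink from more than 100 messages, which is the compression ratio quoted above.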

One of the views from our prototype OO program visualization suite is called the Execution Mural (Figure 2). This view is used to examine message traces from object-oriented programs[JS95a]. The upper portion of the view is the focus area where a subset of the messages can be examined in detail. The bottom portion of the view is a navigational area that includes a mural of the entire message trace, and a navigation rectangle indicating where the focus area fits in the entire execution. Notice that the color of several different messages has been set in the focus area. The Information Mural technique allows the coloring of information attributes using shaded color scales, as is evident from the colored areas in the mural. The mural gives a quick insight into various phases in the execution, including very repetitive patterns. The coloring allows the location of particular messages in the execution to be identified.

Figure 2: Execution Mural view of bubble sort algorithm animation built using the Polka animation toolkit.

2.1.2 Parallel Processor Message Passing

Visualizations of the message passing during executions of programs on parallel architectures become very unclear when long durations of time are shown. The aliased mural of Figure 3a shows the kernel integer sort benchmark executing on 16 processors. Each processor is assigned a row on the vertical axis, and a message is drawn as a line from one processor to another at the appropriate time coordinates. This particular view uses wall clock timestamps. As is evident from the traditional representation shown in Figure 3b, the mural gives a much better resolution to the image.

Figure 3: (a) Mural of parallel program message trace. (b) Same diagram drawn by just overplotting (without the mural technique).

As was done in the Execution Mural view, a mural can be used in the background of a global overview to allow more detailed examination of the message passing. Figure 4 shows the same message trace, this time with messages colored according to message type. The global overview provided by the mural gives an immediate indication of the phases and sub-phases of the algorithm, as well as showing anomalies such as network blockage or processors waiting for others to complete.

Figure 4: View of message passing in kernel integer sort parallel processor benchmark, with focus area and global overview created using the Information Mural technique.

2.2 Data Visualization

The Information Mural technique is useful for revealing the underlying density of data while viewing very large data sets. Traditional plotting techniques typically overplot points that happen to lie in the same pixel. Our technique shows the actual density of the information. Incorporated into a data visualization, murals can support one- or two-dimensional navigation through large data spaces.
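The difference between overplotting and a density view can be seen in a small Python sketch for a single column of pixels. The function names and the sample readings are made up for illustration; the point is only that overplotting records whether a pixel was hit, while the mural records how often.

```python
def to_bin(value, lo, hi, height):
    """Scale a data value into one of `height` pixel bins (bin 0 = lowest)."""
    index = int((value - lo) / (hi - lo) * height)
    return min(max(index, 0), height - 1)


def overplot(values, lo, hi, height):
    """Traditional plot: a pixel is either drawn or not."""
    pixels = [False] * height
    for v in values:
        pixels[to_bin(v, lo, hi, height)] = True
    return pixels


def density(values, lo, hi, height):
    """Mural-style plot: each pixel keeps a count of the values it received."""
    bins = [0] * height
    for v in values:
        bins[to_bin(v, lo, hi, height)] += 1
    return bins


# Hypothetical data: a thousand zero readings and a single reading of 50.
readings = [0] * 1000 + [50]
drawn = overplot(readings, 0, 100, 10)
counts = density(readings, 0, 100, 10)
```

In the overplotted column the two occupied pixels look identical, whereas the counts differ by a factor of 1000 and would map to very different intensities.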

2.2.1 Sun Spots

Astronomers have been recording the number of sun spots since the 1700s. Because this is such a large dataset, it is typically plotted by showing the monthly averages. Figure 5 is a plot of the average number of sun spots per month recorded from 1850-1993. This data was obtained from the StatLib server at Carnegie Mellon University.

Figure 5: Plot of average number of sun spots recorded per month, 1850-1993.

Using the Information Mural technique, we do not have to worry about the size of the dataset. Figure 6 shows an anti-aliased mural of the number of sun spots recorded daily from 1850-1993, over 52,000 readings. Instead of using grayscale to depict density, a color scale which goes from dark blue (lowest data density) to bright white (highest data density) is used because it is easier to see outliers using color.

Figure 6: Mural of the number of sun spots recorded daily, 1850-1993.

Plotting statistics such as averages is commonly done to analyze large amounts of data. However, in the monthly view we do not see the band of "missing" values between zero and about 10, nor do we notice that a large number of zero values were recorded (bright spots at bottom of Figure 6).

With the stand-alone Information Mural views, it is also possible to incrementally zoom in on sections of the mural or to sweep out a rectangle to zoom. Figure 7 shows the sun spot mural zoomed in on a small area. Figure 8 shows how the mural of the entire data set can be placed in the background of a slider, giving context to a more detailed view of the data.

Figure 7: Mural of the number of sun spots recorded daily, 1850-1993, zoomed in on a small area.

Figure 8: View of sun spots showing focus area and mural of entire data set at the bottom.

2.2.2 River Flow Data

Another interesting large data set is the mean daily Saugeen river flows, from Jan 1, 1915 to Dec 31, 1979. The anti-aliased mural of this data shows a periodic pattern, with concentrations at the lower values. Some bright spots occur seemingly randomly in the lower portion of the mural shown in Figure 9a. Zooming in on a small area at the bottom, we find that the bright spots in the mural are due to single values that occur repetitively (Figure 9b). We hypothesize that these might be weeks or months in the data where a single value was extrapolated across the entire period to create the daily values.

Figure 9: (a) Mural of the mean daily river flow rates of the Saugeen river, 1915-1979. (b) Part (a) zoomed on small area at the bottom of the mural.

2.2.3 Automobile Data

The Information Mural technique can be used to create parallel coordinate data displays. A data set from the Committee on Statistical Graphics of the American Statistical Association (ASA) Second (1983) Exposition of Statistical Graphics Technology contains 406 observations on the following 8 variables: MPG (miles per gallon), number of cylinders, engine displacement (cu. inches), horsepower, vehicle weight (lbs.), time to accelerate from 0 to 60 mph (sec.), model year (modulo 100), and origin of car (1. American, 2. European, 3. Japanese). Figure 10a shows a parallel coordinate mural of a subset of the data, including MPG, displacement, horsepower, weight, acceleration, and model year. Part (b) of Figure 10 shows the standard parallel coordinate view without the mural. In Figure 10c, color has been overlaid on the mural according to the number of cylinders attribute. Notice how the data tuples with fewer cylinders tend to have higher MPG, smaller displacement, less horsepower, and longer acceleration times.

Figure 10: (a) Mural of a parallel coordinate view of automobile data showing MPG, engine displacement, horsepower, weight, acceleration, and model year (1970-1982). (b) Standard parallel coordinate view of the data. (c) Color overlaid for number of cylinders (3 = red, 4 = orange, 5 = yellow, 6 = green, 8 = cyan).

2.3 Information Visualization

Many other forms of information can be displayed using Information Murals. Two such applications, geographic data and text documents, are described below.

2.3.1 Geographic Information

The U.S. Census Bureau creates maps of various census statistics such as population distributions. While their techniques work well for wall-sized maps, the overwhelming scale reduction to display the information on a computer screen causes their algorithm to produce inaccurate results. The Information Mural technique computes information density automatically, making the display of a population density map on a computer screen almost trivial (Figure 11). The data was obtained from the Tiger Mapping Service U.S. Places File, created from the Census file STF-1A.

Figure 11: Mural of population density distribution, using data from the 1990 census.

2.3.2 Text Documents

While SeeSoft[ESJ92] from AT&T's Bell Laboratories introduced a revolutionary miniature representation for text documents, it did have a limit. One row of pixels (or part of a row in later versions) was required for every line in the file. The Information Mural technique can go beyond this limit, allowing many lines in a file to map to a single row of pixels in the miniature representation. On top of a grayscale mural representation of a document, color can be used to indicate attributes of the text, such as comments, sections, or keywords.

Figure 12 is a sample text editor with a mural in the background of the scrollbar. Color is used to indicate sections in the Latex document being browsed. The mural is constructed by examining the position of each character in the file, scaling that position into the scrollbar, and mapping the resulting density of characters to the intensity scale.
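The construction described above might look roughly like the following Python sketch. It is an illustration only: the function name is hypothetical, whitespace characters are skipped (an assumption, since only visible characters would leave ink), and the aliased form of the mural is used so each character adds one count to a single bin.

```python
def text_mural(text, width, height):
    """Scale character positions in a file into a height x width density grid."""
    lines = text.splitlines()
    n_lines = max(len(lines), 1)
    n_cols = max((len(line) for line in lines), default=1) or 1
    mural = [[0] * width for _ in range(height)]
    for row, line in enumerate(lines):
        y = row * height // n_lines          # line number -> scrollbar row
        for col, ch in enumerate(line):
            if ch.isspace():
                continue                     # assumption: blanks leave no ink
            x = col * width // n_cols        # column -> scrollbar column
            mural[y][x] += 1
    return mural
```

Dividing each count by the largest count in the grid would give the value to map onto the intensity scale, and many lines of the file can share one row of scrollbar pixels.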

Figure 12: Text editor containing Latex document. Mural of the entire file is shown in the background of the scrollbar, with text shaded according to section.

Information Murals can also be used to visualize the distribution of keywords in a set of documents retrieved from a search. Figures 13a-c show the distribution of keywords in three papers after a search for visualization (yellow), object-oriented (green), and OO (cyan) was performed.

Figure 13: Murals showing keyword distribution for search on "visualization" (yellow), "object-oriented" (green), and "OO" (cyan) in three documents.

The document in Figure 13a appears to be mainly about visualization, with some discussion of object-oriented topics near the beginning. The document in Figure 13b discusses both visualization and object-oriented topics throughout, and the document in Figure 13c discusses them at the beginning and at the end.
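A keyword distribution of this kind could be computed along the lines of the Python sketch below. The function name is hypothetical, and the matching is a simple case-insensitive substring search, which is an assumption: a real tool would likely match whole words so that a short keyword such as "OO" is not found inside unrelated words.

```python
def keyword_distribution(text, keywords, height):
    """Count keyword occurrences per mural row for one document.

    Returns a dict mapping each keyword to a list of `height` counts,
    where row 0 is the start of the document.
    """
    lower = text.lower()
    length = max(len(lower), 1)
    counts = {kw: [0] * height for kw in keywords}
    for kw in keywords:
        needle = kw.lower()
        start = lower.find(needle)
        while start != -1:
            counts[kw][start * height // length] += 1   # position -> row
            start = lower.find(needle, start + 1)
    return counts
```

Each keyword's list could then be drawn in its own color, so rows with many hits for one keyword show up as bright bands in that color.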

3 Conclusion

An Information Mural is a 2D, graphical representation of a large information space which fits entirely within a display window or screen. The miniature representation is drawn using anti-aliasing techniques and intensity shading or varying pixel size, and is useful for visualizing trends and patterns in the overall distribution of information. With panning and zooming capabilities added, Information Murals can be used as stand-alone visualizations or as global views alongside more detailed informational displays.

The techniques used to create Information Murals can be integrated into various information visualization applications to help display large information spaces. In browsing information or examining a large data set, it is always useful to start with a global overview of the information. Information Murals convey more information about large data sets than traditional techniques, and allow overviews of certain types of information spaces to be created when before they could not. Another advantage of the Information Mural technique is that the application need not concern itself with how much space is available to render the information: the density mapping is computed automatically based on the available space.

References

[ESJ92]
Stephen G. Eick, Joseph L. Steffen, and Eric E. Sumner Jr. SeeSoft---A tool for visualizing line oriented software statistics. IEEE Transactions on Software Engineering, 18(11):957-968, November 1992.
[JS95a]
Dean F. Jerding and John T. Stasko. The Information Mural: A technique for displaying and navigating large information spaces. In Proceedings of the IEEE Visualization '95 Symposium on Information Visualization, pages 43-50, Atlanta, GA, October 1995.
[JS95b]
Dean F. Jerding and John T. Stasko. Using Information Murals in visualization applications. In Proceedings of the 1995 Symposium on User Interface Software and Technology (Demonstration), pages 73-74, Pittsburgh, PA, November 1995.
[JSB97]
Dean F. Jerding, John Stasko and Thomas Ball. Visualizing Interactions in Program Executions. Proceedings of the 1997 International Conference on Software Engineering (ICSE-97), pages 360-370, Boston, MA, May 1997.
[JS98]
Dean F. Jerding and John Stasko. The Information Mural: A Technique for Displaying and Navigating Large Information Spaces. IEEE Transactions on Visualization and Computer Graphics, 4(3):257-271, July-Sept. 1998.
[SK93]
John T. Stasko and Eileen Kraemer. A methodology for building application-specific visualizations of parallel programs. Journal of Parallel and Distributed Computing, 18(2):258-264, June 1993.
