Spreadsheets have proven to be highly successful tools for interacting with numerical data: applying algebraic operations, manipulating rows or columns, and exploring ``what-if'' scenarios. Spreadsheet techniques have recently been extended from numeric domains to other domains [15, 10]. One possible further application of the spreadsheet is to the domain of information visualization, which often involves datasets that are large, abstract, and multi-dimensional. In addition, such datasets can frequently be visually represented in multiple ways. In this paper we present a spreadsheet approach to the display and exploration of information visualizations. We discuss how spreadsheet techniques provide a structured, intuitive, and powerful interface for investigating information visualizations of multidimensional datasets.
Information visualization systems confront such questions as how to represent abstract data visually, what types of exploratory interaction to include, and how to structure this interaction. Therefore, certain capabilities are critical, such as exploring different views of the data interactively, applying operations like rotation or data filtering to a view or group of views, and comparing two or more related datasets. These operations are natural in a spreadsheet environment. The value of a visualization spreadsheet is in enabling scientists to build multiple visual representations of one or several datasets, perform operations on the visual representations together or separately, and compare and contrast them visually.
Spreadsheets offer two key benefits to users of information visualization systems. These benefits derive from the way spreadsheets span a range of interaction styles. On the one hand, spreadsheets are of direct value to end-users, because the direct manipulation interface makes it easy to view, navigate, and interact with the data. This style of interaction is common in ad hoc spreadsheets that a user creates for a specific goal, such as comparing the lifetime cost of several alternatives for a computer purchase. Columns can be created for each component of the lifetime cost--such as hardware, software, and maintenance--and rows can be created for each type of computer. The resulting spreadsheet makes it easy to explore the alternatives, yet requires little work beyond the minimum needed to perform the calculations.
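As a minimal sketch of the ad hoc lifetime-cost spreadsheet described above, the following Python fragment models rows as computers and columns as cost components. The computer names, the cost figures, and the helper names `lifetime_cost` and `what_if` are all hypothetical and chosen only for illustration.

```python
# Hypothetical figures for illustration only; none come from the paper.
COMPONENTS = ["hardware", "software", "maintenance"]  # the columns

# Each row is one type of computer; each entry is one cell.
costs = {
    "Computer A": {"hardware": 2000, "software": 500, "maintenance": 300},
    "Computer B": {"hardware": 1500, "software": 800, "maintenance": 600},
}


def lifetime_cost(row):
    """Sum the cost components of one row."""
    return sum(row[component] for component in COMPONENTS)


def what_if(table, computer, component, new_value):
    """Return a copy of the table with a single cell changed."""
    updated = {name: dict(row) for name, row in table.items()}
    updated[computer][component] = new_value
    return updated


# The "total" column, computed for every row.
totals = {name: lifetime_cost(row) for name, row in costs.items()}

# A what-if scenario: a cheaper maintenance contract for Computer B.
scenario = what_if(costs, "Computer B", "maintenance", 400)
scenario_totals = {name: lifetime_cost(row) for name, row in scenario.items()}
```

Changing one cell and recomputing the totals is the whole of the ``what-if'' exploration here; the original table is left untouched so that the two scenarios can be compared.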
On the other hand, spreadsheets provide a flexible and easy-to-learn programming environment. Spreadsheet developers create templates that enable end-users to reliably repeat often-needed computations without the effort of development or coding. The structured interaction of a spreadsheet eliminates many of the stumbling blocks found in traditional programming environments. For example, the developer does not have to worry about the data dependencies between datasets, nor do users have to worry about memory management. Behind the scenes, these idiosyncrasies of programming are taken care of automatically.
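To illustrate what it means for data dependencies to be handled automatically, the sketch below implements a toy spreadsheet in Python. It is not the mechanism used by the prototypes in this paper; the class `Sheet` and its methods are hypothetical, and the design assumed here is the simplest one available: a formula cell is re-evaluated from its inputs each time it is read.

```python
class Sheet:
    """A toy spreadsheet: cells hold constants or formulas over other cells."""

    def __init__(self):
        self._constants = {}  # cell name -> value
        self._formulas = {}   # cell name -> (function, list of input cell names)

    def set_value(self, name, value):
        """Store a constant in a cell, replacing any formula there."""
        self._formulas.pop(name, None)
        self._constants[name] = value

    def set_formula(self, name, func, inputs):
        """Store a formula that is computed from the named input cells."""
        self._constants.pop(name, None)
        self._formulas[name] = (func, list(inputs))

    def get(self, name, _visiting=frozenset()):
        """Evaluate a cell, following its dependencies recursively."""
        if name in self._constants:
            return self._constants[name]
        if name not in self._formulas:
            raise KeyError(f"cell {name!r} is empty")
        if name in _visiting:
            raise ValueError(f"circular reference through {name!r}")
        func, inputs = self._formulas[name]
        args = [self.get(cell, _visiting | {name}) for cell in inputs]
        return func(*args)


sheet = Sheet()
sheet.set_value("A1", 2)
sheet.set_value("A2", 3)
sheet.set_formula("A3", lambda a, b: a + b, ["A1", "A2"])
sheet.set_formula("A4", lambda s: s * 10, ["A3"])

before = sheet.get("A4")
sheet.set_value("A1", 5)   # the user edits one cell ...
after = sheet.get("A4")    # ... and dependent cells reflect the change
```

The user never states the order in which A3 and A4 must be recomputed; the order follows from the references between cells. A production spreadsheet would typically cache results and invalidate only the affected cells, which this sketch omits.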
In this paper, we show how the spreadsheet paradigm can ameliorate two issues. The first is the application of operations. Within an information visualization spreadsheet, much of the user interaction consists of applying operations such as comparison, filtering, and animation. Each of these operations has sub-categories. For example, the user can perform comparisons by looking at the visual representations of the data directly, or by computing the difference between datasets. The user support required for these two types of comparison is quite different. Visual comparisons require layout strategies that enable views of the data to be laid out side by side, whereas comparisons at the data level require difference operators to be defined at the data domain level.
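The two kinds of comparison can be contrasted in a short Python sketch. Everything here is an assumption made for illustration: datasets are taken to be small integer grids, a "view" is reduced to a list of text lines, and the function names `difference`, `render`, and `side_by_side` are hypothetical.

```python
def difference(a, b):
    """Data-level comparison: element-wise difference of two integer grids."""
    same_shape = len(a) == len(b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(a, b)
    )
    if not same_shape:
        raise ValueError("datasets must have the same shape")
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def render(dataset):
    """A stand-in for a visual representation: one text line per row."""
    return [" ".join(f"{value:3d}" for value in row) for row in dataset]


def side_by_side(*views, gap=" | "):
    """Visual comparison: lay out several rendered views in adjacent columns."""
    height = max(len(view) for view in views)
    widths = [max((len(line) for line in view), default=0) for view in views]
    rows = []
    for i in range(height):
        cells = [
            (view[i] if i < len(view) else "").ljust(width)
            for view, width in zip(views, widths)
        ]
        rows.append(gap.join(cells))
    return rows


first = [[1, 2], [3, 4]]
second = [[1, 1], [1, 1]]

delta = difference(first, second)                      # a new dataset
layout = side_by_side(render(first), render(second))   # a new arrangement
```

The difference operator produces a new dataset that must itself be defined for the data domain, while the side-by-side layout leaves both datasets unchanged and only arranges their views.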
The second issue, another large problem involving user-system interaction in information visualization, is the exploration of different methods for representing data. For a given data type, either the perfect representation has not yet been discovered or an ideal representation does not exist; what is required instead is several different representation methods, because each method brings out different features of the data. The user therefore typically has several different representations available for a given data type. Without tools to help explore this representation space, users are hopelessly lost.
What is exciting about the spreadsheet approach is that it enables information visualizers to solve both problems in a single environment. On the one hand, it facilitates the easy application of operations, such as direct manipulations of the visualizations, and the entry of formulas that specify relationships between cells. On the other hand, it also supports the exploration of different visual representation techniques by emphasizing the operands rather than the operators. Unlike traditional data-flow programming environments where operands are invisible within the flow, a spreadsheet environment emphasizes operands by displaying them in cells and instead hides the operators.
The rest of the paper is structured as follows. In Section 2, we describe past research related to spreadsheets. Section 3 describes the prototypes we have built and the design constraints. Section 4 briefly describes the example domains and prototypes we have chosen. Section 5 illustrates the principles behind the utility of a visualization spreadsheet. Finally, we present some concluding remarks.