Imagine a neuroscientist browsing brain data. Her two questions, ``where am I?'' and ``what is this?'', require different kinds of support. When answering the former (or when zoomed out), not all of the detail stored in the database is needed. This gave us an opportunity to improve the responsiveness of our interface: we could reduce image transmission time by trading initial image detail for speed, then add detail back as it was needed or as time allowed.
To trade image quality for loading speed, we needed a representation that allowed an image's quality to be improved incrementally during transmission. That is, we wanted to incrementally improve an image's resolution, and to do so at the user's command. We considered the interleaved GIF format; the popular WWW browser Netscape uses interleaved GIF to show the user a coarse view of a picture that is then refined over time. However, GIF was insufficient for our purposes because it can only store 256-color images, and we wanted more flexibility.
Unfortunately, all of the Tk-supported image formats lacked the flexibility to store 24-bit color images in a form that could be loaded incrementally and efficiently. So we stepped back to the general requirements for our image model: rapid display of a good image, followed by refinements to the image with minimal database access. If we found a suitable representation, we would then see how hard it would be to extend Tk to support it.
We quickly focused on wavelet-based compressed images. For our purposes here, the wavelet transform simply takes an image (pixel values) and produces a collection of coefficients. (Wavelets are a much richer topic than this.) These coefficients are averages, and differences from the average, over some area: one coefficient represents the average intensity of the whole image, while others carry information about progressively smaller portions of it. A network packet of pixels is just a set of dots, while a network packet of wavelet coefficients can generate a fuzzy but complete version of the original image.
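To make ``averages and differences'' concrete, here is a minimal one-dimensional Haar transform, the simplest wavelet. This is an illustrative Python sketch, not the paper's actual implementation (which uses Michael Hilton's C library), and it assumes the input length is a power of two:

```python
def haar_step(values):
    """One level of the Haar transform: pairwise averages and differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details

def haar_transform(values):
    """Repeat haar_step on the averages until one overall average remains.
    Result is ordered coarse-to-fine:
    [overall average, coarsest differences, ..., finest differences]."""
    coeffs = []
    while len(values) > 1:
        values, details = haar_step(values)
        coeffs = details + coeffs  # coarser levels end up in front
    return list(values) + coeffs
```

For example, `haar_transform([9, 7, 3, 5])` gives `[6.0, 2.0, 1.0, -1.0]`: the first coefficient is the average intensity of the whole row, and the later ones describe ever-smaller regions of it.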
To prepare for incremental resolution we a) transform the image to produce coefficients, b) sort the wavelet coefficients from coarsest detail to finest, and c) store them. When the viewer requests an improvement in the image, more coefficients are read sequentially from disk. Sorting in this fashion satisfies our requirement that large improvements in quality occur early in the transfer. Indeed, just a few percent of the data is normally sufficient for a good image.
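The refinement step can be sketched independently of any storage format. Assuming coefficients are stored coarsest-first (here the Haar coefficients `[6, 2, 1, -1]`, which encode the pixel row `[9, 7, 3, 5]`), a partial read reconstructs a blurrier version, and a full read recovers the original exactly. Again an illustrative Python sketch, not the C library's code:

```python
def haar_inverse(coeffs):
    """Rebuild pixel values from coefficients ordered coarse-to-fine:
    [overall average, coarsest differences, ..., finest differences]."""
    values = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        n = len(values)
        details = coeffs[pos:pos + n]
        pos += n
        # each (average, difference) pair expands back into two values
        values = [x for v, d in zip(values, details) for x in (v + d, v - d)]
    return values

def preview(coeffs, k):
    """Reconstruct from only the first k (coarsest) coefficients,
    treating the unread remainder as zero."""
    padded = list(coeffs[:k]) + [0] * (len(coeffs) - k)
    return haar_inverse(padded)
```

`preview([6, 2, 1, -1], 2)` yields `[8, 8, 4, 4]`, a two-block approximation of the row, while reading all four coefficients recovers `[9, 7, 3, 5]` exactly. This mirrors the two points above: a small prefix of the data already gives a good image, and the transform itself loses nothing.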
Wavelets work particularly well for the neighborhood viewer application. Work with early prototypes showed that users alternate between two modes: positioning and studying. In positioning mode, users are most interested in coarse shapes, which let them navigate the large number of images in the database. Once the neuroscientist has found a particular region of interest, she studies it in detail: she focuses on a small section of the image and wants to see substantial detail. Indeed, given the expense of acquiring the images, she will not tolerate any loss at this level. (Another benefit of wavelet transforms is that they are not inherently lossy, a decisive advantage in this application over the discrete cosine transform used in JPEG.)
Our interface evolved to incorporate a new operation: add resolution. Navigation now involves a sequence of ``move, add resolution'' cycles as the user moves closer and closer to the point of interest.
Wavelets appear to be the right solution. They provide the model of compression we sought: rapid display of a good picture, with incremental display up to the original detail. Unfortunately, wavelets involve extensive calculation, and a back-of-the-envelope estimate showed that wavelet code in Tcl would never perform satisfactorily. Being ecologically minded, we sought out a library we could reuse, Michael Hilton's wavelet library in C, made a few modifications, and created a wavelet-compressed image object in Tcl. Our original plan was to create a new image format, but since the current API for image formats does not support incremental image quality, we instead developed a ``wavelet data'' object that interacts with the standard photo image.
The resulting iteration of the neighborhood browser gave us a new performance trade-off: wavelet-transformed images require less network transmission but more computation. We find that the two versions are comparable on high-speed local networks, and that the wavelet model will become increasingly preferable as we move to transferring images over slower, intercontinental networks. Furthermore, we are now poised to take advantage of any improvements in wavelet transform performance (the ``lifting'' technique promises to double transform speed).
The lesson from this experience is that easily extensible systems make it possible to explore new technologies. Had we been developing in a system more rigid than Tcl/Tk, we probably would have abandoned the search for a better image format. Because of the clean interface for extending Tcl/Tk, we were able to integrate an existing wavelet library and explore more advanced interfaces.
We are currently working to extend our wavelet implementation in several ways. First, we are working on the performance of the inverse wavelet transform, as it is the bottleneck for real-time applications. Second, we are examining techniques to handle images with widths and heights that are not powers of two. When we have addressed these issues and defined a robust file format for wavelet data, we plan to make the wavelet extension available to others.
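One common workaround for non-power-of-two sizes, sketched here as an assumption rather than the extension's actual approach, is to pad each row and column up to the next power of two by repeating edge pixels (symmetric extension is a popular alternative), then crop the padding away after the inverse transform:

```python
def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p

def pad_row(row):
    """Pad a pixel row to power-of-two length by repeating its edge value;
    after the inverse transform, the padded tail is simply cropped off."""
    target = next_pow2(len(row))
    return list(row) + [row[-1]] * (target - len(row))
```

For example, `pad_row([10, 20, 30])` returns `[10, 20, 30, 30]`, a length-4 row the transform can handle. Repeating the edge value keeps the artificial boundary smooth, which keeps the detail coefficients near the edge small.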