SUMMARY: The project uses a high-speed, large-memory computer, together with plenty of storage, for the manipulation of large composited images (up to about A3 size, or 1.6 million pixels) and their serving across the World Wide Web in response to requests from an existing server (http://rubens.anu.edu.au). The project, focussed on the great Buddhist stupa monument of Borobudur - and called Jigsaw because it deals with image segments just like a jigsaw - will for the first time offer a seamless transition between 3D vector graphics (in the first case a model of Borobudur) and the bit-mapped graphics of the 3000+ images of the monument. Other spaces to be mapped will be the ANU campus, and specific houses at Pompeii, in the form of virtual tours.
BACKGROUND TO THE PROJECT: Digital images are increasingly important in Web work because of faster networks, and better and faster digitising and storage technologies. In Art History teaching and research, for example, they are often searched via databases, and displayed in browsers. The higher the resolution of an image, the better its quality, and the more information it contains. It is a common argument that digitising should, wherever possible, be done only once, and at the highest (feasible) resolution. Images can then be archived full-size, and cut down to lower resolutions for current use. The resolution of digital images depends on the characteristics of the capturing device, and very high quality is now possible, for use in specialised viewing tasks (restoration, analysis, transcription of documents), and for printing to achieve near-photographic quality (e.g. for books or posters). The best digital cameras now offer six megapixels (i.e. up to poster-size), which is within sight of the resolution of traditional film.
Web browsers, however, depend not only upon the characteristics of the machines on which they reside, but also upon the vagaries of networks, with the result that images are either of low quality or impossibly slow to load. How might Web technologies, which are so dependent upon the speed of networks, conveniently be used to examine, retrieve and manipulate large images? Project Jigsaw aims to provide an innovative answer.
JIGSAW: In a jigsaw the image, which started as a complete picture, has been cut up into discrete sections, and is put together by the user. This project aims to do the same with digital images, in two ways: composition makes a completely new composite image from discrete individual files (such as sections of slides, digitised as individual images); conversely, decomposition takes whole, large images and breaks them down into sections for manipulation and storage (cf. Figure 2, attached). Such flexibility should allow work both with video cameras and with high-resolution scanners. Thus the Sony DXP950 (1.6 megapixels: purchased with funds provided by a 1994 Large Equipment bid) can focus close enough to take "air-reconnaissance" overlay images of a 35mm or 6x6 slide or negative, whilst a scanner would produce one large image. For high-resolution shots "in the field", the Hasselblad/Kodak digital camera setup offers 2036x3060 images written to a PCMCIA flash card: these images can then be "decomposed" for storage and manipulation, whilst film negatives or transparencies taken with the same camera but without the digital back are treated by the "air-reconnaissance" method described above.
The procedures for images produced with the DXP950 and scanner, or with the Hasselblad/Kodak combination, are symmetrical: the former need composition, the latter decomposition. Similar software routines should therefore be applicable to both.
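The symmetry between the two procedures can be illustrated with a minimal sketch. It is not the in-house software, and it rests on simplifying assumptions: an image is a plain list of rows of grey values, tiles do not overlap, and they are already aligned (real "air-reconnaissance" shots would first need to be registered against one another). The names decompose and compose are chosen here for illustration only.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]                 # rows of pixel values (greyscale for brevity)
Tiles = Dict[Tuple[int, int], Image]    # (tile_row, tile_col) -> tile


def decompose(image: Image, tile_h: int, tile_w: int) -> Tiles:
    """Cut a whole image into tiles; tiles on the right and bottom edges may be smaller."""
    height, width = len(image), len(image[0])
    tiles: Tiles = {}
    for r, top in enumerate(range(0, height, tile_h)):
        for c, left in enumerate(range(0, width, tile_w)):
            tiles[(r, c)] = [row[left:left + tile_w]
                             for row in image[top:top + tile_h]]
    return tiles


def compose(tiles: Tiles) -> Image:
    """Stitch non-overlapping, aligned tiles back into a single image."""
    n_rows = max(r for r, _ in tiles) + 1
    n_cols = max(c for _, c in tiles) + 1
    image: Image = []
    for r in range(n_rows):
        band = [tiles[(r, c)] for c in range(n_cols)]
        # All tiles in one band share the same height; join them line by line.
        for y in range(len(band[0])):
            image.append([px for tile in band for px in tile[y]])
    return image
```

Under these assumptions, composing the tiles produced by decompose returns the original image, which is the sense in which one set of routines could serve both directions.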
MODELLING THE WORLD IN THREE DIMENSIONS: The equipment
will be used to model the world from "photographic"
images, hung on a computer-modelled armature. The development of the necessary
techniques will focus on the Buddhist stupa at Borobudur, because this
is the most difficult and intricate exercise; models of houses at Pompeii (British
School at Rome), Permanent Sample Sites (Forestry) and other virtual spaces
will then be built.
Jigsaw will use a "mixed" system, where a 3D
model is the armature on which to position low-resolution versions of the
high-quality images of the reliefs and other features of the monument.
Such modelling will be helped by the regular layout and pyramidal shape
of the stupa. Each polygon forming the 3D model will be mapped directly
to http URLs of the artwork images, so that clicking on a specific area
of the model will bring up the relief or statue which is located there.
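The mapping from polygon to image can be pictured as a simple lookup table. The sketch below is only an illustration: the server name is the one given in this proposal, but the polygon identifiers, the image paths and the function name url_for_polygon are all hypothetical.

```python
from typing import Dict, Optional

# The server name comes from the proposal; every polygon identifier and
# image path below is hypothetical and exists only for illustration.
SERVER = "http://rubens.anu.edu.au"

POLYGON_TO_PATH: Dict[str, str] = {
    "level1-north-panel-001": "/hypothetical/borobudur/level1/north/001.jpg",
    "level1-north-panel-002": "/hypothetical/borobudur/level1/north/002.jpg",
}


def url_for_polygon(polygon_id: str) -> Optional[str]:
    """Return the URL of the artwork image behind a clicked polygon, if any."""
    path = POLYGON_TO_PATH.get(polygon_id)
    return None if path is None else SERVER + path
```

A click on a polygon with no entry returns None, so the viewer can simply ignore clicks on parts of the armature that carry no artwork.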
Hence what the user sees (through the "window" in the Web browser
- essentially the viewport on the images) is photographic (i.e. in the
form of bitmaps, not vectors), because the armature will be hidden by the
representations of photographs which clothe it. This means that, whereas the model of the stupa really
is three-dimensional, it is still only the "map-reference"
for viewing the actual artworks. In effect, then, Jigsaw consists of a
series of two-dimensional objects in front of which the viewer can manoeuvre
in three-dimensional space.
To prepare images for the project, we can proceed in either
of two directions, namely by composition or by decomposition
(cf. attached diagram).
ADVANTAGES OF JIGSAW: Viewing large
images (up to 6 megapixels, or about poster size) across the Web has not
yet been attempted, let alone the notion of "linked" images.
The techniques we are developing will be applicable in many disciplines,
because all the technologies used are generalised; examples include:
MANIPULATING PHOTOGRAPHIC DATA WITH ZOOM:
The photographic data should be viewable from a distance, and from close
up; and, just as the human eye can wander left, right, up and down, so
the "window" in the Web browser should allow the user to do likewise.
The initial transition from clickable polygon on the 3D model, to bitmapped
image of a relief, has already been explained. But how does one then manipulate
such large bitmapped images over the Web? There are certainly difficulties.
The user with a Web browser could be at the other side of the world, and
suffering from a slow network, or a modem. To view images of thumbnail
size is easy, because these have been standardised on http://rubens.anu.edu.au
at about 12Kb, and so load quickly. Similarly, video-resolution images generated
with standard video technology (i.e. about 760 by 625 pixels in PAL) load
relatively quickly. Zoom, a demo version of which is available at http://vandyck.anu.edu.au/david,
is a program allowing the user to view in a Web browser a small version
of a large image. A detail can be selected by clicking the corners of a notional
rectangle, first top-left and then bottom-right, whereupon the software fetches and
displays the selected detail at something approaching the detail's full
size (depending, of course, on the dimensions of the detail selected).
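The selection step amounts to scaling two click positions on the small version up to pixel coordinates in the large image. The following is a sketch of that arithmetic only, not the Zoom program itself; the function name detail_box and its parameters are assumptions made for illustration.

```python
import math
from typing import Tuple

Point = Tuple[int, int]             # (x, y) in pixels
Size = Tuple[int, int]              # (width, height) in pixels
Box = Tuple[int, int, int, int]     # left, top, right, bottom in the full image


def detail_box(click_a: Point, click_b: Point,
               preview_size: Size, full_size: Size) -> Box:
    """Map two clicks on the small version to a crop box in the full-size image."""
    (ax, ay), (bx, by) = click_a, click_b
    pw, ph = preview_size
    fw, fh = full_size
    sx, sy = fw / pw, fh / ph       # scale factors from preview to full size
    # Sort the coordinates so the clicks may arrive in either order.
    left, right = sorted((ax, bx))
    top, bottom = sorted((ay, by))
    # Round outwards so the whole selection is covered, then clamp to the image.
    return (max(0, math.floor(left * sx)),
            max(0, math.floor(top * sy)),
            min(fw, math.ceil(right * sx)),
            min(fh, math.ceil(bottom * sy)))
```

For example, with a 200x150 preview of a 2000x1500 image, clicks at (50, 25) and (100, 75) select the region from (500, 250) to (1000, 750) in the full image.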
The user may then download the selected detail, or order the whole image
to be converted to one of a variety of formats, and sent by email or stored
on the server's anonymous ftp site for collection within 24 hours. This
setup works well with images of any resolution although, with larger
images, further steps are required to produce smooth transitions which
buffer what the viewer sees in the Web browser from the potentially very
large image files on disk. This is where Jigsaw comes in: the beauty of the project
is the ability to move seamlessly from the manipulable model (vectors)
via examination of thumbnail-size images of reliefs, to close examination
of high-quality photographic (bitmap) images (cf. Figure 3, attached).
TECHNICAL DESCRIPTION: SOFTWARE & MANAGEMENT:
Programs will be needed to chop up and stitch together the images (written
in-house), to translate between formats (public domain: netpbm tools),
and to provide facilities for manipulation, viewing and retrieval over
the Web; this module, called zoom, is in progress (see above).
The "visible" side of the project begins with
a 3D representation of the Borobudur stupa (or a Pompeian house, or a Permanent
Sample site, or other "spaces"), clothed in texture-mapped polygons.
Each of these polygons is a "marker" for an individual relief,
held on disk at high resolution, and mapped onto that polygon. The quality
of the product will depend upon the user's ability to view the Borobudur
reliefs in the Web browser at any distance and angle, and
upon the images being dynamically resized without loss of image quality.
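One common way to serve an image at several viewing distances is to keep a series of progressively halved versions and send the smallest one that still fills the viewer's window. The sketch below illustrates that idea under simplifying assumptions (greyscale pixels held as lists of rows, halving by averaging 2x2 blocks); it is offered as one possible approach, not as the method the project will adopt, and the function names are illustrative.

```python
from typing import List

Image = List[List[int]]     # rows of grey values


def halve(image: Image) -> Image:
    """Return the image at half size by averaging each 2x2 block of pixels."""
    h = len(image) // 2 * 2         # ignore a trailing odd row or column
    w = len(image[0]) // 2 * 2
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def build_pyramid(image: Image, min_width: int = 1) -> List[Image]:
    """Full-size image first, then successively halved versions."""
    levels = [image]
    while len(levels[-1][0]) // 2 >= min_width and len(levels[-1]) >= 2:
        levels.append(halve(levels[-1]))
    return levels


def pick_level(levels: List[Image], display_width: int) -> Image:
    """Choose the smallest level that is at least as wide as the display."""
    for level in reversed(levels):
        if len(level[0]) >= display_width:
            return level
    return levels[0]                # nothing is wide enough: send full size
```

Because the reduced versions are prepared in advance, a distant view never requires the full-size file to cross the network.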
Manipulating the large images involved requires plenty
of processing power. The smallest images dealt with would be those taken
with the Sony DXP950 Video Camera which, using a half-pixel shift to its
three CCDs, writes data to its custom frame buffer; after vertical
interpolation, these data are converted from ppm into JPEGs of about 500Kb each. We
already have over 4,000 of these. Much higher quality "in-the-field"
images are possible with the Hasselblad/Kodak combination, offering over
four times the resolution of the Sony-produced images. As with 35mm
slides, any 6x6 negatives or slides taken with the Hasselblad would
be re-photographed one quadrant at a time, giving approximately
1.6 megapixels for each quadrant - hence about 6 megapixels in all. These are
the images ready for composition when needed, as described above.
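The pixel counts quoted in this proposal can be checked with a few lines of arithmetic. The figures are those given in the text (1.6 megapixels per Sony frame, 2036x3060 for the Hasselblad/Kodak back); the variable names are illustrative, and the sketch ignores any overlap between neighbouring quadrant shots, which would reduce the composite somewhat.

```python
# Figures taken from the text of the proposal.
SONY_PIXELS = 1_600_000                     # "1.6 megapixels" per frame
HASSELBLAD_W, HASSELBLAD_H = 2036, 3060     # Hasselblad/Kodak digital back

hasselblad_pixels = HASSELBLAD_W * HASSELBLAD_H     # 6,230,160
quadrant_composite = 4 * SONY_PIXELS                # 6,400,000 before any overlap is trimmed

print(f"Hasselblad/Kodak frame: {hasselblad_pixels / 1e6:.2f} megapixels")
print(f"Four Sony quadrants:    {quadrant_composite / 1e6:.2f} megapixels")
```

Both routes therefore arrive at roughly the same figure of about 6 megapixels, one by decomposition of a single large frame and the other by composition of four smaller ones.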