April 29, 2004

DigiHumanAssignment: Real Versus Unreal

Since the class's visit to MITH and our introduction to Paley's TextArc, I have been struck by the "neat" factor of the site's/software's/viewer's visualization of a text and by Dr. Susan Schreibman's description of TextArc as a tool that "deforms the text to understand it in a different way."

Building on Schreibman's presentation of MITH's work, our class scratched the surface of what it means to work in the digital humanities, to interface with the digital humanities, and perhaps most importantly to do digital humanities. I am interested in the intersection of the problems and limitations of studying/producing analog humanities and the problems and limitations of studying/producing digital humanities. I think Schreibman (cum McGann cum Unsworth) is right that a big bump in the information superhighway is the retrieval of data, of information, and by extension the front end, the user end, of retrieval.

Considering the five Blake sites, I've picked two poles on the continuum of digital texts: Lessing J. Rosenwald Collection's Songs of Innocence and of Experience and TextArc's Songs of Innocence and of Experience.

On the one hand, the Rosenwald's digitization of Blake's Songs is extremely simple: it offers two possible versions (1794 and 1826), simple navigation (a quick jump-to-page button or very clear PREV IMAGE and NEXT IMAGE clickable links), and a simple, no-frills, single-window layout. The strength of the Rosenwald digitization is its photorealism. Each page, each plate (including the outer cover and end papers) is a high-quality, high-definition color scan of the book itself. Each image is offered in both JPEG and TIFF formats. The advantage of the photographic plate is the preservation of the richness of the illustrations, the subtleties of hue, and the materiality of the artifact (e.g. you can see the grain of the paper, signs of wear, brush strokes). It is in photorealism that composition is preserved, that the interconnectivity between writing and illustration is preserved, and that Blake as "author and printer" is preserved.

The Rosenwald digitization assumes the stance that the most important data to foreground and preserve is the Songs as book, as codex, as object. The scans are like specimen photos. Each scan preserves the bookness of the text. Each scan is of facing verso and recto pages. The image is not cropped. You can still see the edges of the cover, the separate pages, the central binding. The digital book is meant to be flipped through like a paper book. Here the digitization is literal and conservative rather than interpretive or transformational. Though the reader can see what the book looks like, that is the extent of the depth of the digital text. Here retrieval is flat. You cannot search the text (or the images). There is no additional annotation, added material, or useful hypertextuality.

On the other hand, TextArc's digitization of the Songs offers a startlingly different presentation of the text. By the site's own admission, TextArc "is a visual representation of a text—the entire text (twice!) on a single page. A funny combination of an index, concordance, and summary; it uses the viewer's eye to help uncover meaning." The codex as physical artifact is completely dispensed with. The text is stripped, miniaturized, and flung into a visually satisfying swash, while individual words are arranged like a nebula of importance and iterations. The usual text is gone, and what is left is a visualization that gives the reader a new perspective, an artificial entrance into the text. TextArc's strength lies in the unnatural way it presents and manipulates the text, which allows the reader to notice how different words are used, how they are spread through the text, how they are connected, and how important they are to the overall text. It takes the trusty concordance to a new level.

However, TextArc treats texts as just that -- raw, unformatted, pageless text. If a text contains images, they are rendered unimportant. Composition of the page becomes meaningless. Changes and inflections in typography (or handwriting) become flattened. In this case, Blake's Songs loses half or perhaps more of its impact as both a written text and a visual one. And though TextArc deforms the text and gives the reader an unusual interface, the text can still be read very traditionally (from beginning to end, sequentially, linearly). There is little searchability. Furthermore, the interface is not wholly intuitive, navigation is menu-driven, and the reader is left with a "what do I do with this now?" feeling after the novelty of the text wears off. TextArc serves as a different lens through which to see and close-read a text but does not function well if the reader's needs are more codex-bound.

Finally, both the Rosenwald Collection and TextArc have minimum computer hardware and software requirements. The Rosenwald site contains large image files (particularly the TIFFs or the PDFs). TextArc requires a Java-enabled browser, a fast connection, and a speedy computer; otherwise the application is slow to load and slow to respond. Technological limitations on both the server and user sides continue to be a material issue affecting what can be digitized, what can be collected and "housed" online, and ultimately what can be distributed, seen, and manipulated.

In the end, digital texts must take the best of both worlds and offer a blending of the photoreal and the eye-opening unreal. I suppose sites like the William Blake Archive fall somewhere in between Rosenwald and TextArc. Or perhaps each variant, each push and pull on the digitized text needs to be embraced as part of the diversity online. Each has something to offer. I suppose then the ultimate digital text would be a metatext allowing a viewer/reader/user to see and manipulate all of these versions simultaneously (at least more effectively and efficiently than we do now). To reiterate McGann, "an edition is 'hyper' exactly because its structure is such that it seeks to preserve the authority of all the units that comprise its documentary arrays. In this respect a hyperedition resembles that fabulous circle whose center is everywhere and whose circumference is nowhere."

Posted by Ed at April 29, 2004 11:41 PM
Comments

TextArc, when used for Blake, also raises the question "what, exactly, constitutes the text?" Considering that the poems were originally presented, not only *with* illustrations, but *inextricable from* illustration, I question whether anything but a photorealistic representation would actually be a representation of "the text." I think TextArc is definitely valuable, even for Blake, but I'm not sure that what's being output is actually an analysis of "Songs of Innocence and of Experience."

Posted by: Jess at May 1, 2004 02:13 PM

Ed, rich and provocative as always. One variable in all this that I've written about is the computational distinction between text and image--ironically, the striking visual displays that TextArc yields are a function of the character-based manipulation made possible by the data's unequivocal status as text.

What would visual deformation look like? McGann offers one example here:

http://jefferson.village.virginia.edu/%7Ejjm2f/chum.html

Posted by: Matt at May 9, 2004 06:32 PM