AAAI Symposium: Using Layout for the Generation, Understanding or Retrieval of Documents (Cape Cod, 5-7 November 1999)

The symposium brought together 30 people interested in layout from various points of view. Four main themes were addressed.

The opening talk was given by an invited speaker, the psychologist Pat Wright (Cardiff University, UK), a leading expert on the ways in which document design can affect readability. Through examples, Pat Wright surveyed the significance of layout from the reader's point of view. She pointed out that even without reading the words, a reader can often identify the genre of a document from its graphical design. On taking a closer look, the reader may find visual aids to navigation which save time and effort; this factor might be crucial in determining whether the reader proceeds to read sections of the document, or gives up straight away. During detailed reading, layout influences the reader's assumptions about the role of each part of the text, grouping items together, highlighting, or marking items with a special purpose such as headings and captions. These crucial signals can be lost when a document is transformed from one medium to another, e.g. when a formatted article is transmitted as a text file.

Throughout the symposium, our discussion benefited from the presence of Rob Waller (Information Design Unit, UK), who combines experience in the art of document design with a long-standing theoretical interest. In his presentation, Rob Waller illustrated some reasons why document design so often fails, including a lack of reader feedback, poor command of the "toolbox" available to designers, and genre dissonance due to insufficient cultural awareness. Both Rob Waller and Pat Wright stressed during discussion that layout conveys meanings beyond those directly related to the text, just as tone of voice in conversation can augment, or even negate, the literal meaning of what is said.

Several speakers were interested in the relationship between layout and rhetorical organization. Christophe Luc and Marie-Paule Pery-Woodley (IRIT, France) analysed an example in which a complex rhetorical structure is expressed partly through a layout device (enumerated list) and partly through linguistic "discourse markers" (words like "since" and "however"). Judy Delin (University of Stirling, UK) showed an example of rhetorical relations across modalities, analysing in particular the communicative function of illustrations in a bird book; Nancy Green (University of North Carolina, USA) also addressed cross-modal relationships in a system that generates text integrated with graphics (tables, bar-charts etc.).

Another intersection of interests was the question of mark-up: how to mark up texts for layout, and how existing mark-up in HTML or XML could be exploited for automatic formatting or for information extraction. James Curran (University of Sydney, Australia) described an impressive system that learns rules for deriving an XML content structure from a source marked up in HTML; Alexander Kroener (DFKI, Germany) and Isobel Cruz (Worcester Polytechnic Institute, USA) addressed problems of automatic formatting. There was general interest in the project of building corpora marked up for layout: Nadjet Bouayad-Agha (University of Brighton, UK) described some problems in marking up a corpus of Patient Information Leaflets, and Matthew Hurst (University of Edinburgh, UK) discussed the special difficulties posed by documents that include tables. Thomas Kieninger (DFKI, Germany) presented an interesting system that automatically assigns tabular structure to a source document scanned in through Optical Character Recognition, or alternatively to a source laid out in ASCII with no mark-up.

All presentations at the symposium, including the invited talk, will be preserved in an AAAI Technical Report to be published in 2000. It was also decided that a web-site would be set up, giving contact details as well as links to publications and other resources concerning layout; our thanks are due to John Willmore (BIZINT Solutions, USA), who offered to host this site, and to Matthew Hurst, who will maintain it.

Donia Scott and Richard Power (co-chairs)
ITRI, University of Brighton, UK