Research at the ITRI

The Institute's research programme is based around interrelated focus areas of expertise, providing the underlying research context for specific projects.

Current projects
CLEF
COGENT
I-GUIDE
NECA
SEMANTIC MINING
TUNA
WYSIWYM

Recent projects
AGILE
CLIME
CONCEDE
EUROMAP
GNOME
GREG
ICONOCLAST
METAPHON
PILLS
RAGS
SENSEVAL
WASPS

Previous projects


Current projects

CLEF

home page
CLEF is a 3-year project funded by the Medical Research Council, commencing October 2002. CLEF aims to create a scalable, generic architecture for capture and management of clinical and other descriptive data, integrated with genomic data and images and linked to literature and web resources. The project consortium is led by the University of Manchester Medical Informatics Group and brings together teams from UCL, University of Sheffield, Royal Marsden Hospital and University of Cambridge. At ITRI, we focus on providing natural language generation technologies for assisting the creation and management of electronic patient records and for generating summaries of clinical data.

COGENT

home page
Natural Language Generation (NLG) technology has reached a level of maturity where applied systems exist in a range of specialised real-world domains (such as weather bulletins, software documentation, health and legal advice and stock market movements). However, developing such systems currently involves hand-crafting and special-purpose tuning by NLG experts, which is non-portable, non-scalable, time-consuming and expensive. Wider deployment of language generation requires more generally applicable and reusable NLG components based on wide-coverage grammars, but at present, effective techniques for such wide-coverage generation are not well understood. This three-year project will systematically investigate the characteristics of wide-coverage generation and develop reflective techniques for controlling it effectively. As well as furthering our understanding of wide-coverage generation, the project will deliver a substantial and novel resource to support future research in this area, and practical implementations of wide-coverage controllable generators.

I-GUIDE

home page
i-Guide is a research project under the EPOCH Network of Excellence (www.epoch-net.org). EPOCH is funded by the European Commission under the Sixth Framework Programme, project no. 507382. The aim of the i-Guide project is to increase the accessibility of cultural heritage sites by providing a virtual replica with a guide who presents information customised to the visitor's profile.

NECA

home page
NECA is a 2.5-year project funded by the European Commission under their Information Society Technologies (IST) programme. The NECA consortium brings the ITRI together with the Universities of Vienna and Saarbruecken, with the German Research Institute for Artificial Intelligence (DFKI), and with the companies Freeserve (UK) and Sysis (Austria). The objective of the project is to develop a new, more sophisticated generation of conversational agents: on-line beings which are able to speak and act like humans. The project, which started in October 2001, focuses on communication between animated characters that exhibit credible personality traits and affective behaviour.

SEMANTIC MINING

home page
Semantic Interoperability and Data Mining in Biomedicine is a Network of Excellence (NoE) funded by the European Commission under Framework 6. The general objective of the network is to bridge gaps in European research infrastructure and to facilitate cross-fertilisation between scientific disciplines such as computer science, system engineering and medical/clinical research. The long-term goal of the network will be the development of generic methods and tools supporting critical tasks in medical and biomedical informatics, such as data mining, knowledge discovery, knowledge representation, abstraction and indexing of information, semantic-based information retrieval in a complex and high-dimensional information space, and knowledge-based adaptive systems for provision of decision support for dissemination of evidence-based medicine. There are 26 work packages in this project. ITRI is involved in a number of them, including the mobility programmes (WP6), the workshop/tutorial on Natural Language Processing in the Biomedical domain (WP13), the workshop/tutorial on Text Mining and Information Retrieval (WP15), the research activity on a multilingual medical dictionary (WP20) and the research activity on Ontology Engineering (WP21).

TUNA

home page
TUNA is a research project funded by the UK's Engineering and Physical Sciences Research Council (EPSRC). Natural Language Generation programs generate text from an underlying Knowledge Base. It can be difficult to find a mapping from the information in the Knowledge Base to the words in a sentence, for example, when the Knowledge Base uses 'names' such as '#Jones083' that a hearer/reader does not understand, or has concepts which do not have their own names (e.g., a specific tree or a chair). In all such cases, the program has to "invent" a description that enables the reader to identify the referent. Existing algorithms tend to focus on one particular class of referring expressions, for example conjunctions of atomic or relational properties (e.g., 'the black dog', 'the book on the table'). Our research is aimed at designing and implementing a new algorithm that generates appropriate descriptions in a far greater variety of situations. The algorithm will be more complete, and generate expressions that are more appropriate because it will be based on empirical studies involving corpora and controlled experiments. Thus the project combines (psycho)linguistic, computational and logical challenges. The project is to start in the Summer of 2003 (to run for 3 years).
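
To make the referring-expression problem concrete, the following is a minimal sketch of the kind of existing approach mentioned above, one that builds a conjunction of atomic properties. It is not the TUNA algorithm. The knowledge base, its identifiers (such as '#Dog017'), the attribute names and the preference order are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical knowledge base: internal identifiers mapped to atomic properties.
KB: Dict[str, Dict[str, str]] = {
    "#Dog017": {"type": "dog", "colour": "black", "size": "large"},
    "#Dog018": {"type": "dog", "colour": "white", "size": "large"},
    "#Cat004": {"type": "cat", "colour": "black", "size": "small"},
}

# Assumed fixed order in which attributes are tried.
PREFERRED = ["type", "colour", "size"]


def describe(
    target: str,
    kb: Dict[str, Dict[str, str]] = KB,
    preferred: List[str] = PREFERRED,
) -> Optional[List[Tuple[str, str]]]:
    """Select (attribute, value) pairs that single out the target, or None."""
    distractors = {entity for entity in kb if entity != target}
    chosen: List[Tuple[str, str]] = []
    for attr in preferred:
        if not distractors:
            break
        value = kb[target].get(attr)
        if value is None:
            continue
        # Keep a property only if it rules out at least one remaining distractor.
        ruled_out = {e for e in distractors if kb[e].get(attr) != value}
        if ruled_out:
            chosen.append((attr, value))
            distractors -= ruled_out
    return chosen if not distractors else None


def realise(props: List[Tuple[str, str]]) -> str:
    """Turn the selected properties into a simple English noun phrase."""
    values = dict(props)
    head = values.pop("type", "thing")
    modifiers = [values[a] for a in ("size", "colour") if a in values]
    return " ".join(["the", *modifiers, head])


if __name__ == "__main__":
    print(realise(describe("#Dog017")))
```

With this toy knowledge base, the identifier '#Dog017' is described by its type and colour, because the type rules out the cat and the colour rules out the other dog. When no combination of the listed properties distinguishes the target, the sketch returns None, which is one of the situations a more complete algorithm would need to handle.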

WYSIWYM

home page
WYSIWYM, which stands for "What you see is What you Meant", is a method for editing knowledge bases. Most KB editors require the user to have specialist training -- either in knowledge representation or the use of graphical editors. WYSIWYM editing is specially developed to avoid this problem by exploiting automatic text generation so that the user interacts with an ordinary natural language document rather than a relatively unfamiliar diagram.

Recently completed projects

AGILE

home page
This project built on the Institute's existing DRAFTER system, for automatically generating software manuals in multiple languages without the need for translation. The new grant added three new languages (Bulgarian, Czech and Russian) to the existing two (English and French). The project involved the collaboration of the Bulgarian Academy of Science, Charles University in Prague, the University of Saarbrucken, and the Russian Research Institute for Artificial Intelligence.

CLIME

home page
The CLIME (Computerised Legal Information Management and Explanation) project is a three-year EU Esprit Project. It started on 1 January 1998. The aim of the project is the development of the CLIME system: a generic software infrastructure for supporting users in consulting distributed archives and repositories of regulatory and legal information. The project will deliver a demonstrator in the area of maritime law. At ITRI, we focus on developing natural language technology to facilitate the formulation of queries to legal information systems and the generation of multilingual natural language answers to legal queries. The consortium is led by British Maritime Technology Ltd. (UK), and involves the University of Brighton (UK), the University of Amsterdam (The Netherlands), Bureau Veritas (France) and TXT Ingegneria Informatica (Italy).

CONCEDE

home page
This project is developing medium-sized electronic dictionaries for six Central European languages (Bulgarian, Czech, Estonian, Hungarian, Romanian and Slovene), drawing on recently developed standards for dictionary encoding and a sister project which is currently developing parallel aligned corpora for the six languages. This work also builds on related work in this area carried out in the Institute in the SEAL project. Collaborators in CONCEDE are XRCE, Grenoble; Vassar College, New York; Bulgarian Academy of Science, Sofia; Charles University, Prague; University of Tartu, Estonia; Hungarian Academy of Science, Budapest; Research Institute for Informatics, Bucharest; and the Josef Stefan Institute, Slovenia.

EUROMAP

home page
Euromap Language Technologies is an EC-supported initiative dedicated to promoting greater awareness and faster take-up of Human Language Technologies within Europe. The project acts as a central resource and marketing support unit providing information to all communities involved in Human Language Technology, from researchers and developers to suppliers and users. The aim is to create communities of interest on a national and cross-border basis.

GNOME

home page
Choice of nominal expressions greatly affects the comprehensibility of a text. For example, a text will be clumsier and less felicitous if the chosen definite descriptions refer to knowledge that the reader does not have, if the same proper name is used repeatedly, or if pronouns are ambiguous or misleading. Systems for automatically generating texts need to be guided in a principled way to avoid such problems. In the GNOME (Generating Nominal Expressions) project, we developed general purpose computational tools for generating these kinds of expressions. The project also involved carrying out psycholinguistic studies on how people produce and understand nominal expressions. This was a joint project with the HCRC at the University of Edinburgh, supported with funding from the EPSRC.

GREG

home page
GREG (a Georgian, Russian, English, German Valency Lexicon for Natural Language Processing) is an INTAS-Georgia Project with partners University of Stuttgart (co-ordinators), University of Brighton, Tbilisi State University and the Institute of Linguistics, Georgian Academy of Sciences. The project will develop a multilingual valency lexicon for the four languages, suitable for use in language engineering applications. The lexicon will contain semantic and syntactic specifications for 1000 Georgian verbs and their English, Russian and German counterparts. Semantic valency will be described with reference to thematic, or case-frame, roles, as described by Fillmore, Halliday and others; syntactic valency, in terms of subcategorisation frames. The resulting lexicon will be made available in TEI-conformant SGML or XML. The project started on 1 March 1999.

ICONOCLAST

home page
(Integrating CONstraints On LAyout and STyle) This project extended ongoing work on automatic text generation by including graphical aspects of text such as punctuation, layout and style within a constraint-based framework. The project was funded by EPSRC and involved two industrial partners: Information Design Unit and Multilingual Technology Ltd.

RAGS

home page
Although the general problem of Natural Language Generation (NLG) remains far from completely understood, the field is starting to produce systems which achieve successful practical deployment. However, a significant barrier to wider exploitation is the lack of a standard view of the generation process as a whole, within which more specialised research can be embedded and against which whole systems can be compared. RAGS (Reference Architecture for Generation Systems) aims to provide such a view: a 'reference architecture' for natural language generation systems based on an emerging consensus on what such systems should be like.

The work will refine this consensus into a more explicit reference architecture specification, identifying principal components and data representations. It will produce examples for each of the data interfaces and sample implementations of the processing modules. Although the resulting architecture may not be a perfect fit for all NLG applications, the intention is that it will be sufficiently general to facilitate sharing of resources and comparative evaluation of different approaches. This is a joint project with the Department of AI at the University of Edinburgh, supported with funding from the EPSRC.

METAPHON

home page
METAPHON is a short (6-month) project funded by the ESRC to complete a small pilot study in defining metaphonemes, that is, cross-linguistic phoneme correspondences in related languages. A small study of the vowel phonemes in English, German, Dutch and Danish was undertaken previously and the current project is intended to do the same with the consonants of these languages.

PILLS

home page
This is a European preparatory action project, started 1 January 2001, which brings together the University of Brighton (UK), the University of Freiburg (Germany) and Berlitz GlobalNet (UK). Our aim is to facilitate the production of healthcare information documents by developing a multilingual authoring tool. During this one-year project, we will assess the need for such a tool and its impact on the pharmaceutical market, and develop a prototype that demonstrates the feasibility of this enterprise.

SENSEVAL

home page
This is a series of evaluation exercises for Word Sense Disambiguation (WSD) systems. The project is co-ordinated at the University of Brighton and is under the auspices of ACL-SIGLEX (ACL Special Interest Group on the Lexicon) and EURALEX (European Association for Lexicography). It brings together researchers from three continents for focused analysis of current issues in WSD.

WASPS

home page
WASPS was an EPSRC-funded project which explored the synergy between the lexicographer's task of identifying and describing word senses, and the computational task of word sense disambiguation (WSD). The project was completed in June 2002.

Previous projects
DRAFTER
GIST
SEAL



DRAFTER

home page
(DRafting Assistant For TEchnical wRiters). This project aimed to use multilingual generation technology to support the production of English and French end-user manuals for software products. It was funded by the EPSRC, and involved two industrial partners: Integral Solutions Limited (ISL) and Praetorius Ltd. The project was successfully completed in March 1997.

GIST

home page
(Generating InStructional Text). This project aimed to generate procedural administrative texts in English, German and Italian to assist authors in public administration. It was funded by the European Union's LRE Programme and was a collaboration between the ITRI and the Austrian Institute for Artificial Intelligence in Vienna, the Institute for Research in Science and Technology in Trento, Italy, the University of Madrid, Spain, Quinary Spa. in Milan, Italy, and two Italian user groups: the Italian Pension Service and the translation department of the trilingual province of Trentino, Italy. The project was successfully completed in August 1996.

SEAL

home page
(Structural Enhancement of Automatically-acquired Lexicons). This project looked at ways of using current lexical resources as the basis for development of application-specific lexicons. This research was funded by the EPSRC, and was successfully completed in June 1998.


Maintained by the ITRI webmaster.
Last updated 23 November 2004

©Information Technology Research Institute
