The Institute's research programme is based around interrelated focus areas of expertise, providing the underlying research context for specific projects.
Current projects
CLEF
COGENT
I-GUIDE
NECA
SEMANTIC MINING
TUNA
WYSIWYM
Recent Projects
AGILE
CLIME
CONCEDE
EUROMAP
GNOME
GREG
ICONOCLAST
METAPHON
PILLS
RAGS
SENSEVAL
WASPS
Current projects | |||
| CLEF home page |
CLEF is a three-year project funded by the Medical Research Council, commencing in October 2002.
CLEF aims to create a scalable, generic architecture for capture and management of clinical
and other descriptive data, integrated with genomic data and images and linked to literature
and web resources. The project consortium is led by the University of Manchester Medical Informatics Group and brings
together teams from UCL, University of Sheffield, Royal Marsden Hospital and University of Cambridge.
At ITRI, we focus on providing natural language generation technologies
for assisting the creation and management of electronic patient records and for generating
summaries of clinical data.
| ||
| COGENT home page |
Natural Language Generation (NLG) technology has reached a level of
maturity where applied systems exist in a range of specialised
real-world domains (such as weather bulletins, software documentation,
health and legal advice and stock market movements). However,
developing such systems currently involves hand-crafting and
special-purpose tuning by NLG experts which is non-portable,
non-scalable, time-consuming and expensive. Wider deployment of
language generation requires more generally applicable and reusable
NLG components based on wide-coverage grammars, but at present,
effective techniques for such wide-coverage generation are not well
understood. This three-year project will systematically investigate
the characteristics of wide-coverage generation and develop
reflective techniques for controlling it effectively. As well as
furthering our understanding of wide-coverage generation, the project
will deliver a substantial and novel resource to support future
research in this area, and practical implementations of wide-coverage
controllable generators.
| ||
| I-GUIDE home page |
i-Guide is a research project under the EPOCH Network of Excellence (www.epoch-net.org). EPOCH is funded by
the European Commission under the Sixth Framework Programme, project no.
507382. The aim of the i-Guide project is to increase the accessibility
of cultural heritage sites by providing a virtual replica with a guide
who presents information customised to the visitor's profile.
| ||
| NECA home page |
NECA is a 2.5-year project funded by the European Commission under
their Information Society Technologies (IST) programme. The NECA
consortium brings the ITRI together with the Universities of Vienna and
Saarbruecken, with the German Research Institute for Artificial
Intelligence (DFKI), and with the companies Freeserve (UK) and Sysis
(Austria). The objective of the project is to develop a new, more
sophisticated generation of conversational agents: on-line beings which
are able to speak and act like humans. The project, which started
in October 2001, focuses on communication between animated characters
that exhibit credible personality traits and affective behaviour.
| ||
| SEMANTIC MINING home page |
Semantic Interoperability and Data Mining in Biomedicine is a Network of
Excellence (NoE) funded by the European Commission under Framework 6.
The general objective of the network is to bridge gaps in European
research infrastructure and to facilitate cross-fertilisation between
scientific disciplines such as computer science, system engineering and
medical/clinical research. The long-term goal of the network will be the
development of generic methods and tools supporting critical tasks in
medical and biomedical informatics, such as data mining, knowledge
discovery, knowledge representation, abstraction and indexing of
information, semantic-based information retrieval in a complex and
high-dimensional information space, and knowledge-based adaptive systems
for provision of decision support for dissemination of evidence-based
medicine. There are 26 work packages in this project. ITRI is involved
in a number of them, including the mobility programmes (WP6), the
workshop/tutorial on Natural Language Processing in the Biomedical
domain (WP13), the workshop/tutorial on Text Mining and Information
Retrieval (WP15), the research activity on multilingual medical
dictionary (WP20) and the research activity on Ontology Engineering
(WP21).
| ||
| TUNA home page |
TUNA is a research project funded by the UK's Engineering and Physical
Sciences Research Council (EPSRC). Natural Language Generation programs
generate text from an underlying Knowledge Base. It can be difficult to find a
mapping from the information in the Knowledge Base to the words in a sentence,
for example, when the Knowledge Base uses 'names' such as '#Jones083' that a
hearer/reader does not understand, or has concepts which do not have their own
names (e.g., a specific tree or a chair). In all such cases, the program has
to "invent" a description that enables the reader to identify the referent.
Existing algorithms tend to focus on one particular class of referring
expressions, for example conjunctions of atomic or relational properties
(e.g., 'the black dog', 'the book on the table'). Our research is aimed at
designing and implementing a new algorithm that generates appropriate descriptions
in a far greater variety of situations. The algorithm will be more
complete, and generate expressions that are more appropriate
because it will be based on empirical studies involving corpora and controlled
experiments. Thus the project combines (psycho)linguistic, computational and
logical challenges. The project is to start in the Summer of 2003 (to run for
3 years).
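As a rough illustration of the restricted class mentioned above (conjunctions of atomic properties), the sketch below greedily selects properties of a referent until no other entity matches. It is an assumption-laden toy, not the TUNA algorithm: the entity identifiers, the property names and the `describe` function are all invented for this example.

```python
# Illustrative sketch only: the entities, properties and function name are
# hypothetical, and this greedy strategy is not the TUNA project's algorithm.

def describe(target, domain, preferred_attributes):
    """Pick attribute-value pairs of `target` until no other entity matches.

    `domain` maps entity ids to dicts of atomic properties.
    Returns the chosen properties, or None if the target cannot be
    distinguished by a conjunction of its atomic properties.
    """
    target_props = domain[target]
    # Distractors: every other entity the reader might confuse with the target.
    distractors = {e for e in domain if e != target}
    description = {}
    for attr in preferred_attributes:
        if not distractors:
            break
        if attr not in target_props:
            continue
        value = target_props[attr]
        ruled_out = {e for e in distractors if domain[e].get(attr) != value}
        # Keep a property only if it removes at least one distractor.
        if ruled_out:
            description[attr] = value
            distractors -= ruled_out
    return description if not distractors else None


# A toy Knowledge Base whose internal names mean nothing to a reader.
domain = {
    "#Dog017": {"type": "dog", "colour": "black"},
    "#Dog022": {"type": "dog", "colour": "white"},
    "#Cat004": {"type": "cat", "colour": "black"},
}
result = describe("#Dog017", domain, ["type", "colour"])
print(result)
```

Here the selected properties correspond to a description such as 'the black dog'. A conjunction-only approach of this kind returns nothing when two entities share all their listed properties, which is one of the situations in which a richer class of descriptions is needed.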
| ||
| WYSIWYM home page |
WYSIWYM, which stands for "What You See Is What You Meant", is a
method for editing knowledge bases. Most KB editors require the user
to have specialist training -- either in knowledge representation or
the use of graphical editors. WYSIWYM editing is specially developed
to avoid this problem by exploiting automatic text generation so that
the user interacts with an ordinary natural language document rather
than a relatively unfamiliar diagram.
Recently completed projects | ||
| AGILE home page |
This project built on the Institute's existing DRAFTER
system, for automatically generating software manuals in multiple languages
without the need for translation. The new grant added three new languages
(Bulgarian, Czech and Russian) to the existing two (English and French). The
project involved the collaboration of the Bulgarian Academy of Science,
Charles University in Prague, the University of Saarbrucken, and the Russian
Research Institute for Artificial Intelligence.
| ||
| CLIME home page |
The CLIME (Computerised Legal Information Management and
Explanation) project is a three-year EU Esprit project. It started on 1
January 1998. The aim of the project is the development of the CLIME system: a
generic software infrastructure for supporting users in consulting distributed
archives and repositories of regulatory and legal information. The project will
deliver a demonstrator in the area of maritime law. At ITRI, we focus on
developing natural language technology to facilitate the formulation of queries
to legal information systems and the generation of multilingual natural language
answers to legal queries. The consortium is led by British Maritime
Technology Ltd. (UK), and involves the University of Brighton (UK), the
University of Amsterdam (The Netherlands), Bureau Veritas (France) and
TXT Ingegneria Informatica (Italy).
| ||
| CONCEDE home page |
This project is developing medium-sized electronic
dictionaries for six Central European languages (Bulgarian, Czech, Estonian,
Hungarian, Romanian and Slovene), drawing on recently developed standards for
dictionary encoding and a sister project which is currently developing parallel
aligned corpora for the six languages. This work also builds on related work in
this area carried out in the Institute in the SEAL project. Collaborators in
CONCEDE are XRCE, Grenoble; Vassar College, New York; Bulgarian Academy of
Science, Sofia; Charles University, Prague; University of Tartu, Estonia;
Hungarian Academy of Science, Budapest; Research Institute for Informatics,
Bucharest; and the Josef Stefan Institute, Slovenia.
| ||
| EUROMAP home page |
Euromap Language Technologies is an EC-supported initiative dedicated
to promoting greater awareness and faster take-up of Human Language
Technologies within Europe. The project acts as a central resource
and marketing support unit providing information to all communities
involved in Human Language Technology from researchers and developers
to suppliers and users. The aim is to create communities of interest
on a national and cross-border basis.
| ||
| GNOME home page |
Choice of nominal expressions greatly affects the
comprehensibility of a text. For example, a text will be clumsier and
less felicitous if the chosen definite descriptions refer to knowledge
that the reader does not have, if the same proper name is used
repeatedly, or if pronouns are ambiguous or misleading. Systems for
automatically generating texts need to be guided in a principled way
to avoid such problems. In the GNOME (Generating Nominal Expressions)
project, we developed general purpose computational tools for
generating these kinds of expressions. The project also involved
carrying out psycholinguistic studies on how people produce and
understand nominal expressions. This was a joint project with the HCRC
at the University of Edinburgh, supported with funding from the EPSRC.
| ||
| GREG home page |
GREG (a Georgian, Russian, English, German Valency Lexicon for Natural
Language Processing) is an INTAS-Georgia Project with partners
University of Stuttgart (co-ordinators), University of Brighton,
Tbilisi State University and the Institute of Linguistics, Georgian
Academy of Sciences. The project will develop a multilingual valency
lexicon for the four languages, suitable for use in language
engineering applications. The lexicon will contain semantic and
syntactic specifications for 1000 Georgian verbs and their English,
Russian and German counterparts. Semantic valency will be described
with reference to thematic, or case-frame, roles, as described by
Fillmore, Halliday and others; syntactic valency, in terms of
subcategorisation frames. The resulting lexicon will be made
available in TEI-conformant SGML or XML. The project started on 1
March 1999.
| ||
| ICONOCLAST home page |
(Integrating CONstraints On LAyout and STyle) This project extended ongoing work on automatic text generation by including graphical aspects of text such as punctuation, layout and style within a constraint-based framework. The project was funded by EPSRC and involved two industrial partners: Information Design Unit and Multilingual Technology Ltd.
| ||
| RAGS home page |
Although the general problem of Natural Language
Generation (NLG) remains far from completely understood, the field is
starting to produce systems which achieve successful practical
deployment. However, a significant barrier to wider exploitation is
the lack of a standard view of the generation process as a whole,
within which more specialised research can be embedded and against
which whole systems can be compared. RAGS (Reference Architecture for
Generation Systems) aims to provide such a view: a 'reference
architecture' for natural language generation systems based on an
emerging consensus on what such systems should be like. The work will refine this consensus into a more explicit reference architecture specification, identifying principal components and data representations. It will produce examples for each of the data interfaces and sample implementations of the processing modules. Although the resulting architecture may not be a perfect fit for all NLG applications, the intention is that it will be sufficiently general to facilitate sharing of resources and comparative evaluation of different approaches. This is a joint project with the department of AI at the University of Edinburgh, supported with funding from the EPSRC.
| ||
| METAPHON home page |
METAPHON is a short (6 months) project funded by the ESRC to complete a small pilot study in defining metaphonemes - that is, cross-linguistic phoneme correspondences in related languages. A small study of the vowel phonemes in English, German, Dutch and Danish was undertaken previously and the current project is intended to do the same with the consonants of these languages.
| ||
| PILLS home page |
This is a European preparatory action project, started 1 January 2001,
which brings together the University of Brighton (UK), the University of
Freiburg (Germany) and Berlitz GlobalNet (UK). Our aim is to facilitate the
production of healthcare information documents by developing a multilingual
authoring tool. During this one-year project, we will assess the need for and
impact of such a tool on the pharmaceutical market and develop a prototype
that demonstrates the feasibility of this enterprise.
| ||
| SENSEVAL home page |
This is a series of evaluation exercises for Word
Sense Disambiguation (WSD) systems. The project is co-ordinated at the
University of Brighton and is under the auspices of ACL-SIGLEX (ACL Special
Interest Group on the Lexicon) and EURALEX (European Association for
Lexicography). It brings together researchers from three continents for
focused analysis of current issues in WSD.
| ||
| WASPS home page |
WASPS was an EPSRC-funded project that explored the
synergy between the lexicographer's task of identifying and describing word
senses, and the computational task of word sense disambiguation (WSD). The
project was completed in June 2002.
| ||
| DRAFTER home page |
(DRafting Assistant For TEchnical wRiters). This project aimed
to use multilingual generation technology to support the production of English
and French end-user manuals for software products. It was funded by the
EPSRC, and involved two industrial partners: Integral Solutions Limited (ISL)
and Praetorius Ltd. The project was successfully completed in March 1997.
| ||
| GIST home page |
(Generating InStructional Text). This project aimed to
generate procedural administrative texts in English, German and Italian to
assist authors in public administration. It was funded by the European Union's
LRE Programme and was a collaboration between the ITRI and the Austrian
Institute for Artificial Intelligence in Vienna, the Institute for Research in
Science and Technology in Trento, Italy, the University of Madrid, Spain,
Quinary Spa. in Milan, Italy, and two Italian user groups: the Italian Pension
Service and the translation department of the trilingual province of Trentino,
Italy. The project was successfully completed in August 1996.
| ||
| SEAL home page |
(Structural Enhancement of Automatically-acquired Lexicons).
This project looked at ways of using current lexical resources as the basis for
development of application-specific lexicons. This research was funded by the
EPSRC, and was successfully completed in June 1998.
Maintained by the ITRI webmaster. Last updated 23 November 2004. © Information Technology Research Institute