Curriculum Vitae - Roger Evans

R E S E A R C H P R O J E C T S

As principal investigator or manager

DATR
1987-present
DATR: a language for lexical representation.
My role: co-developer, with Gerald Gazdar
Further information: http://www.datr.org

Since 1987, I have been involved with research in collaboration with Gerald Gazdar on the problems of knowledge representation in a natural language lexicon. This research has resulted in the language, DATR, which is a formal non-monotonic logic for the concise specification of lexical information. DATR has attracted considerable attention in the Computational Linguistics community and has been adopted by several major groups across Europe. My own early research focused on the formal foundations of the language and explorations of its scope for linguistically interesting descriptions. More recently I have been using it as a vehicle for the study of large scale lexical phenomena, such as lexical rules and multilingual representation, investigating ways to embed DATR inheritance semantics into XML documents, and trying to embed probabilistic-style information into the DATR language.

This work is not funded directly by external grants, but DATR has been used, and to some extent developed, in the following research projects: GREG, Text Mining Demonstrator, CLIME, MUC5, POETIC, SPL.

OpenPoplog
2003-present
Development of the OpenPoplog system.
My role: co-developer
Further information: http://www.openpoplog.org

In addition to my main research interest in Computational Linguistics, I have a secondary interest in software tools and programming environments. Throughout my time at Sussex, I have had a keen interest in the Poplog programming system, an advanced multi-language environment developed within Cognitive and Computing Sciences at Sussex. Since 2003, I have been collaborating with other Poplog users on an unfunded basis towards the development of OpenPoplog, a freeware release of the Poplog system to support on-going community-based use and development.

COGENT
2003-2006
Controlled Generation of Text
Funding: EPSRC project GR/S24480/01
Budget: £208,434
Timescale: August 2003, for thirty six months
Brighton role: Joint participant (with Sussex)
My role: Principal investigator (with K. van Deemter, D. Weir and J. Carroll)
Further information: http://www.brighton.ac.uk/nltg/projects/cogent

Natural Language Generation (NLG) technology has reached a level of maturity where applied systems exist in a range of specialised real-world domains (such as weather bulletins, software documentation, health and legal advice and stock market movements). However, developing such systems currently involves hand-crafting and special-purpose tuning by NLG experts which is non-portable, non-scaleable, time-consuming and expensive. Wider deployment of language generation requires more generally applicable and reusable NLG components based on wide-coverage grammars, but at present, effective techniques for such wide-coverage generation are not well understood. This three year project is investigating systematically the characteristics of wide-coverage generation and developing reflective techniques for controlling it effectively. As well as furthering our understanding of wide-coverage generation, the project will deliver a substantial and novel resource to support future research in this area, and practical implementations of wide-coverage controllable generators.

EUROMAP
2001-2003
EUROMAP/HOPE 2001
Funding: EU IST initiative, project IST-2000-28132
Budget: 112,379 EURO
Timescale: October 2001, for seventeen months
Brighton role: UK National Focal Point
My role: Site coordinator
Further information: http://www.itri.bton.ac.uk/euromap

EUROMAP Language Technologies was a European Commission supported initiative dedicated to promoting greater awareness and faster take-up of Human Language Technologies (HLT) within Europe. HOPE 2001 was a second phase of EU funding to support the addition of more countries into the EUROMAP consortium. ITRI was the UK partner funded in this second phase. Our role was to establish a national presence (primarily via a website) for EUROMAP, a central reference point for HLT providers, users and funders to keep up to date and exchange news.

My role in the project was overall management, strategic planning and final approval of published information etc.

At the end of the EUROMAP project, the UK-specific resources were used as the basis of the CLUK Industrial liaison website for the UK special-interest group for Computational Linguistics (CLUK).

MATS
2000-2002
Manual Tagging for SENSEVAL
Funding: EPSRC
Budget: £15,341
Timescale: August 2000, for twenty-four months
Brighton role: coordinating participant
My role: principal investigator (with Adam Kilgarriff)
Further information: http://www.itri.bton.ac.uk/events/senseval

This project provided lexicography support for the second SENSEVAL workshop - an international comparison and evaluation of word sense disambiguation systems which took place in Summer 2001. SENSEVAL-2 extended the first SENSEVAL to a wider range of languages, and to more rigorous task and evaluation data definition. My role in the project was largely management support: Adam Kilgarriff was the primary author of the proposal and de facto project manager.

WASPS
1999-2002
A semi-automatic lexicographer's workbench for writing word sense profiles.
Funding: EPSRC
Budget: £287,207
Timescale: March 1999, for three years
Brighton role: sole participant
My role: principal investigator (with Adam Kilgarriff)
Further information: http://www.itri.bton.ac.uk/projects/wasps

This project brought together recent developments in data-driven algorithms for word sense disambiguation with corpus-based tools developed to support lexicography to develop the WASPBENCH, an integrated environment for lexicography and word sense definition. The primary outputs of the WASPBENCH are both human-readable characterisations of the word senses and the data required for accurate word sense disambiguation. The ideas were tested through the production of lexical entries for a substantial sample of the English lexicon, and also the development of multilingual resources for use in machine translation.

This project was managed jointly by myself and Adam Kilgarriff - as well as overall management and supervision of the project, I had specific responsibility for the evaluation workpackage of the project.

GREG
1999-2001
A Georgian, English, Russian and German multilingual valency lexicon for natural language processing.
Funding: EU INCO-Georgia
Budget: 5,000 EURO
Timescale: March 1999, for two years
Brighton role: participant
My role: co-investigator (with Adam Kilgarriff)
Further information: http://www.itri.bton.ac.uk/projects/greg

This project developed a multilingual lexicon, suitable for use with Language Engineering applications. It contains syntactic and semantic valency specifications for 1000 Georgian verbs and their Russian, English and German counterparts, with semantic valency described with reference to thematic roles as introduced by Fillmore and Halliday, and syntactic valency in terms of subcategorisation frames.

Brighton's role in the project was primarily to provide advice and technology transfer in lexical representation. As part of this, I developed a substantial new multilingual lexicon representation framework for use in the project, written entirely in DATR, and designed to give a high degree of flexibility in representation, but also to be usable by relatively novice lexicon developers.

This project was a collaboration with the University of Stuttgart (coordinator), Tbilisi State University and the Georgian Academy of Sciences.

CLIME
1998-2001
Computerised legal information management and explanation.
Funding: EU ESPRIT initiative
Budget: 2,000,000 EURO, ITRI share 426,000 EURO
Timescale: January 1998, for three years
Brighton role: partner
My role: principal investigator for Brighton
Further information: http://www.bmtech.co.uk/clime/index.html

This project developed software to support access to legal and regulatory information, specifically in the area of maritime law. The principal deliverable of the project was a multilingual web-based application that advised on the applicability of maritime regulations to a particular ship scenario (for example, whether it is allowable to run the only fire pump from the ship's main engine, or whether a ship can carry oil in its ballast tanks).

ITRI's main role in the project was the development of the natural language interfaces: the query input interface using WYSIWYM technology, and the generation of natural language answers from the system's internal answer format (in English and French). My own role was to provide day-to-day management and research leadership of the local team, liaison and strategic development within the whole consortium, architectural design and implementation and application delivery. At the technical level, one particular achievement was the development of a tightly coupled Prolog/Java interface by linking the Java virtual machine into the (Poplog) Prolog system.

The consortium was led by British Maritime Technology Ltd., and involved the University of Brighton, the University of Amsterdam, Bureau Veritas (France) and TXT Ingegneria Informatica (Italy).

CONCEDE
1998-2000
Consortium for Central European Dictionary Encoding
Funding: EU INCO-COPERNICUS initiative, project 96/1142
Budget: 240,000 EURO, ITRI share 51,000 EURO
Timescale: January 1998, for thirty months
Brighton role: lead partner
My role: principal investigator for Brighton, and overall project manager
Further information: http://www.itri.bton.ac.uk/projects/concede

This project developed medium-sized (1000-5000 word) electronic dictionaries for six Central European languages (Bulgarian, Czech, Estonian, Hungarian, Romanian and Slovene), drawing on recently developed standards for dictionary encoding and a sister project which is currently developing parallel aligned corpora for the six languages.

ITRI coordinated the project and provided expertise on lexicography and dictionary encoding. My own role was overall project management, and participation in the development of an XML-based encoding scheme for the dictionaries.

The other project partners were Bulgarian Academy of Sciences, Sofia, Bulgaria; Charles University, Prague, Czech Republic; University of Tartu, Tartu, Estonia; Hungarian Academy of Science, Budapest, Hungary; Research Institute for Informatics, Bucharest, Romania; Josef Stefan Institute, Ljubljana, Slovenia.

Text Mining Demonstrator
1998-1999
Development of a text mining demonstrator.
Funding: BG Technology
Budget: £14,500
Timescale: August-November 1998, May-July 1999
Brighton role: participant
My role: co-investigator (with L. Cahill)

This project was carried out by the ITRI (myself and Lynne Cahill) and Integral Solutions Ltd on behalf of BG Technology. The aim was to produce a text mining demonstrator by taking two pieces of existing software (the POETIC/MUC information extraction system and ISL's Clementine data mining system) and bolting them together, making it possible to mine information in textual as well as symbolic/numeric databases. The initial application domain was newswire stories about power generation: the information extraction system returned key information about participants, location, type, size and cost of new power stations, allowing the data mining system to discover long term trends in the market directly from a newswire feed.

In a second phase in 1999, we successfully adapted the system to operate in a new related application domain, liquefied natural gas.

SENSEVAL
1998-1999
A Manually Sense-tagged Gold Standard Corpus.
Funding: EPSRC
Budget: £10,255
Timescale: May 1998, for nine months
Brighton role: sole participant
My role: principal investigator (with Adam Kilgarriff)
Further information: http://www.itri.bton.ac.uk/events/senseval

This project provided lexicography support for the SENSEVAL word sense disambiguation system evaluation. In SENSEVAL, participants from all over the world tested their word sense disambiguation systems against manually sense-tagged data in a MUC-style evaluation. In September 1998 the results were compared and discussed at the SENSEVAL workshop. The funding for this project paid for the preparation of training and test data for the task.

My role in the project was largely management support: Adam Kilgarriff was the primary author of the proposal and was the de facto project manager.

SEAL
1995-1998
Structural enhancement of automatically-acquired lexicons
Funding: EPSRC grant no. GR/K18931
Budget: £117,000
Timescale: March 1995, for three years
Brighton role: sole participant
My role: principal investigator
Further information: http://www.itri.bton.ac.uk/projects/seal

This project addressed the problem of lexical tuning: how to develop a customised lexicon for a given application domain by using existing resources. The project adopted a corpus-based approach, drawing significantly on the British National Corpus for its input data. The main achievements of the project were:

  • techniques for encoding and optimising complex syntactic information in the lexicon
  • a framework for assessing the similarity of different domain corpora and identifying distinctive words between them
  • a constructive view of word senses and their role in both lexicography and lexical tuning
  • a major review and evaluation of word sense disambiguation systems (which led to the SENSEVAL initiative described above)
  • a widely-used frequency analysis of the British National Corpus


Advanced Fellowship
1988-1994
SERC Advanced Fellowship.
Funding: SERC, grant no. B/ITF/187
Budget: £86,000
Timescale: October 1988, for 5 years (suspended, April 1989 - March 1990)
My role: personal fellowship
SPL
1990-1992
An Integrated Approach to Structure and Processing in Natural Languages.
Funding: SERC, grant no. GR/F01000
Budget: £46,000
Timescale: January 1990, for three years
My role: principal investigator

In October 1988, I was awarded an SERC Advanced Fellowship to study the relationship between language structure and processing, and its relevance to the design of linguistic formalisms. In 1990 I also obtained a personal SERC research grant to further support this work. This research was pursued on two fronts. On the one hand, I explored the theoretical and representational issues through the language DATR, whilst on the other, I looked at the application of these ideas to processing through the work on message understanding (POETIC, MUC5) and to a limited extent natural language generation (DRAFTER, GIST).

The results of this activity can be found in the development of the DATR compiler, and a wide range of DATR fragments exploring different representational ideas; in the grammar and lexicon of the POETIC system, and its adaptation in the MUC5 system; and in subsequent work in SEAL, on lexical access and multilingual representation.

MUC5
1993
Sussex University's participation in MUC5.
Funding: US Government (ARPA), Sussex University, RACAL Research Ltd and Integral Solutions Ltd
Budget: £31,000
Timescale: March 1993, for six months
Sussex role: sole participant
My role: co-investigator (with R. Gaizauskas)

Robert Gaizauskas and I negotiated funding from US and UK sources for participation in the 5th Message Understanding Conference (MUC5), as the first European team to take part in the MUC series. This grant funded ten person-months of effort to attempt to port the POETIC natural language understanding component to a completely different topic domain (commercial joint ventures) and evaluate the resulting system against other state of the art message understanding systems. Based on this evaluation our system was placed in the third statistically-significant rank - only three systems (out of an international field of thirteen) were ranked higher. For further details see "MUCing about in Pop: Message Understanding in Five Languages" (Evans, Gaizauskas and Cahill 1993) or "Sussex University: Description of the Sussex System used for MUC5" (Gaizauskas, Cahill and Evans 1994).

My roles in this project were liaison with the funding consortium, development of the phrasal lexicon analysis module and general technical support for the other team members (R. Gaizauskas and L.J. Cahill).

POETIC
1990-1993
Portable Extendable Traffic Information Collator.
Funding: SERC, grant no. GR/F39195, part of the IEATP/IKBS programme, project IED4/1/1834
Budget: £278,000
Timescale: 1990, for forty two months
Sussex role: partner
My role: project manager for Sussex
Further information: http://www.cogs.susx.ac.uk/lab/nlp/poetic/poetic.html
TIC
1989-1990
The Traffic Information Collator.
Funding: SERC, part of the Mobile Information Systems Consortium (MISC) Alvey demonstrator project.
Timescale: 1986, for forty two months
Sussex role: partner
My role: project manager for Sussex for final year

The overall aim of the "Traffic Information Collator" (TIC) project was to develop a system which analysed natural language text as found in police command and control logging systems, picked out messages about traffic incidents and automatically produced appropriately targeted traffic bulletins for other motorists. From April 1989, I worked on this project for a year, following the departure from Sussex of the principal investigator and senior research fellow on the project. I took over the management responsibility and supervision of the second research fellow (A.F. Hartley).

POETIC was a follow-on project, which aimed to take the basic prototype developed under TIC and generalise it to be portable to a wide range of police sublanguages, traffic management policies and geographical areas. The POETIC consortium was led by RACAL Research Ltd, and included the University of Sussex, the Automobile Association and National Transcommunications Ltd.

As I was not myself funded by the POETIC project, I took the role of project manager, supervising the two research staff on the project (L.J. Cahill and R. Gaizauskas) and liaising with the industrial partners. As well as providing general guidance for the research as a whole, I made specific contributions in the redevelopment of the lexicon, the overall architecture of the system, and the user interface. See "POETIC: A System for Gathering and Disseminating Traffic Information" (Evans et al, 1996) for further details.

As participant or collaborator

WYSIWYM
1997-present
WYSIWYM: knowledge editing with natural language feedback.
My role: Co-developer (with Richard Power and Donia Scott)
Further information: http://www.itri.bton.ac.uk/projects/WYSIWYM/wysiwym.html

WYSIWYM ('What you see is what you meant') is a technique for using natural language generation technology to support complex data-entry tasks, such as the development of a knowledge base or a complex formal query (such as SQL, or the legal query representations used in the CLIME system, described above). The core idea of WYSIWYM was introduced by Richard Power, and subsequently developed by Power, Donia Scott and myself in the context of various projects and other initiatives. The primary applications to date have been authoring of multilingual 'Patient Information Leaflets', explorations into stylistic variation, the CLIME legal enquiry system, and management of medical records.

As well as contributing to the general development of this work, I have specific involvement in the development of a WYSIWYM library in Java, suitable for deployment in other applications, and management and negotiation of licensing of the software.

HALO/DarkMatter
2004-2005
Project HALO - DarkMatter consortium
Funding: Vulcan Inc.
Timescale: October 2004, for two years
Principal investigator: Richard Power
My role: Technical lead on WYSIWYM interface library (until project relocated to the Open University in April 2005)
Further information: http://www.projecthalo.com

Project Halo is an effort by Vulcan Inc. towards the development of a “Digital Aristotle” — a staged, long-term research and development initiative that aims to develop an application capable of answering novel questions and solving advanced problems in a broad range of scientific disciplines. The Digital Aristotle is being developed with a focus on two primary functions: as a tutor capable of instructing and assessing students in the sciences, and as a research assistant with broad, interdisciplinary skills to help scientists in their work.

DarkMatter is one of two competing Halo subprojects, led by Ontoprise GmbH, aiming to deliver phase II of the Halo project, developing technology that will allow domain experts to formulate knowledge with decreasing dependence on knowledge engineers, and untrained users to pose questions and problems to the knowledge systems. Our role in the project was to investigate the deployment of WYSIWYM technology for knowledge creation and user query. My own specific role was to manage the development and application of the Java WYSIWYM library for use in DarkMatter, and to negotiate licensing arrangements between Brighton and the other partners.

Semantic Mining
2004-2005
Semantic interoperability and data mining in Biomedicine
Funding: EU
Timescale: January 2004, for three years
Principal investigator: Donia Scott
My role: Site leader for workpackage on multilingual medical lexicons (until project relocated to the Open University in April 2005)
Further information: http://www.itri.bton.ac.uk/projects/semanticmining

Semantic Interoperability and Data Mining in Biomedicine is a Network of Excellence (NoE) funded by the European Commission under Framework 6. The general objective of the network is to bridge gaps in European research infrastructure and to facilitate cross-fertilisation between scientific disciplines such as computer science, system engineering and medical/clinical research. The long-term goal of the network is the development of generic methods and tools supporting critical tasks in medical and biomedical informatics, such as data-mining, knowledge discovery, knowledge representation, abstraction and indexing of information, semantic-based information retrieval in a complex and high-dimensional information space, and knowledge based adaptive systems for provision of decision support for dissemination of evidence based medicine.

M3
2000-2003
Methods and Models of Morphology Seminar.
Funding: ESRC
Brighton role: partner
My role: local coordinator
CID
1998-2000
Challenges for Inflectional Description Seminar.
Funding: ESRC
Brighton role: partner
My role: local coordinator
FRiM
1996-1998
Frontiers of Research in Morphology Seminar.
Funding: ESRC
Brighton role: partner
My role: local coordinator

These three ESRC grants funded successive two-year series of quarterly one-day seminars on aspects of morphological description. The grants were jointly held by Universities of Surrey, Sussex, Brighton, Essex, Cambridge and SOAS, and funded travel and subsistence for participating groups and invited international speakers to discuss morphological theory in general, and its application to particular languages, notably endangered languages, such as indigenous languages of Australia, Polynesia, North America and Africa.

PILLS
2001
Pharmaceutical Instructions Language Localisation System.
Funding: EU
Timescale: January 2001, for one year
Principal Investigators: Donia Scott & Richard Power
My role: research associate
Further information: http://www.itri.bton.ac.uk/projects/pills

This project developed a prototype demonstrator of a multilingual authoring tool for pharmaceutical information in a range of forms, suitable for patients, nurses and doctors. I played a primarily technical role in the project, adapting the web-delivered version of WYSIWYM that was developed in the CLIME project for use as the demonstrator interface.

RAGS
1998-2001
RAGS: Reference Architecture for Generation Systems.
Funding: EPSRC
Timescale: April 1998, for three years
Principal Investigators: Donia Scott & Chris Mellish
My role: co-investigator
Further information: http://www.itri.bton.ac.uk/projects/rags

The aim of this project was to develop a standard 'reference architecture' for natural language generation systems. The lack of a standard view of the generation process as a whole is a significant barrier to wider exploitation of the technology. This project aimed to develop such a view, building on the emerging consensus evident in practical generation systems. As well as defining the reference architecture, the project aimed to deliver resources (sample knowledge sources and test data) to support the development of applications based on the architecture.

I was a co-author of the proposal (but at the time ineligible to be an official investigator), and was fully involved in the management and execution of this project. Particular areas I have been involved with include the formalisation of RAGS datatypes, the development of the RAGS data model, and the development of the OASYS library, which provides support for modular cooperatively-multitasking event-driven NLG applications in Prolog.

DRAFTER
1993-1997
DRAFTER: Drafting Assistant for Technical Writers.
Funding: EPSRC
Timescale: November 1993, for thirty nine months
Principal Investigator: Donia Scott
My role: research associate
Further information: http://www.itri.bton.ac.uk/projects/drafter
GIST
1993-1996
Generating instructional texts.
Funding: EU LRE initiative
Timescale: December 1993, for thirty months
Principal Investigator: Donia Scott
My role: research associate
Further information: http://ecate.itc.it:1025/projects/gist.html

These two research projects were both concerned with the multilingual generation of instructional texts in different application areas. DRAFTER was concerned with the generation of software manuals in English and French, while in GIST we were looking at the production of administrative forms (such as pension application forms) in English, German and Italian. For both projects, the basic approach was to replace conventional authoring followed by translation (manual or automatic), by a process of symbolic authoring - representing the content of the document symbolically, in a form from which texts in several languages can be generated automatically in parallel.

My roles in these projects were in project management (including representing the GIST consortium at EC project meetings in Luxembourg), as well as technical involvement in the overall architecture design, and associated work on multilingual lexical representation.

Poplog
1987-1988
Poplog development work.
Timescale: 1987, for one year
Principal Investigator: A. Sloman
My role: research fellow
Further information: http://www.poplog.org

Throughout my time at Sussex, I maintained a keen interest in the Poplog programming system, an advanced multi-language environment developed within Cognitive and Computing Sciences at Sussex. From October 1987 to September 1988, I was employed as a full-time research fellow as part of the Poplog development team. My most significant contributions were the design and implementation of the interface to X Windows which is now a central component of the system, the addition of 'destroy actions' to the garbage collector and the development of an interface to UNIX signal handling.

NLGP
1984-1987
Natural Language Generation from Plans.
Funding: SERC
Timescale: 1984, for three years
Principal investigator: Chris Mellish
My role: research fellow

In this project we developed a generation system which took a plan representation for the performance of some task and used the plan structure to guide the generation of a multi-paragraph description of how the task can be achieved. My primary area of responsibility in the project was the transformation of the plan structure into discourse structure using 'algebraic' techniques. See "Natural Language Generation from Plans" (Mellish and Evans 1989) for further details.

ProGram
1982-1984
Computer Realisation of a Grammar.
Funding: ESRC
Principal investigator: Gerald Gazdar
My role: part time research assistant

The aim of this project was to develop some kind of computer representation of a moderately large grammar represented in the (then) quite novel "Generalised Phrase Structure Grammar" formalism. As a part time programmer on this project while studying for my D.Phil, I developed a very detailed understanding of GPSG at that time, which subsequently persuaded me to change my thesis topic (see "Education").

The project resulted in the ProGram system, an early instance of a "Grammar Development System" and probably the first full, faithful implementation of the GPSG formalism. I was responsible for much of the specification and design, and all the implementation of the system (in Prolog). For further details see "ProGram - a development tool for GPSG grammars" (Evans 1985).


16 January 2006