Computational methods for generating language are lagging
behind computational methods for analysing language in several ways,
most obviously in that they have barely been used commercially. The
main reasons for this are that systems for generating language take
inordinate amounts of time to build, cannot be reused once built,
and tend to lack variation in the language they produce, which is
easily perceived as a lack of quality. The current situation in
language generation research is reminiscent of language analysis
research in the late 1980s, when symbolic and statistical methods
briefly formed entirely separate research paradigms. Language
analysis soon moved towards a paradigm merger, realising that
symbolic methods lacked the efficiency and robustness that
probabilistic methods could provide, and that probabilistic methods
would in turn benefit from the accuracy and subtlety of symbolic
methods. A similar
development is currently underway in the field of machine
translation, where, after several years in which purely statistical
methods dominated the field, researchers are now beginning to
bring linguistic knowledge back in. The experience from these
research fields suggests that higher quality can be achieved when
the symbolic and statistical paradigms join forces. Recent research
shows that this is likely to be true for language generation too.
The aim of the Prodigy project is to develop, for the first
time, a comprehensive, linguistically informed, probabilistic
methodology for generating language that substantially reduces
development time and increases reusability and variation in language
generation systems, thereby enhancing their commercial viability.