ABSTRACT
If we are ever to have faithful machine translation systems and articulate natural language generation systems, we will need a sophisticated and principled model of lexical choice. One problem is that, for a single word in one language, another language will often provide many near-synonyms that differ only in fine-grained nuances of denotation, style, expressed attitude, and usage. In this talk I will discuss a clustered model of lexical knowledge that explicitly represents the differences between near-synonyms. The near-synonyms in a cluster share the same core denotation, represented by means of an ontology of language-neutral concepts, and are differentiated subconceptually in terms of peripheral concepts (also defined in the ontology), stylistic features, and expressed attitude. I will also discuss a lexical choice process that uses the model. During lexical choice, the core denotation is used as a necessary applicability condition, activating all of the near-synonyms in the cluster. The most appropriate near-synonym, however, is the one whose peripheral concepts and stylistic features most closely match a set of preferences for expressing certain concepts in certain manners or for using certain styles, and that obeys constraints imposed by other aspects of sentence planning.
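
The two-stage choice process described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the model's actual representation: the class names (NearSynonym, Cluster), the function choose, the concept name "GenericError", the invented feature values attached to each word, and the simple overlap count used as a matching score. Expressed attitude is included in the score here as a further simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NearSynonym:
    """Hypothetical record for one near-synonym in a cluster."""
    word: str
    peripheral: frozenset = frozenset()  # peripheral concepts the word can convey
    style: frozenset = frozenset()       # stylistic features, e.g. "formal"
    attitude: str = "neutral"            # expressed attitude


@dataclass
class Cluster:
    """Hypothetical cluster: one core denotation shared by all members."""
    core: str      # name of a language-neutral concept in the ontology
    members: list  # list of NearSynonym


def choose(clusters, concept, preferences, allowed=lambda m: True):
    """Return the best-matching near-synonym for `concept`, or None."""
    # Stage 1: the core denotation is a necessary applicability condition;
    # matching it activates every near-synonym in the cluster.
    candidates = [m for c in clusters if c.core == concept for m in c.members]

    # Constraints from the rest of sentence planning are modelled here
    # as a single predicate (an assumption of this sketch).
    candidates = [m for m in candidates if allowed(m)]
    if not candidates:
        return None

    # Stage 2: rank by how many preferences each candidate satisfies.
    def score(m):
        return (
            len(m.peripheral & preferences.get("peripheral", frozenset()))
            + len(m.style & preferences.get("style", frozenset()))
            + (1 if m.attitude == preferences.get("attitude") else 0)
        )

    return max(candidates, key=score)


# Invented feature values, for illustration only.
error_cluster = Cluster(
    core="GenericError",
    members=[
        NearSynonym("error", style=frozenset({"formal"})),
        NearSynonym("mistake"),
        NearSynonym("blunder", peripheral=frozenset({"stupidity"}),
                    attitude="pejorative"),
        NearSynonym("slip", peripheral=frozenset({"minor"}),
                    style=frozenset({"informal"})),
    ],
)

prefs = {"peripheral": frozenset({"stupidity"}), "attitude": "pejorative"}
best = choose([error_cluster], "GenericError", prefs)
print(best.word)  # prints "blunder"
```

With these invented values, "blunder" satisfies two preferences and every other member satisfies none. A graded similarity measure over structured peripheral concepts would replace the overlap count in a fuller treatment.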