Metaphonemes
Carole Tiberius and Lynne Cahill
This page gives a short introduction to the concept of metaphonemes,
what they are, what you can do with them, and how you find them.
What are metaphonemes?
Metaphonemes are phoneme correspondences that hold across languages. If
we know, for example, that words realised with /{/ in English are usually
realised with /A/ in Dutch and /a/ in German (as in cat: /k{t/ versus
/kAt/ versus /kats@/), we might be able to generalise over these three
language-specific phonemes and introduce a metaphoneme, e.g. |{Aa|, which
captures this generalisation.
Dutch: /kAt/
English: /k{t/
German: /kats@/
(Image from the Clip Art Collection for FL Instruction.)
Depending on the language, this
metaphoneme will then be realised as an /{/ in English, an /A/ in Dutch,
and an /a/ in German.
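The realisation step can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the DATR encoding used in the lexicon: the names METAPHONEMES, realise and CAT are hypothetical, and the shared entry for "cat" is reduced to a single syllable.

```python
# Hypothetical table: each metaphoneme maps a language to its phoneme (SAMPA).
METAPHONEMES = {
    "|{Aa|": {"English": "{", "Dutch": "A", "German": "a"},
}


def realise(segments, language):
    """Replace every metaphoneme in a segment list by its phoneme in `language`.

    Segments that are not metaphonemes are copied unchanged.
    """
    return "".join(
        METAPHONEMES[segment][language] if segment in METAPHONEMES else segment
        for segment in segments
    )


# Assumed language-neutral syllable for "cat": onset k, metaphoneme peak, coda t.
CAT = ["k", "|{Aa|", "t"]

for language in ("English", "Dutch", "German"):
    print(language, "/" + realise(CAT, language) + "/")
```

For German the sketch yields only the shared syllable /kat/; the full form /kats@/ needs additional language-specific material that is not modelled here.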
What can you do with metaphonemes?
Our aim is to incorporate these metaphonemes in a multilingual
inheritance-based lexicon that allows information to be shared across
languages at all levels of linguistic description, as in PolyLex
(Cahill and Gazdar 1999). The basic idea behind the PolyLex architecture
is that abstraction across languages can be represented within a standard
default inheritance hierarchy in virtually the same way as abstraction
within a language. Thus, information which is common to several languages
will be stated at higher points in the hierarchy than information which
is language-specific.
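The idea of default inheritance across languages can be illustrated with a small Python sketch. It is not the PolyLex implementation: the Node class, the node names and the attributes onset, peak and coda are hypothetical, chosen only to show shared information sitting above language-specific information.

```python
class Node:
    """A node in a default inheritance hierarchy (illustrative only)."""

    def __init__(self, name, parent=None, **facts):
        self.name = name
        self.parent = parent
        self.facts = facts

    def lookup(self, attribute):
        """Return the most specific value, falling back to ancestors by default."""
        node = self
        while node is not None:
            if attribute in node.facts:
                return node.facts[attribute]
            node = node.parent
        raise KeyError(attribute)


# Shared node: information common to the languages is stated once, high up.
cat = Node("Cat", onset="k", coda="t")

# Language-specific nodes state only what differs (SAMPA symbols).
english_cat = Node("English_Cat", parent=cat, peak="{")
dutch_cat = Node("Dutch_Cat", parent=cat, peak="A")

print(english_cat.lookup("onset"), english_cat.lookup("peak"))
print(dutch_cat.lookup("onset"), dutch_cat.lookup("peak"))
```

Both language nodes inherit the onset and coda from the shared node and supply only their own vowel, which is the kind of difference a metaphoneme is meant to capture.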
The PolyLex project defined a trilingual hierarchical lexicon for Dutch,
English, and German that captures the morphological, phonological and
morphophonological similarities shared by these three languages.
Here, we take the PolyLex framework as our basis.
We focus on the phonological similarities between related languages and
we extend the PolyLex approach by capturing cross-linguistic
phoneme correspondences, such as the /{/ - /A/ - /a/ correspondence mentioned
above.
Our assumption is that introducing metaphonemes
increases the number of phonological generalisations that can be captured
between languages at the syllable level in a multilingual inheritance
lexicon.
How do you find metaphonemes?
The vowel phoneme inventories of the Germanic languages were taken as a
testbed. We compiled a list of 800 (mono- and disyllabic) Germanic cognates
(see "Resources" below),
looked up their transcriptions, and then
mapped words containing a particular vowel in one language onto its cognates
in the other languages, to see how a particular vowel was realised in the
other languages. This process was repeated for all vowels, for all languages
of the test set. In a first pilot study, we defined vowel phoneme
correspondences for three West Germanic languages - Dutch, English, and German.
The phonemic transcriptions were taken from the CELEX database (Baayen et al.
1995).
The resulting tables can be found here.
In a second pilot study, we extended our test set with a more distantly
related language: Danish, a North Germanic language.
The transcriptions for Danish were taken from Hansen (1990).
The resulting tables can be found here.
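The mapping procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the two cognate sets are hand-written SAMPA transcriptions (not extracted from CELEX or from the cognate database), the vowel set covers only the symbols in this toy data, and only the first vowel of each word is compared. The names VOWELS, COGNATES, first_vowel and correspondences are hypothetical.

```python
from collections import Counter

# Only the SAMPA vowel symbols that occur in the toy data below.
VOWELS = set("{Aa@")

# Hand-written illustrative cognate sets ("cat" and "man").
COGNATES = [
    {"English": "k{t", "Dutch": "kAt", "German": "kats@"},
    {"English": "m{n", "Dutch": "mAn", "German": "man"},
]


def first_vowel(transcription):
    """Return the first vowel symbol in a transcription, or None if there is none."""
    for symbol in transcription:
        if symbol in VOWELS:
            return symbol
    return None


def correspondences(cognates, source, target):
    """Count how the first vowel of each `source` word is realised in `target`."""
    table = Counter()
    for entry in cognates:
        pair = (first_vowel(entry[source]), first_vowel(entry[target]))
        table[pair] += 1
    return table


for target in ("Dutch", "German"):
    print("English ->", target, dict(correspondences(COGNATES, "English", target)))
```

Repeating the count for every language pair gives one correspondence table per pair; a vowel correspondence that recurs across many cognates is a candidate metaphoneme.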
The next steps
We have received ESRC funding to complete this work with an account of the
consonants in English, Dutch and German. This project (ESRC grant
no. R000223681) employed Lynne Cahill for six months (August 2001 -
February 2002) to undertake the further analyses.
The correspondence tables for the consonants of English, Dutch and German can
now be found here.
Resources
Papers
- Tiberius, C. & Cahill, L.J. (2000). Incorporating Metaphonemes
in a Multilingual Lexicon. Proceedings of the 18th International Conference
on Computational Linguistics (COLING 2000), Saarbruecken, Germany,
pp. 1126-1130. Available here (postscript).
- Tiberius, C. & Cahill, L.J. (2000). A MetaPhoneme Inventory.
Proceedings of the 10th CLIN meeting, Utrecht, The Netherlands,
pp. 193-200. Available here (gzipped postscript).
Datasets
- Cognate database: The current version of the
database, including English, Dutch and German orthographic and phonological
forms. Danish forms are also included for as many cognates as possible.
A version with the Germanic roots will be available shortly.
- Multilingual lexicon with metaphonemes: The lexicon
of nouns from PolyLex, in DATR,
including English, German and Dutch noun morphology,
phonology and morphophonology, adapted to include metaphoneme definitions
where either a full metaphoneme definition or a partial metaphoneme
definition applies. Here is a downloadable
version (gzipped tar file).
The transcriptions used on this page are based on the SAMPA
phonetic alphabet.
References
Baayen, H., R. Piepenbrock and H. van Rijn. 1995. The CELEX Lexical
Database. Release 2 (CD-ROM), Linguistic Data Consortium, University
of Pennsylvania, Philadelphia, PA.
Cahill, L. and G. Gazdar. 1999. "The POLYLEX architecture: multilingual
lexicons for related languages". In Traitement Automatique des Langues
40:1, pp. 7-25.
Hansen, P.M. 1990. Udtaleordbog. Gyldendal, Copenhagen.
This page was created by Carole Tiberius. It is maintained by Lynne
Cahill and was last modified in September 2001.