SIGMORPHON 2008 ONLINE | ACL 2008 ONLINE | ACL ONLINE


Tenth Meeting of ACL Special Interest Group on Computational Morphology and Phonology

WORKSHOP PROGRAM

Thursday, June 19, 2008

8:50–9:00 Opening remarks
9:00–10:00 Invited talk: Phonological Models in Automatic Speech Recognition
Karen Livescu
10:00–10:30 Bayesian Learning over Conflicting Data: Predictions for Language Change
Rebecca Morley
10:30–11:00 Break
11:00–11:30 A Bayesian Model of Natural Language Phonology: Generating Alternations from Underlying Forms
David Ellis
11:30–12:00 Unsupervised Word Segmentation for Sesotho Using Adaptor Grammars
Mark Johnson
12:00–14:00 Lunch
14:00–15:00 Invited talk: Counting Rankings
Jason Riggle
15:00–15:30 Three Correlates of the Typological Frequency of Quantity-Insensitive Stress Systems
Max Bane and Jason Riggle
15:30–16:00 Break
16:00–16:30 Phonotactic Probability and the Maori Passive: A Computational Approach
‘Ōiwi Parker Jones
16:30–17:00 Evaluating an Agglutinative Segmentation Model for ParaMor
Christian Monson, Alon Lavie, Jaime Carbonell and Lori Levin
17:00–17:30 General discussion

 



Abstracts



Invited talk: Phonological Models in Automatic Speech Recognition

Karen Livescu




Bayesian Learning over Conflicting Data: Predictions for Language Change

Rebecca Morley

This paper is an analysis of the claim that a universal ban on certain ('anti-markedness') grammars is necessary in order to explain their non-occurrence in the languages of the world. To assess the validity of this hypothesis, I examine the implications of one sound change (a > schwa) for learning in a specific phonological domain (stress assignment), making explicit assumptions about the type of data that results and the learning function that computes over that data. The preliminary conclusion is that restrictions on possible end-point languages are unneeded, and that the most likely outcome of change is a lexicon that is inconsistent with respect to a single generating rule.



A Bayesian Model of Natural Language Phonology: Generating Alternations from Underlying Forms

David Ellis

This paper presents a stochastic approach to learning phonology. The model captures 7–15 percent more phonologically plausible underlying forms than a simple majority solution because it prefers "pure" alternations. It could be useful in cases where an approximate solution is needed, or as a seed for more complex models. A similar process could be involved in some stages of child language acquisition, in particular the early learning of phonotactics.



Unsupervised Word Segmentation for Sesotho Using Adaptor Grammars

Mark Johnson

This paper describes a variety of non-parametric Bayesian models of word segmentation based on Adaptor Grammars that try to model different aspects of the input and incorporate different kinds of prior knowledge, and applies them to the Bantu language Sesotho. While we find overall word segmentation accuracies lower than these models achieve on English, we also find some interesting differences in which factors contribute to better word segmentation. Specifically, we find little improvement in word segmentation accuracy when we model contextual dependencies, while modeling morphological structure does improve segmentation accuracy.



Invited talk: Counting Rankings

Jason Riggle




Three Correlates of the Typological Frequency of Quantity-Insensitive Stress Systems

Max Bane and Jason Riggle

We examine the typology of quantity-insensitive (QI) stress systems and ask to what extent an existing optimality theoretic model of QI stress can predict the observed typological frequencies of stress patterns. We find three significant correlates of pattern attestation and frequency: the trigram entropy of a pattern, the degree to which it is "confusable" with other patterns predicted by the model, and the number of constraint rankings that specify the pattern.



Phonotactic Probability and the Maori Passive: A Computational Approach

‘Ōiwi Parker Jones

Two analyses of Maori passives and gerunds have been debated in the literature. Both assume that the thematic consonants in these forms are unpredictable. This paper reports on three computational experiments designed to test whether this assumption is sound. The results suggest that thematic consonants are predictable from the phonotactic probabilities of their active counterparts. This study has potential implications for allomorphy in other Polynesian languages. It also exemplifies the benefits of using computational methods in linguistic analyses.



Evaluating an Agglutinative Segmentation Model for ParaMor

Christian Monson, Alon Lavie, Jaime Carbonell and Lori Levin

This paper describes and evaluates a modification to the segmentation model used in the unsupervised morphology induction system ParaMor. Our improved segmentation model permits multiple morpheme boundaries in a single word. To prepare ParaMor to apply the new agglutinative segmentation model effectively, we use two heuristics that improve ParaMor's precision. These precision-enhancing heuristics are adaptations of those used in other unsupervised morphology induction systems, including work by Hafer and Weiss (1974) and Goldsmith (2006). By reformulating the segmentation model used in ParaMor, we significantly improve ParaMor's performance in all language tracks, in both the linguistic evaluation and the task-based information retrieval (IR) evaluation of the peer-operated competition Morpho Challenge 2007. ParaMor's improved morpheme recall in the linguistic evaluations of German, Finnish, and Turkish is higher than that of any system that competed in the Challenge. In the three languages of the IR evaluation, our enhanced ParaMor significantly outperforms a morphologically naive baseline in average precision over newswire queries, scoring just behind the leading system from Morpho Challenge 2007 in English and ahead of the first-place system in German.


