Isabelle Dautriche
A few research projects:

Homophones: a challenge for meaning(s) acquisition?

with Emmanuel Chemla and Anne Christophe

Homophones present the learner with a unique word-learning situation. While most words conform to a one-to-one mapping between form and meaning, a homophone is a phonological form associated with several unrelated meanings. Mature language users use a diverse array of information sources (visual, linguistic…) to reach the correct interpretation of these words. However, children in the process of learning their language are faced with a different situation: they need to discover that a single word-form maps onto several distinct meanings.

In two sets of studies, we explore the situations that lead children and toddlers to postulate homophony for a word. First, with preschoolers learning novel words, we show that the distribution of the learning exemplars in conceptual space influences children’s inferences: observing exemplars clustered around two distant positions in conceptual space (e.g., 2 tigers and 2 beetles as opposed to 4 random animals) made children more likely to infer that the exemplars were sampled from two independent categories rather than from a single superordinate category (i.e., animal). Second, we show that 20-month-olds are willing to learn a second meaning for a word they know, provided that the two homophones are sufficiently distant syntactically (e.g., ‘an eat’ is a good name for a novel animal) or semantically (e.g., ‘a sweater’ for a novel animal), but not when they are close (e.g., ‘a cat’ for a novel animal). Taken together, our results show that children recruit multiple sources of information to infer whether or not a given word-form is likely to instantiate a novel meaning.

Quantitative analysis of the lexicon

with Kyle Mahowald, Edward Gibson and Steven Piantadosi

Why do languages have the words that they do instead of some other set of words? Recent evidence suggests that cognitive pressures associated with communicative efficiency and learnability affect the organization of the lexicon. On the one hand, consistent with noisy channel models of language communication, the phonological distance between word forms should be maximized to avoid perceptual confusability (a pressure for sparsity). That is, a language should avoid having many confusable words like ‘feb’ and ‘fep’. On the other hand, a lexicon with high phonetic regularity would be simpler to learn, remember and produce (a pressure for clumpiness). Here, we investigate whether the similarity of words in the lexicon can be explained by one or both of these pressures.

We showed that (1) the most frequent forms in a language tend to be more phonotactically well-formed and have more phonetic neighbors than less frequent forms; (2) natural lexicons have more similar words than would be expected by chance, over and above the constraints imposed by phonotactics; and (3) more semantically similar word pairs are also more phonetically similar. We argue that, taken together, these results provide strong evidence that there is a functional advantage for lexicons to have words that are tightly clustered together in phonological space, and that this advantage is even greater when words are semantically similar.
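To make the notion of "phonetic neighbors" concrete, here is a minimal sketch in Python. It assumes that two forms are neighbors when they differ by exactly one insertion, deletion or substitution (edit distance 1), and it treats each character of a string as one phoneme. The toy lexicon and the function names are illustrative assumptions, not the materials or code used in the project.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def neighbor_counts(lexicon):
    """Count, for each form, how many other forms are at edit distance 1."""
    counts = {form: 0 for form in lexicon}
    for w1, w2 in combinations(lexicon, 2):
        if edit_distance(w1, w2) == 1:
            counts[w1] += 1
            counts[w2] += 1
    return counts


# Hypothetical toy lexicon; each letter stands in for one phoneme.
toy_lexicon = ["cat", "bat", "cap", "cut", "dog", "at"]
counts = neighbor_counts(toy_lexicon)
for form in toy_lexicon:
    print(form, counts[form])
```

In a "clumpy" lexicon many forms would have high neighbor counts, whereas a "sparse" lexicon would keep these counts low. Comparing the counts of a real lexicon with those of a suitable chance baseline is one way such a question could be approached.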

Using contextual information to constrain the learning problem

with Emmanuel Chemla, Gabriel Synnaeve, Benjamin Boerschinger, Mark Johnson and Emmanuel Dupoux

Word learners use many sources of information to segment speech into words (e.g., statistical regularities; prosody; phonotactics) and to map words onto meanings (syntactic context; co-occurrences; social information). Here I explored one potential source of information that has been underexplored in these two domains: the broader context of the learning situation. This idea rests on the observation that in the kitchen, one is more likely to speak about food than in the bathroom, and conversely for bath items. It follows that the extra-linguistic context, which is naturally available to young children, helps learners constrain both the segmentation problem and the mapping problem.

To evaluate its benefit for the segmentation problem, we built a computational model with extra-linguistic context and found that it outperformed a model without such contexts on a word segmentation task (Synnaeve et al., 2014, COLING). This suggests that using extra-linguistic context can boost the probability of specific vocabularies and constrain the most plausible segmentation of an utterance.
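The intuition that context shifts the most plausible segmentation can be illustrated with a small Python sketch. This is not the model from the cited paper: it is a simple unigram segmenter with a fixed lexicon, where the word probabilities for each context are hand-set, hypothetical values chosen only for illustration.

```python
import math

# Hypothetical, hand-set word probabilities for two contexts.
CONTEXT_LEXICONS = {
    "kitchen": {"the": 0.2, "icecream": 0.5, "ice": 0.2, "cream": 0.1},
    "bathroom": {"the": 0.2, "icecream": 0.01, "ice": 0.29, "cream": 0.5},
}


def segment(utterance, word_probs):
    """Return the most probable split of `utterance` into known words.

    Uses dynamic programming over unigram log-probabilities.
    Returns None if no split into known words exists.
    """
    n = len(utterance)
    # best[i] = (best log-probability of utterance[:i], start of last word)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(end):
            prob = word_probs.get(utterance[start:end])
            if prob is None or best[start][0] == -math.inf:
                continue
            score = best[start][0] + math.log(prob)
            if score > best[end][0]:
                best[end] = (score, start)
    if best[n][0] == -math.inf:
        return None
    # Follow the back-pointers to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(utterance[start:end])
        end = start
    return words[::-1]


for context, lexicon in CONTEXT_LEXICONS.items():
    print(context, segment("theicecream", lexicon))
```

With these toy values, the same unsegmented string is split as "the icecream" in the kitchen context and as "the ice cream" in the bathroom context, because each context boosts the probability of a different vocabulary.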

Children could use the extra-linguistic context not only to improve word segmentation but also to acquire word meanings. Our findings demonstrate that extra-linguistic contexts help learners retrieve or discard potential referents for a word, because such contexts can be memorized and associated with a to-be-learned word (Dautriche and Chemla, 2014, JEP: LMC).

Bootstrapping the syntactic bootstrapper

with Alex de Carvalho, Ariel Gutman, Benoit Crabbé and Anne Christophe

One of the complex challenges faced by language learners is the acquisition of the syntactic structures of their native language. A rich literature suggests that a surface analysis of the speech signal could allow children to learn some aspects of the structure of their language (Morgan & Demuth, 1995). Specifically, since the prosodic structure of an utterance tends to coincide with its syntactic structure, children could exploit prosodic boundaries to detect syntactic boundaries and start building the skeleton of a syntactic structure (Christophe et al., 2008).

Our modelling results show that a partial syntactic representation based on the speech signal (prosody + function words) can be learnt via minimally supervised clustering and would be useful for syntactic categorization.

Our experimental results show that both groups of children exploit the position of a word within the prosodic structure when computing its syntactic category. Even two-year-olds exploit phrasal prosody on-line to constrain their syntactic analysis. This ability to exploit phrasal prosody to compute syntactic structure may help children parse sentences containing unknown words, and facilitate the acquisition of word meanings.


Email: isabelle.dautriche [at] gmail

Laboratoire de Sciences Cognitives et Psycholinguistique [Babylab]

Ecole Normale Supérieure, Pavillon jardin

29 rue d'Ulm

75005 Paris, FRANCE