An Assessment of "Raw" Phonotactic Learning
Bruce Hayes
UCLA
To what extent can phonotactic constraints be learned without any guidance from UG? Suppose in particular that we take the features of the feature set, combine them freely (both simultaneously and sequentially) into constraints, and simply see which ones are unviolated in a large data set. In this approach, lack of exceptions would serve as a diagnostic for constraint viability.
My results so far indicate that this strategy fails. It predicts that certain rare words (outside the data set) should be ill-formed, when in fact they are acceptable. In other words, absence of counterexamples in a large data set does not reliably tell us that a constraint is valid for the whole language; it cannot distinguish the accidentally missing from the authentically missing.
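The procedure and its failure mode can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study itself: the five-segment feature chart, the toy corpus, the held-out word "ampa", and the function names are all hypothetical. Constraints are built by combining feature values simultaneously (bundles of one or two features) and sequentially (one or two bundles in a row), and a constraint counts as learned if no word in the corpus violates it.

```python
from itertools import combinations, product

# Hypothetical toy feature chart; one character per segment.
FEATURES = {
    "p": {"syl": "-", "voice": "-", "nasal": "-", "cont": "-"},
    "b": {"syl": "-", "voice": "+", "nasal": "-", "cont": "-"},
    "m": {"syl": "-", "voice": "+", "nasal": "+", "cont": "-"},
    "s": {"syl": "-", "voice": "-", "nasal": "-", "cont": "+"},
    "a": {"syl": "+", "voice": "+", "nasal": "-", "cont": "+"},
}

def bundles(max_size=2):
    """Simultaneous combinations: every bundle of up to max_size feature values."""
    names = sorted(next(iter(FEATURES.values())))
    out = []
    for k in range(1, max_size + 1):
        for feats in combinations(names, k):      # distinct features only
            for vals in product("+-", repeat=k):
                out.append(tuple(zip(feats, vals)))
    return out

def matches(segment, bundle):
    """True if the segment carries every feature value in the bundle."""
    return all(FEATURES[segment][f] == v for f, v in bundle)

def violates(word, constraint):
    """True if some substring of the word matches the banned bundle sequence."""
    n = len(constraint)
    return any(
        all(matches(word[i + j], constraint[j]) for j in range(n))
        for i in range(len(word) - n + 1)
    )

def learn(corpus, max_len=2):
    """Sequential combinations: keep every constraint the corpus never violates."""
    bs = bundles()
    candidates = [c for n in range(1, max_len + 1) for c in product(bs, repeat=n)]
    return [c for c in candidates if not any(violates(w, c) for w in corpus)]

# Invented toy data: a small corpus and one word that happens to be missing from it.
CORPUS = ["pa", "ba", "ma", "sa", "apa", "aba", "ama", "asa", "spa", "amba"]
HELD_OUT = "ampa"

learned = learn(CORPUS)
blockers = [c for c in learned if violates(HELD_OUT, c)]

print(len(learned), "constraints are unviolated in the corpus")
print(len(blockers), "of them wrongly rule out the held-out word", HELD_OUT)
```

In this toy setting the learner adopts, among others, the constraint *[+nasal][-voice], simply because no corpus word happens to contain a nasal followed by a voiceless segment. If "ampa" is stipulated to be an acceptable but rare word, the learned grammar wrongly excludes it, which is the accidental-gap problem described above in miniature.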
The upshot is to impugn "raw" inductive learning as an account of how children learn phonotactics. This leaves as alternatives: (a) UG-based approaches; (b) phonetically based approaches; (c) more thoughtful, better-designed inductive approaches. Some examples of the last of these (e.g. Ellison, Pierrehumbert et al.) will be briefly discussed.