Using Computational Models to Test Syntactic Learnability
We study the learnability of English filler–gap dependencies and the “island” constraints on them by assessing the generalizations made by autoregressive (incremental) language models that use deep learning to predict the next word given preceding context. Using factorial tests inspired by experimental psycholinguistics, we find that models acquire not only the basic contingency between fillers and gaps, but also the unboundedness and hierarchical constraints implicated in the dependency. We evaluate a model’s acquisition of island constraints by demonstrating that its expectation for a filler–gap contingency is attenuated within an island environment.
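As a concrete illustration of the factorial paradigm described above, the sketch below computes a wh-licensing interaction: a difference-in-differences of surprisals across the 2×2 crossing of filler presence and gap presence. It uses GPT-2 through the HuggingFace transformers library as a stand-in autoregressive model, and it sums surprisal over the whole sentence rather than measuring it at a critical region as in a proper factorial test; the model choice, the item sentences, and the helper name `total_surprisal` are illustrative assumptions, not the materials or models evaluated here.

```python
# Minimal sketch of the surprisal-based factorial test, assuming
# GPT-2 (via HuggingFace transformers) as a stand-in incremental LM.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def total_surprisal(sentence: str) -> float:
    """Total surprisal in bits: sum over tokens of -log2 p(w_i | w_<i)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Align each position's next-word distribution with the actual next token.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(ids.size(1) - 1), ids[0, 1:]]
    return -token_lp.sum().item() / math.log(2)

# 2x2 factorial items: wh-filler presence crossed with gap presence
# (hypothetical example sentences, not the study's stimuli).
items = {
    ("+filler", "+gap"): "I know what the lion devoured at sunrise.",
    ("+filler", "-gap"): "I know what the lion devoured a gazelle at sunrise.",
    ("-filler", "+gap"): "I know that the lion devoured at sunrise.",
    ("-filler", "-gap"): "I know that the lion devoured a gazelle at sunrise.",
}
S = {cond: total_surprisal(sent) for cond, sent in items.items()}

# Wh-licensing interaction: how much a filler reduces the surprisal
# penalty incurred by a gap. A positive value suggests the model has
# learned the filler-gap contingency.
interaction = (S[("-filler", "+gap")] - S[("+filler", "+gap")]) \
            - (S[("-filler", "-gap")] - S[("+filler", "-gap")])
print(f"wh-licensing interaction: {interaction:.2f} bits")
```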
The filler–gap dependency is special in that it can span a potentially unbounded number of nodes in a syntactic tree, yet it is subject to a subtle set of constraints known as island constraints (Ross, 1967). For example, in the grammatical sentence in (1-a), the dependency between the filler and the gap spans two sentential embeddings. However, a similar sentence, (1-b), is rendered ungrammatical when the gap site resides within a syntactic ‘island’, in this case a Complex Noun Phrase.
(1)
a. I know what the guide said his friend saw the lion devour __ last night.
b. *I know what the guide saw the lion that devoured __ last night.
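The island prediction can be probed with the same factorial logic: if a model has internalized the Complex NP constraint, the licensing interaction measured when the gap sits inside an island, as in (1-b), should be attenuated relative to a non-island baseline such as (1-a). The sketch below reuses the hypothetical `total_surprisal` helper from the previous snippet; the island items are illustrative variants of (1), not the study's stimuli.

```python
# Sketch of the island test, assuming total_surprisal from above.
# The interaction computed inside a Complex NP island should shrink
# toward zero if the model respects the island constraint.
def licensing_interaction(items: dict) -> float:
    S = {cond: total_surprisal(sent) for cond, sent in items.items()}
    return (S[("-filler", "+gap")] - S[("+filler", "+gap")]) \
         - (S[("-filler", "-gap")] - S[("+filler", "-gap")])

# Illustrative items placing the gap site inside a Complex NP island,
# modeled on (1-b).
island_items = {
    ("+filler", "+gap"): "I know what the guide saw the lion that devoured last night.",
    ("+filler", "-gap"): "I know what the guide saw the lion that devoured a gazelle last night.",
    ("-filler", "+gap"): "I know that the guide saw the lion that devoured last night.",
    ("-filler", "-gap"): "I know that the guide saw the lion that devoured a gazelle last night.",
}
print(f"island interaction: {licensing_interaction(island_items):.2f} bits")
```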