Look at that! BERT can be easily distracted from paying attention to morphosyntax

Rui P. Chaves, Stephanie N. Richter

February 2021

Abstract

Syntactic knowledge involves not only the ability to combine words and phrases, but also the capacity to relate different yet truth-preserving structural variations (e.g. passivization, inversion, topicalization, extraposition, and clefting), as well as the ability to infer that these syntactic variations all adhere to common morphosyntactic rules, such as subject-verb agreement. Although there is some evidence that BERT has rich syntactic knowledge, our adversarial approach suggests that this knowledge is not deployed in a robust and linguistically appropriate way. In comparison with GPT-2, English BERT can be tricked into missing even quite simple syntactic generalizations, which underscores the need for stronger priors and for linguistically controlled experiments in evaluation.

Type

Conference paper

Publication

Proceedings of the Society for Computation in Linguistics