Distinguishing 'possible' from 'probable' meaning shifts: How distributions impact linguistic theory
Speaker:
James Pustejovsky
Abstract:
In this talk, I discuss the changing role of data in modeling natural language, as captured in linguistic theories. The generative tradition, promoted by Chomsky in the 1950s, of introducing data only through 'evaluation procedures' rather than 'discovery procedures' is slowly being unraveled by the exploitation of significant language datasets that were unthinkable in the 1960s. Evaluation procedures focus on the possible generative devices of a language, unconstrained by the actual (probable) occurrences of its constructions. After showing how both procedures are natural to scientific inquiry, I describe the inherent tension between data and the theory that aims to model it, with specific reference to the nature of the lexicon and semantic selection. The seeming chaos of organic data inevitably violates our theoretical assumptions. But in the end, it is the restrictions apparent in the data that call for postulating structure within a revised theoretical model.