Abstract
A central problem for NLP is grammar induction: the development of unsupervised learning algorithms for syntax. In this paper we present a lattice-theoretic representation for natural language syntax, called Distributional Lattice Grammars. These representations are objective or empiricist, based on a generalisation of distributional learning, and are capable of representing all regular languages, some but not all context-free languages, and some non-context-free languages. We present a simple algorithm for learning these grammars, together with a complete, self-contained proof of the algorithm's correctness and efficiency.
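The abstract does not spell out the formalism, but distributional learning of this kind is typically built on a Galois connection between sets of substrings and sets of contexts, with the closed sets forming the lattice. The Python sketch below illustrates those polar maps on a toy sample; the helper names (`contexts_shared`, `strings_shared`, `close`) and the sample data are illustrative assumptions rather than the paper's own definitions.

```python
# Hypothetical sketch of the distributional machinery the abstract alludes
# to: the Galois connection between substring sets and context sets,
# computed over a finite sample. Names and toy data are assumptions,
# not the paper's actual formalism.

def all_substrings(sample):
    """Every non-empty contiguous substring occurring in the sample."""
    return {s[i:j] for s in sample
            for i in range(len(s))
            for j in range(i + 1, len(s) + 1)}

def all_contexts(sample):
    """Every context (l, r) obtained by excising a non-empty substring
    from some sentence in the sample."""
    return {(s[:i], s[j:]) for s in sample
            for i in range(len(s))
            for j in range(i + 1, len(s) + 1)}

def contexts_shared(strings, sample):
    """Polar map S -> S': contexts (l, r) with l + w + r in the sample
    for every w in S."""
    return {(l, r) for (l, r) in all_contexts(sample)
            if all(l + w + r in sample for w in strings)}

def strings_shared(ctxs, sample):
    """Polar map C -> C': substrings w with l + w + r in the sample
    for every context (l, r) in C."""
    return {w for w in all_substrings(sample)
            if all(l + w + r in sample for (l, r) in ctxs)}

def close(strings, sample):
    """Closure S -> S''. Closed sets, ordered by inclusion, form the
    concept lattice that a distributional representation works over."""
    return strings_shared(contexts_shared(strings, sample), sample)

if __name__ == "__main__":
    # Toy sample from {a^n b^n}, a context-free language.
    sample = {"ab", "aabb", "aaabbb"}
    print(close({"ab"}, sample))    # {'ab'}
    print(close({"aabb"}, sample))  # {'ab', 'aabb'}
```

On this sample, `close({"aabb"}, sample)` groups `ab` and `aabb` together because both occur in the contexts `('', '')` and `('a', 'b')`; this kind of distributional substitutability is what the lattice encodes.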
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings of the Fourteenth Conference on Computational Natural Language Learning |
| Publisher | Association for Computational Linguistics |
| Pages | 28–37 |
| Number of pages | 10 |
| Publication status | Published - 1 Jul 2010 |