Abstract
Strings can be mapped into Hilbert spaces using feature maps such as the Parikh map. Languages can then be defined as the pre-images of hyperplanes in the feature space, rather than by grammars or automata. These are the planar languages. In this paper we show that, using techniques from kernel-based learning, we can represent and efficiently learn, from positive data alone, various linguistically interesting context-sensitive languages. In particular, we show that the cross-serial dependencies in Swiss German, which established the non-context-freeness of natural language, are learnable using a standard kernel. We demonstrate polynomial-time identifiability in the limit for these classes, discuss some of their language-theoretic properties, and examine their relationship to the choice of kernel/feature map.
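As an illustration (not taken from the paper), the Parikh map sends a string to its vector of symbol counts, and a planar language is the set of strings whose feature vectors lie on a given hyperplane. The following minimal Python sketch, with hypothetical function names, shows the idea for the simple planar language of strings over {a, b} with equal numbers of a's and b's:

```python
from collections import Counter

def parikh_map(s, alphabet):
    """Map a string to its Parikh vector: the count of each alphabet symbol."""
    counts = Counter(s)
    return [counts[a] for a in alphabet]

def in_hyperplane_language(s, weights, bias, alphabet):
    """Test membership in the pre-image of the hyperplane w . v + b = 0."""
    v = parikh_map(s, alphabet)
    return sum(w * x for w, x in zip(weights, v)) + bias == 0

# Hyperplane count(a) - count(b) = 0 defines the language of strings
# with equally many a's and b's -- a planar language under the Parikh map.
alphabet = "ab"
print(in_hyperplane_language("aabb", [1, -1], 0, alphabet))  # True
print(in_hyperplane_language("aab", [1, -1], 0, alphabet))   # False
```

Richer feature maps (kernels counting substrings rather than single symbols) yield correspondingly richer planar language classes, which is the direction the paper pursues.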
| Original language | Undefined/Unknown |
| --- | --- |
| Title of host publication | Proceedings of the 8th International Colloquium on Grammatical Inference (ICGI) |
| Pages | 148-160 |
| Number of pages | 13 |
| Volume | 4201 LNAI |
| Publication status | Published - 2006 |