Aprendizaje de gramáticas probabilísticas a partir de árboles sintácticos

Verdú Más, José Luis

Journal:

Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2003

Issue: 31

Pages: 175-182

Type: Article

Abstract

In this paper, we compare three approaches to building a probabilistic context-free grammar for natural language parsing from a treebank corpus: (1) a model that simply extracts the rules contained in the corpus and counts the occurrences of each rule; (2) a model that also stores information about the parent node's category; and (3) a model that estimates the probabilities according to a generalized k-gram scheme for trees with k = 3. The last model allows for faster parsing, considerably decreases the perplexity of test samples, and may be seen as a generalization of the classic n-gram models to the case of trees.
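As a rough illustration of the first two approaches, the sketch below reads rules off a tiny hand-made treebank and estimates each rule's probability by relative frequency, optionally annotating every nonterminal with its parent's category. It is not the paper's implementation: the toy trees, the nested-tuple tree encoding, the `^` annotation syntax, and the names `extract_rules` and `estimate_pcfg` are all assumptions made for this example, and the k = 3 model is not covered.

```python
from collections import Counter

# Hypothetical toy treebank. A tree is (label, child, child, ...);
# a leaf word is a plain string.
TOY_TREEBANK = [
    ("S", ("NP", "she"), ("VP", ("V", "sees"), ("NP", "stars"))),
    ("S", ("NP", "stars"), ("VP", ("V", "shine"))),
]


def extract_rules(tree, parent=None, annotate_parent=False):
    """Yield one (lhs, rhs) rule for every internal node of the tree."""
    if isinstance(tree, str):  # words produce no rule of their own
        return
    label, children = tree[0], tree[1:]
    # With parent annotation, a nonterminal X under parent P becomes "X^P".
    # The root has no parent and keeps its bare label.
    lhs = f"{label}^{parent}" if annotate_parent and parent else label
    rhs = []
    for child in children:
        if isinstance(child, str):
            rhs.append(child)
        elif annotate_parent:
            rhs.append(f"{child[0]}^{label}")
        else:
            rhs.append(child[0])
    yield lhs, tuple(rhs)
    for child in children:
        yield from extract_rules(child, label, annotate_parent)


def estimate_pcfg(treebank, annotate_parent=False):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter()
    for tree in treebank:
        rule_counts.update(extract_rules(tree, annotate_parent=annotate_parent))
    lhs_totals = Counter()
    for (lhs, _), count in rule_counts.items():
        lhs_totals[lhs] += count
    return {rule: count / lhs_totals[rule[0]]
            for rule, count in rule_counts.items()}


plain = estimate_pcfg(TOY_TREEBANK)
annotated = estimate_pcfg(TOY_TREEBANK, annotate_parent=True)
```

In this toy data, the plain model pools all `NP` expansions, while the parent-annotated model keeps separate distributions for `NP^S` and `NP^VP`. That gives sharper estimates per context, at the cost of a larger rule set.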

Data source: Dialnet