Grammar Engineering: Parsing with HPSG Grammars

Miguel Hormazábal

  • Grammar Engineering: Parsing with HPSG Grammars

    Miguel Hormazábal

  • Overview

    The Parsing Problem

    Parsing with constraint-based grammars

    Advantages and drawbacks

    Three different approaches

  • The Parsing Problem

    Given a Grammar and a Sentence,

    Can the grammar generate, or rule out, the input string?

    A candidate sentence must satisfy all the principles of the Grammar

    Coreferences as main explanatory mechanism in HPSG

  • Parsing with Constraint-based Grammars

    Object-based formalism

    Complex specifications on signs

    Structure sharing imposed by the theory

    Feature structures: sort-resolved and well-typed, with multiple information levels (PHON, SYNSEM)

    Universal / Language specific principles to be met

  • Advantages and Drawbacks

    Pros:

    A common formalism for all levels of linguistic information

    All information simultaneously available

    Cons:

    Hard to modularize

    Computational overhead for the parser

  • 1st Approach: Distributed Parsing

    Two kinds of constraints:

    Genuine: syntactic; they work as filters on the input

    Spurious: semantic; they build representational structures

    Parser cannot distinguish between analytical and structure-building constraints

    VERBMOBIL implementation:

    Input: word lattices of speech-recognition hypotheses

    The parser identifies the paths that form acceptable utterances

    Lattices can contain hundreds of hypotheses, most of them ungrammatical

    Goal: distribute the labour of evaluating the grammar's constraints across several processes

  • Distributed Parsing

    Analysis strategy: two parser units

    SYN-Parser:

    Works directly with word lattices

    Acts as a filter for the SEM-Parser

    SEM-Parser:

    Works only with successful analysis results

    Operates under the control of the SYN-Parser

  • Distributed Parsing

    Processing requirements:

    Incrementality: the SYN-Parser must not wait for a complete analysis before sending results, which would force the SEM-Parser to wait

    Interactivity: failed hypotheses must be reported back to the SYN-Parser

    Efficient communication system between the parsers, based on the common grammar

  • Distributed Parsing

    [Diagram comparing centralized parsing with distributed parsing]

  • Distributed Parsing

    Bottom-up hypotheses: emitted by the SYN-Parser and sent to the SEM-Parser for semantic verification

    Top-down hypotheses: emitted by the SEM-Parser; failures are reported back to the SYN-Parser

    Completion history:

    C-hist(NP-DET-N) := ((DET t0 t1) (N t1 t2))
    C-hist(det) := ((the t0 t1))
    C-hist(N) := ((example t1 t2))
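
    A minimal sketch of the completion history as a synchronization structure, with feature structures deliberately kept out of the messages. All names and types are illustrative, not taken from the VERBMOBIL code:

      # Completion histories map a constituent (or rule instance) to the
      # timed daughter spans that completed it; the parsers synchronize on
      # these instead of exchanging full feature structures.
      from typing import Dict, List, Tuple

      Span = Tuple[str, int, int]            # (label, start time, end time)
      CompletionHistory = Dict[str, List[Span]]

      c_hist: CompletionHistory = {
          "NP-DET-N": [("DET", 0, 1), ("N", 1, 2)],
          "det": [("the", 0, 1)],
          "N": [("example", 1, 2)],
      }

      def covers(entry: List[Span], start: int, end: int) -> bool:
          """Check that the daughter spans are adjacent and cover [start, end)."""
          pos = start
          for _label, s, e in entry:
              if s != pos:
                  return False
              pos = e
          return pos == end

      assert covers(c_hist["NP-DET-N"], 0, 2)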

  • Distributed Parsing

    Compilation of Subgrammars

    From a common source grammar:

    Straightforward option: split up the Grammar into syntax and semantics strata

    Manipulate grammar rules and lexical entries to obtain Gsyn and Gsem (a sketch follows below)
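
    A minimal sketch of one naive way to compile the two subgrammars, assuming feature structures are modeled as nested dicts and that semantic information lives under an illustrative CONT feature; the stratification in the actual paper is more refined:

      import copy

      def strip_features(fs: dict, banned: tuple) -> dict:
          """Copy a feature structure, dropping the named features at every level."""
          out = {}
          for feat, val in fs.items():
              if feat in banned:
                  continue                   # constraint removed, hence unchecked
              out[feat] = strip_features(val, banned) if isinstance(val, dict) else val
          return out

      rule = {
          "PHON": "...",
          "SYNSEM": {
              "CAT": {"HEAD": "noun", "SUBJ": []},    # syntactic constraints
              "CONT": {"RELN": "example-rel"},        # semantic structure building
          },
      }

      g_syn_rule = strip_features(rule, banned=("CONT",))  # filter-oriented Gsyn
      g_sem_rule = copy.deepcopy(rule)                     # structure-building Gsem

    Dropping constraints means each subgrammar accepts at least as much as the full grammar, which is exactly the soundness issue discussed in the Assessment slide.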

  • 2nd Approach: Data-Oriented Parsing

    Main goal: achieve domain adaptation to improve the efficiency of HPSG parsing

    Assumption: exploiting the frequency and plausibility of linguistic structures within a certain domain will render better results

    DOP processes new input by combining structure fragments from a treebank

    DOP allows probabilities to be assigned to arbitrarily large syntactic constructions

  • Data-Oriented Parsing

    Procedure:

    Parse all sentences from a training corpus using HPSG Grammar and Parser

    Automatic acquisition of a stochastic lexicalized tree grammar (SLTG)

    Each parse tree is decomposed into a set of subtrees.

    Assignment of probabilities to each subtree

  • Data-Oriented Parsing

    Implementation on the LKB, a unification-based grammar development, parsing, and generation platform

    First, parse each sentence of the training corpus

    The resulting feature structure contains the parse tree

    Each non-terminal node carries the label of the HPSG rule schema that was applied

    Each terminal node carries the lexical type of the corresponding feature structure

    After this, each parse tree is further processed

  • Data-Oriented Parsing

    1. Decomposition, two operations:

    Root creates passive (closed, complete) fragments by extracting substructures

    Frontier creates active (open, incomplete) fragments by deleting pieces of substructure

    Each non-head subtree is cut off, and the cutting point is marked for substitution.
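
    A minimal sketch of the head-driven cutting step, with an illustrative tree encoding; it conflates the Root and Frontier operations into a single pass and ignores the feature-structure pointers of the real system:

      from dataclasses import dataclass, field
      from typing import List

      @dataclass
      class Node:
          label: str                     # rule-schema or lexical-type label
          children: List["Node"] = field(default_factory=list)
          is_head: bool = False          # set by the head feature principle
          subst: bool = False            # marks a cutting point

      def decompose(node: Node, fragments: List[Node]) -> Node:
          """Cut off every non-head subtree; collect the cut pieces as fragments."""
          new_children = []
          for child in node.children:
              if child.is_head or not child.children:  # keep heads and lexical anchors
                  new_children.append(decompose(child, fragments))
              else:
                  fragments.append(decompose(child, fragments))
                  new_children.append(Node(child.label, subst=True))  # substitution site
          return Node(node.label, new_children, node.is_head)

      # Example: the DET subtree of an NP is cut off and marked for substitution.
      np = Node("NP", [Node("DET", [Node("the")]),
                       Node("N", [Node("example")], is_head=True)])
      fragments: List[Node] = []
      root_fragment = decompose(np, fragments)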

  • Data-Oriented Parsing

    2. Specialization: rule labels of the root node and substitution nodes are replaced with a corresponding category label. Example: signs whose local.cat.head value is of type noun and whose local.cat.val.subj feature is the empty list are classified as NPs.

    3. Probability: count the total number n of all trees with the same root label a; divide the frequency m of a tree t with root a by n to obtain p(t) = m / n. The probabilities of all trees t_i with the same root label sum to one: Σ_{t_i : root(t_i) = a} p(t_i) = 1.
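
    A small sketch of the relative-frequency estimate p(t) = m / n, grouping extracted trees by root label (the tree identities are illustrative strings):

      from collections import Counter

      extracted = [
          ("NP", "NP(DET,N)"), ("NP", "NP(DET,N)"), ("NP", "NP(N)"),
          ("VP", "VP(V,NP)"),
      ]

      freq = Counter(extracted)                            # m per tree
      per_root = Counter(root for root, _ in extracted)    # n per root label
      p = {tree: m / per_root[tree[0]] for tree, m in freq.items()}

      # Trees with the same root label form a probability distribution:
      assert abs(sum(v for (root, _), v in p.items() if root == "NP") - 1.0) < 1e-9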

  • Data-Oriented Parsing

    This implementation, built for the VerbMobil project, uses a chart-based, agenda-driven, bottom-up parser

    Step 1: selection of the set of SLTG trees associated with the lexical items in the input sentence

    Step 2: parsing of the sentence with respect to this set

    Step 3: each SLTG parse tree is expanded by unifying the feature constraints into the parse tree

    If unification succeeds, the result is a complete valid feature structure; otherwise the next most likely tree is expanded
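
    A minimal sketch of this fallback loop; unify_constraints is a hypothetical stand-in for the real expansion step of the system:

      from typing import Optional

      def unify_constraints(tree) -> Optional[dict]:
          """Placeholder: unify the HPSG source grammar's feature constraints
          into an SLTG parse tree; None signals unification failure."""
          ...

      def expand_best(trees_with_probs) -> Optional[dict]:
          # Try candidate SLTG parse trees in order of decreasing probability.
          for tree, _prob in sorted(trees_with_probs, key=lambda tp: -tp[1]):
              fs = unify_constraints(tree)
              if fs is not None:
                  return fs          # first successfully expanded tree wins
          return None                # all candidates rejected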

  • 3rd Approach: Probabilistic CFG Parsing

    Main goal: obtain the Viterbi parse (the highest-probability parse) given an HPSG and a probabilistic model

    One way: parse the input without using probabilities, then select the most probable parse by inspecting every result. Cost: an exponential search space

    This approach:

    Define an equivalence-class function (feature-structure reduction)

    Integrate semantic and syntactic preferences into figures of merit (FOMs)

  • Probabilistic CFG Parsing

    Probabilistic Model:

    HPSG grammar: G = ⟨L, R⟩, where L = { l = ⟨w, F⟩ | w ∈ W, F ∈ ℱ } is the set of lexical entries (W the set of words, ℱ the set of feature structures)

    R is a set of grammar rules, i.e., each r ∈ R is a partial function ℱ × ℱ → ℱ
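
    A minimal sketch of a rule as a partial function over feature structures: it yields a mother sign for some argument pairs and is simply undefined (None below) for all others. The entries are invented for illustration, not taken from any actual grammar:

      from typing import Optional

      FS = dict   # feature structures as plain dicts, for illustration

      def head_complement_rule(head: FS, comp: FS) -> Optional[FS]:
          """Defined only when the head's COMPS requirement matches the
          complement's category; otherwise the rule does not apply."""
          if head.get("COMPS") != comp.get("CAT"):
              return None                           # outside the rule's domain
          return {"CAT": head["CAT"], "COMPS": []}  # saturated mother sign

      has = {"CAT": "V", "COMPS": "VP"}
      come = {"CAT": "VP", "COMPS": []}

      assert head_complement_rule(has, come) == {"CAT": "V", "COMPS": []}
      assert head_complement_rule(come, has) is None   # undefined for this pair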

  • Probabilistic CFG Parsing

    Probabilistic HPSG:

    Probability p(F | w) of assigning feature structure F to a given sentence w (a log-linear model):

    p(F | w) = (1 / Z_w) exp( Σ_i λ_i σ(s_i, F) )

    where λ_i is a model parameter, s_i is a fragment of a feature structure, σ(s_i, F) is the number of appearances of fragment s_i in F, and Z_w is a normalizing constant

    Probabilities represent syntactic/semantic preferences expressed in a Feature Structure
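
    A small sketch of how such a log-linear model scores candidate feature structures for one sentence; the fragment names and weights are invented for illustration:

      import math
      from collections import Counter

      weights = {"head_comp": 0.9, "subj_head": 0.4, "rare_extraction": -1.2}

      def log_score(fragment_counts: Counter) -> float:
          """Sum of lambda_i * sigma(s_i, F) over the fragments found in F."""
          return sum(weights.get(s, 0.0) * n for s, n in fragment_counts.items())

      # Two candidate feature structures, each reduced to fragment counts.
      candidates = [
          Counter({"head_comp": 2, "subj_head": 1}),
          Counter({"head_comp": 1, "rare_extraction": 1}),
      ]

      scores = [log_score(c) for c in candidates]
      z = sum(math.exp(s) for s in scores)           # normalizing constant Z_w
      probs = [math.exp(s) / z for s in scores]      # p(F | w) per candidate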

  • Probabilistic CFG Parsing

    Implementation: iterative CYK parsing algorithm

    Edges are pruned during parsing

    The best N parses are tracked

    Feature structures are reduced through equivalence classes

    The reduction must neither overgenerate nor undergenerate

    FOMs computed with the reduced feature structures are equivalent to those of the originals

    The parser computes the Viterbi parse by taking the maximum of the probabilities of edges with the same non-terminal symbol at each point
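
    A compact sketch of Viterbi CYK with per-cell pruning: each chart cell keeps only the best probability per (equivalence-class) symbol, which is what keeps the search tractable. Grammar, lexicon, and probabilities are invented:

      from collections import defaultdict

      def viterbi_cyk(words, lexicon, rules):
          """lexicon: word -> [(A, p)]; rules: (B, C) -> [(A, p)]."""
          n = len(words)
          chart = defaultdict(dict)      # (i, j) -> {symbol: best probability}
          for i, w in enumerate(words):
              for a, p in lexicon.get(w, []):
                  chart[(i, i + 1)][a] = max(chart[(i, i + 1)].get(a, 0.0), p)
          for span in range(2, n + 1):
              for i in range(n - span + 1):
                  j = i + span
                  for k in range(i + 1, j):
                      for b, pb in chart[(i, k)].items():
                          for c, pc in chart[(k, j)].items():
                              for a, pr in rules.get((b, c), []):
                                  p = pr * pb * pc
                                  if p > chart[(i, j)].get(a, 0.0):
                                      chart[(i, j)][a] = p   # Viterbi maximum
          return chart[(0, n)]

      lexicon = {"spring": [("NP", 1.0)], "has": [("AUX", 1.0)], "come": [("VP", 1.0)]}
      rules = {("AUX", "VP"): [("VP", 0.8)], ("NP", "VP"): [("S", 0.9)]}
      print(viterbi_cyk(["spring", "has", "come"], lexicon, rules))  # {'S': 0.72}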

  • Assessment

    The three approaches all attempt to make the parsing process more efficient

    Distributed Parsing: unification and copying become faster, but the soundness of the grammar is affected: L(G) ⊆ L(Gsyn) ∩ L(Gsem), and the inclusion may be proper

    Data-Oriented Parsing: fragments at the right level of generality; straightforward probability computation

    PCFG Parsing: a highly efficient CYK parsing implementation through reduced feature structures and edge pruning

  • References

    Pollard, C. and Sag, I. A. (1994). Head-Driven Phrase Structure Grammar. Chicago, IL: University of Chicago Press.

    Richter, F. (2004b). A Web-based Course in Grammar Formalisms and Parsing. Textbook, MiLCA project A4, SfS, Universität Tübingen. http://milca.sfs.uni-tuebingen.de/A4/Course/PDF/gramandpars.pdf

    Levine, R. and Meurers, D. (2006). Head-Driven Phrase Structure Grammar: Linguistic Approach, Formal Foundations, and Computational Realization. In Keith Brown (Ed.), Encyclopedia of Language and Linguistics, Second Edition. Oxford: Elsevier.

    Diagne, A. K., Kasper, W., and Krieger, H.-U. (1995). Distributed Parsing With HPSG Grammars. In Proceedings of the 4th International Workshop on Parsing Technologies, IWPT-95, pages 79-86.

    Neumann, G. (2002). HPSG-DOP: Data-Oriented Parsing with HPSG. Unpublished manuscript, presented at the 9th International Conference on HPSG, HPSG-2002, Seoul, South Korea.

    Tsuruoka, Y., Miyao, Y., and Tsujii, J. (2004). Towards Efficient Probabilistic HPSG Parsing: Integrating Semantic and Syntactic Preference to Guide the Parsing. In Proceedings of the IJCNLP-04 Workshop: Beyond Shallow Analyses - Formalisms and Statistical Modeling for Deep Analyses.

  • Speaker Notes

    In this presentation I am going to talk about parsing in the context of unification-based, constraint-based grammars.

    The parsing problem can be summarized as follows: given a grammar and a string, does the grammar predict the grammaticality or ungrammaticality of the string?

    As we already know, HPSG relies heavily on information stored in complex objects structured in terms of attribute-value pairs, which must satisfy the constraints of the grammar.

    Advantages and disadvantages of unification-based theories:

    They support a common formalism and data structures at all levels of linguistic description; the information encoded in all linguistic domains is simultaneously available.

    They are hard to fragment into modules, and the computational cost is high when a parser uses the complete descriptions.

    The basic premise of the first approach is to develop a more flexible way of using the HPSG formalism for parsing, instead of a parsing process with full informational power.

    Two kinds of constraints: genuine constraints, related to the well-formedness of the input, work as filters on the input; they are typically the syntactic constraints. Spurious constraints, related to the output for other components in the system, build representational structures; they are typically the semantic constraints.

    A parser cannot distinguish between analytical and structure-building constraints and the cost of these operations increases exponentially with the size of the structures analysed.

    In the context of the VERBMOBIL project, the parser input consists of word lattices of hypotheses from speech recognition. The parser has to identify the paths that form acceptable utterances. Lattices can contain several hundred hypotheses, most of them ungrammatical.

    The main point of this strategy is that one of the analysis processes works as a filter on the word lattices in order to reduce the search space. In this way the second component works only with successful analysis results. This means that the parsers do not actually work in parallel; the first is in control of the second, which is not directly exposed to the input lattices.

    Processing requirements: incrementality and interactivity. These imply efficient message exchange between the parsers. The basic idea is not to lose the gains from distributed parsing, which means that a communication system based on sharing analysis results as feature structures must be discarded (the only way to encode/decode them is through large strings).

    Bottom-up hypotheses are emitted by the SYN-Parser and sent to the SEM-Parser for verification at the semantic level. A bottom-up hypothesis describes a complete subtree built by the SYN-Parser. Whether a hypothesis is sent also depends on its score, because the SYN-Parser is a probabilistic chart parser (Viterbi? Earley? CYK?) that uses a statistical language model as an additional knowledge source.

    Top-down hypotheses result from the activity of the SEM-Parser; failures are reported back to the SYN-Parser by sending a hypothesis identifier.

    Completion history: the central data structure by which synchronization and communication between the parsers is achieved is the completion history.

    The main idea of the second approach is to achieve domain adaptation to improve the efficiency of HPSG processing.

    The assumption is that focusing on the frequency and plausibility of linguistic structures with regard to a certain domain will render better results.

    Data-Oriented Parsing (DOP) is based on the idea of processing new input by combining fragments (associated with probabilities) that are extracted from a treebank. In the simplest case these fragments are subparts of simple phrase-structure trees.

    DOP allows probabilities to be associated with arbitrarily large syntactic constructions.

    The basic idea of HPSG-DOP is to parse all sentences of a representative training corpus using an HPSG grammar and parser

    in order to automatically acquire from the parsing results a stochastic lexicalized tree grammar (SLTG) such that each resulting parse tree is recursively decomposed into a set of subtrees.

    The decomposition operation is guided by the head feature principle of HPSG. Each extracted tree is automatically lexically anchored, and each node label of the extracted tree compactly represents a set of relevant features by means of a simple symbol. For each extracted tree a frequency counter is maintained, which is used to estimate the tree's probability after all parse trees have been processed.

    The approach has been implemented on top of the LKB (Linguistic Knowledge Builder) system, a unification-based grammar development, parsing, and generation platform developed at CSLI as open-source software.

    Learning of an SLTG starts by parsing each sentence s_i of the training corpus with the source HPSG system. The resulting feature structure fs_i of each example also contains the parse tree pt_i, where each non-terminal node carries the label of the HPSG rule schema (e.g., the head-complement rule) that was applied during the corresponding derivation step, as well as a pointer to the feature structure of the corresponding sign.

    The label of each terminal node consists of the lexical type of the corresponding feature structure. Each parse tree pt_i is now processed by the following interleaved steps.

    The first step in processing the parse trees is decomposition, with two operations: Root creates passive (closed, complete) fragments by extracting substructures; Frontier produces active (open, incomplete) fragments by deleting pieces of substructure.

    Each parse tree is recursively decomposed into a set of subtrees such that each non-head subtree is cut off, and the cutting point is marked for substitution.

    Specialization: the root node as well as all substitution nodes of an extracted tree are further processed by replacing the rule label with a corresponding category label. The possible set of category labels is defined in the type hierarchy of the HPSG source grammar; they express equivalence classes for different phrasal signs.

    For example, phrasal signs whose value of the local.cat.head feature is of type noun, and whose value of the local.cat.val.subj feature is the empty list, are classified as NPs.

    Probability: for each extracted tree a frequency counter is maintained. After all parse trees of the training set have been decomposed and specialized, we estimate a tree's probability as follows: first we count the total number n of all trees which have the same root label a; then we divide the frequency m of a tree t with root a by n. This gives the probability p(t) = m / n.

    In doing so, the sum of the probabilities of all trees t_i with root a is 1 (i.e., Σ_{t_i : root(t_i) = a} p(t_i) = 1).

    Since we have only the substitution operation, the probability of a derivation is the product of the probabilities of the trees involved in that derivation, and the probability of a parse tree is the sum of the probabilities of all its derivations, as sketched below.
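
    A tiny numeric sketch of these two rules, with invented probabilities:

      from math import prod

      # Two derivations of the same parse tree, each listed as the
      # probabilities of the SLTG trees it substitutes together.
      derivations = [
          [0.5, 0.4],    # derivation 1: two fragments
          [0.2],         # derivation 2: one larger fragment
      ]

      p_derivations = [prod(d) for d in derivations]   # 0.2 and 0.2
      p_parse_tree = sum(p_derivations)                # 0.4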

    The two major steps of parsing are: tree selection, the selection of a set of SLTG trees associated with the lexical items in the input sentence; and tree combination, the parsing of the sentence with respect to this set.

    After parsing, each SLTG parse tree is expanded by unifying the feature constraints of the HPSG source grammar into the parse tree. If successful, this determines a complete valid feature structure with respect to the HPSG grammar (e.g., all agreement and semantic information). If unification fails, the SLTG parse tree is rejected, and the next most likely tree is expanded.

    The third approach presents a unified framework of parsing to obtain the Viterbi parse given an HPSG and its probabilistic model.

    It defines an equivalence-class function to reduce multiple feature structures to a single feature structure that yields the same resulting figure of merit (FOM).

    With this function, the parser can integrate semantic and syntactic preferences into FOMs during parsing, and reduce the search space by using the integrated FOMs.

    In mathematics, a partial function is a binary relation that associates each element of a set, sometimes called its domain, with at most one element of another (possibly the same) set, called its codomain. However, not every element of the domain has to be associated with an element of the codomain.

    In HPSG, a small number of schemata explain general grammatical constraints, while a large number of lexical entries express word-specific characteristics. Both schemata and lexical entries are represented by typed feature structures, and constraints represented by feature structures are checked with unification (for details, see Pollard and Sag, 1994).

    Figure 1 of that paper shows an example of HPSG parsing of the sentence "Spring has come". First, each of the lexical entries for "has" and "come" is unified with a daughter feature structure of the Head-Complement Schema. Unification provides the phrasal sign of the mother. The sign of the larger constituent is obtained by repeatedly applying schemata to lexical/phrasal signs. Finally, the parse result is output as a phrasal sign that dominates the entire sentence. A unification sketch follows below.
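
    A minimal sketch of constraint checking by unification, with nested dicts standing in for typed feature structures; the lexical entries are heavily simplified stand-ins, not the paper's actual signs:

      from typing import Optional

      def unify(a, b) -> Optional[object]:
          """Recursive unification over nested dicts; None signals a clash."""
          if not isinstance(a, dict) or not isinstance(b, dict):
              return a if a == b else None
          out = dict(a)
          for feat, val in b.items():
              if feat in out:
                  sub = unify(out[feat], val)
                  if sub is None:
                      return None        # feature clash: unification fails
                  out[feat] = sub
              else:
                  out[feat] = val
          return out

      # Head-Complement Schema, simplified: the head daughter's COMPS value
      # must unify with the complement daughter's sign.
      has = {"HEAD": "verb", "COMPS": {"HEAD": "verb", "VFORM": "base"}}
      come = {"HEAD": "verb", "VFORM": "base"}

      assert unify(has["COMPS"], come) is not None      # "has come" is licensed
      assert unify(has["COMPS"], {"HEAD": "noun"}) is None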

    Comparison of the three approaches, their possibilities and pros and cons, and how each of them relates to the HPSG of our course:

    Distributed Parsing: although unification and copying become faster (due to the smaller structures), the main drawback of this approach is that it no longer guarantees soundness; the subgrammars can accept input that is ruled out by the full grammar, because some constraints are neglected in the subgrammars.

    The intersection of the languages accepted by Gsem and Gsyn doesn't yield the language accepted by G.

    Data-Oriented Parsing has a number of attractions: (i) it is able to produce fragments at the right level of generality, so as to avoid both under- and over-generation, and to avoid the combinatory explosion of fragments which accompanies use of a Discard operation; (ii) the probability model (relative frequency estimation) is very simple and easy to understand; (iii) the model is probabilistically well behaved, in the sense of defining a genuine probability distribution over fragments and representations. Notice that this means it may in principle provide a general foundation for probabilistic HPSG.