Grammar Engineering: Parsing with HPSG Grammars
Miguel Hormazábal
Overview
The Parsing Problem
Parsing with constraint-based grammars
Advantages and drawbacks
Three different approaches
The Parsing Problem
Given a grammar and a sentence:
Does the grammar < S, Θ > generate the input string, or rule it out?
A candidate sentence must satisfy all the principles of the Grammar
Coreferences as main explanatory mechanism in HPSG
Parsing with Constraint-based Grammars
Object-based formalism
Complex specifications on signs
Structure sharing imposed by the theory
Feature structures: sort-resolved and well-typed
Multiple information levels (PHON, SYNSEM)
Universal / Language specific principles to be met
Advantages and Drawbacks
Pros:
A common formalism for all levels of linguistic Information
All information simultaneously available
Cons:
Hard to modularize
Computational overhead for parser
1st Approach: Distributed Parsing
Two kinds of constraints:
Genuine: syntactic, they work as filters on the input
Spurious: semantic, they build representational structures
Parser cannot distinguish between analytical and structure-building constraints
VERBMOBIL implementation:
Input: word lattices of speech recognition hypotheses
The parser identifies those paths that form acceptable utterances
Lattices can contain hundreds of hypotheses, most of them ungrammatical
Goal: Distribute the labour of evaluating the constraints in the grammar across several processes
Distributed Parsing
Analysis strategy:
Two parser units:
SYN-Parser:
Works directly with word lattices
Acts as a filter for the SEM-Parser
SEM-Parser:
Works only with successful analysis results
Runs under the control of the SYN-Parser
Distributed Parsing
Processing requirements:
Incrementality: The SYN-Parser must not hold back its results until it has a complete analysis, since that would force the SEM-Parser to wait
Interactivity: The SEM-Parser must report back to the SYN-Parser when one of its hypotheses has failed
Efficient communication system between the parsers, based on the common grammar
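The two requirements above can be illustrated with a minimal sketch. It assumes hypotheses are simple (category, start, end) triples and that the syntactic and semantic checks are supplied as plain predicates; all names here (syn_parser, sem_parser, Hypothesis) are hypothetical and not taken from the VERBMOBIL system.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical representation: (category, start, end) over lattice time points.
Hypothesis = Tuple[str, int, int]

def syn_parser(edges: Iterable[Hypothesis],
               is_syntactic: Callable[[Hypothesis], bool]) -> Iterator[Hypothesis]:
    """Filter lattice edges and emit each accepted hypothesis right away."""
    for edge in edges:
        if is_syntactic(edge):
            # Incrementality: the hypothesis is passed on as soon as it is found,
            # not only after a complete analysis.
            yield edge

def sem_parser(hypotheses: Iterable[Hypothesis],
               is_semantic: Callable[[Hypothesis], bool]
               ) -> Tuple[List[Hypothesis], List[Hypothesis]]:
    """Verify bottom-up hypotheses; return (accepted, failed)."""
    accepted: List[Hypothesis] = []
    failed: List[Hypothesis] = []
    for hyp in hypotheses:
        # Interactivity: failures are collected so they can be reported back.
        (accepted if is_semantic(hyp) else failed).append(hyp)
    return accepted, failed
```

Because syn_parser is a generator, sem_parser starts verifying the first hypothesis before the remaining lattice edges have been examined. A real system would report failures over a communication channel between separate processes; returning them as a list is a simplification.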
Distributed Parsing
[Figure: Distributed Parsing vs. Centralized Parsing]
Distributed Parsing
Bottom-Up Hypotheses: emitted by the SYN-Parser and sent to the SEM-Parser for semantic verification
Top-Down Hypotheses: emitted by the SEM-Parser; failures are reported back to the SYN-Parser
Completion History:
C-hist(NP-DET-N) := ((DET t0 t1) (N t’1 t2))
C-hist(DET) := ((“the” t0 t1))
C-hist(N) := ((“example” t’1 t2))
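One way to read the completion histories above is as a table that records, for each completed item, the sub-hypotheses it was built from. The sketch below stores the three entries from the slide in a dictionary and follows them down to the words. The dictionary layout and the helper yield_of are assumptions made for illustration.

```python
# Completion histories from the slide, keyed by the completed item.
# Each entry is (label, start, end); time points are kept as plain strings.
c_hist = {
    "NP-DET-N": [("DET", "t0", "t1"), ("N", "t'1", "t2")],
    "DET": [("the", "t0", "t1")],
    "N": [("example", "t'1", "t2")],
}

def yield_of(item):
    """Follow completion histories down to the words an item was built from."""
    if item not in c_hist:
        return [item]  # a word: it has no further history
    words = []
    for label, _start, _end in c_hist[item]:
        words.extend(yield_of(label))
    return words
```

Since both parsers can compute the same history for a hypothesis, such a table gives them a shared way to refer to it.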
Distributed Parsing
Compilation of Subgrammars
Both subgrammars are derived from a common source grammar
Straightforward option: split up the grammar into syntax and semantics strata
Grammar rules and lexical entries are manipulated to obtain Gsyn and Gsem
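The straightforward split can be sketched on feature structures written as nested dictionaries. Which features count as semantic is an assumption here (the set SEM_FEATURES is hypothetical), and the sketch ignores coreferences between the two strata.

```python
# Hypothetical choice of features that belong to the semantics stratum.
SEM_FEATURES = {"content", "context"}

def split_strata(fs):
    """Split a nested-dict feature structure into (syntactic, semantic) parts."""
    syn, sem = {}, {}
    for feat, val in fs.items():
        if feat in SEM_FEATURES:
            sem[feat] = val                 # whole substructure goes to Gsem
        elif isinstance(val, dict):
            sub_syn, sub_sem = split_strata(val)
            if sub_syn:
                syn[feat] = sub_syn
            if sub_sem:
                sem[feat] = sub_sem
        else:
            syn[feat] = val                 # atomic non-semantic value goes to Gsyn
    return syn, sem
```

Applying split_strata to every rule and lexical entry of the source grammar would give one candidate pair Gsyn and Gsem. Dropping the structure sharing between the strata is what makes the split cheap, and it is also why the split can change the language the grammar accepts.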
2nd Approach: Data-Oriented Parsing
Main goal: achieve domain adaptation to improve efficiency of HPSG parsing
Assumption: exploiting the frequency and plausibility of linguistic structures within a certain domain will yield better results
DOP processes new input by combining structure fragments from a treebank
DOP allows probabilities to be assigned to arbitrarily large syntactic constructions
Data-Oriented Parsing
Procedure:
Parse all sentences from a training corpus using HPSG Grammar and Parser
Automatic acquisition of a stochastic lexicalized tree grammar (SLTG)
Each parse tree is decomposed into a set of subtrees.
Assignment of probabilities to each subtree
Data-Oriented Parsing
Implementation using the LKB, a unification-based grammar, parsing and generation platform
First parse each sentence of the training corpus
The resulting Feature Structure contains the parse tree
Each non-terminal node contains the label of the HPSG-rule schema applied
Each terminal node contains the lexical type of the corresponding feature structure
After this, each parse tree is further processed
Data-Oriented Parsing
1. Decomposition, two operations:
Root creates ‘passive’ (closed, complete) fragments by extracting substructures
Frontier creates ‘active’ (open, incomplete) fragments by deleting pieces of substructure
Each non-head subtree is cut off, and the cutting point is marked for substitution.
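The cutting step can be sketched as follows. The sketch covers only the last point (cutting off non-head subtrees along the head path and marking the cutting points), and it assumes each node records which daughter is the head. The Node class and decompose function are hypothetical names, not part of the LKB.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str                                    # rule schema name or lexical type
    children: List["Node"] = field(default_factory=list)
    head: Optional[int] = None                    # index of the head daughter
    subst: bool = False                           # True marks a substitution site

def decompose(tree: Node) -> List[Node]:
    """Cut off every non-head subtree along the head path; return all fragments."""
    fragments: List[Node] = []

    def cut(node: Node) -> Node:
        kids = []
        for i, child in enumerate(node.children):
            if i == node.head:
                kids.append(cut(child))                      # stay on the head path
            else:
                fragments.extend(decompose(child))           # decompose the cut-off part
                kids.append(Node(child.label, subst=True))   # mark the cutting point
        return Node(node.label, kids, node.head)

    fragments.insert(0, cut(tree))
    return fragments
```

Each resulting fragment keeps one head path from its root down to a lexical node, so every fragment is anchored in a single lexical item.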
Data-Oriented Parsing
2. Specialization: Rule labels of the root node and of the substitution nodes are replaced with a corresponding category label.
Example: signs with a local.cat.head value of type noun, and with the empty list as their local.cat.val.subj value, are classified as NPs.
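The NP example can be written as a small classification function over a sign represented as a nested dictionary. Only the NP rule comes from the text. The dictionary layout, the function name and the fallback label are assumptions.

```python
def category_label(sign):
    """Map a sign (nested dict) to a coarse category label."""
    cat = sign["local"]["cat"]
    # NP rule from the text: head of type noun and an empty subj list.
    if cat["head"] == "noun" and cat["val"]["subj"] == []:
        return "NP"
    # Other categories would need their own rules, which the source does not give.
    return "UNKNOWN"
```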
3. Probability:
Count the total number n of all trees with the same root label α
Divide the frequency m of a tree t with root α by n: $p(t) = m / n$
The probabilities of all trees $t_i$ with root α sum to 1:
$\sum_{t_i : \mathrm{root}(t_i) = \alpha} p(t_i) = 1$
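The relative-frequency estimate can be sketched in a few lines. The sketch assumes each observed tree occurrence is given as a (root label, tree identifier) pair; the function name and this input format are hypothetical.

```python
from collections import Counter

def tree_probabilities(trees):
    """trees: list of (root_label, tree_id) pairs, one per observed occurrence."""
    freq = Counter(trees)                            # m: frequency of each tree
    root_total = Counter(root for root, _ in trees)  # n: all trees with that root
    return {tree: m / root_total[tree[0]] for tree, m in freq.items()}
```

For every root label, the probabilities returned for the trees with that root add up to 1 by construction.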
Data-Oriented Parsing
This implementation for the VerbMobil project uses a chart-based, agenda-driven bottom-up parser
Step 1: Selection of a set of SLTG-trees associated with the lexical items in the input sentence
Step 2: Parsing of the sentence with respect to this set.
Step 3: Each SLTG-parse tree is “expanded” by unifying the feature constraints into the parse trees
If successful, the result is a complete, valid feature structure
Otherwise, the next most likely tree is expanded
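Step 3 can be sketched as a loop over candidate trees in order of decreasing probability. The sketch assumes feature constraints are nested dictionaries and uses a simplified unification without coreferences; unify and best_valid_parse are hypothetical names.

```python
def unify(a, b):
    """Unify two nested-dict feature structures; return None on a clash."""
    result = dict(a)
    for feat, val in b.items():
        if feat not in result:
            result[feat] = val
        elif isinstance(result[feat], dict) and isinstance(val, dict):
            sub = unify(result[feat], val)
            if sub is None:
                return None
            result[feat] = sub
        elif result[feat] != val:
            return None
    return result

def best_valid_parse(candidates):
    """candidates: list of (probability, [feature constraints]) per SLTG parse tree."""
    for _prob, constraints in sorted(candidates, key=lambda c: c[0], reverse=True):
        fs = {}
        for constraint in constraints:
            fs = unify(fs, constraint)
            if fs is None:
                break          # expansion failed: try the next most likely tree
        else:
            return fs          # all constraints unified: a valid feature structure
    return None
```

The expensive unification is only run on trees that the stochastic grammar already ranks highly, which is where the efficiency gain is expected to come from.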
3rd Approach: Probabilistic CFG Parsing
Main goal: to obtain the Viterbi parse (highest probability) given an HPSG and a probabilistic model
One way:
Parse the input without using probabilities
Then select the most probable parse by looking at every result
Cost: exponential search space
This approach:
Define an equivalence class function (F.S. reduction)
Integrate SEM and SYN preferences into figures of merit (FOMs)
Probabilistic CFG Parsing
Probabilistic Model:
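HPSG grammar: $G = \langle L, R \rangle$, where
$L = \{\, l = \langle w, F \rangle \mid w \in \mathcal{W},\ F \in \mathcal{F} \,\}$ is a set of lexical entries
$R$ is a set of grammar rules, i.e., each $r \in R$ is a partial function $r : \mathcal{F} \times \mathcal{F} \to \mathcal{F}$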
Probabilistic CFG Parsing
Probabilistic HPSG:
Probability p(F | w) of a feature structure F assigned to a given sentence w (the formula is not preserved in this transcript), where
λi is a model parameter,
si is a fragment of a feature structure, and
σ(si, F) is a function returning the number of appearances of the fragment si in F
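The symbols defined above (a weight λi per fragment and a count function σ) are the ingredients of a log-linear model. As an assumption consistent with these definitions, and not a formula recovered from the slide, the probability could take the following form, where the normalisation term $Z_w$ is introduced here and sums over all feature structures $F'$ that the grammar assigns to $w$:

```latex
\[
p(F \mid w) = \frac{1}{Z_w} \exp\Big( \sum_i \lambda_i \, \sigma(s_i, F) \Big),
\qquad
Z_w = \sum_{F'} \exp\Big( \sum_i \lambda_i \, \sigma(s_i, F') \Big)
\]
```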
Probabilities represent syntactic/semantic preferences expressed in a Feature Structure
Probabilistic CFG Parsing
Implementation:
Iterative CYK parsing algorithm
Pruning of edges during parsing
Best N parses are tracked
Feature structures are reduced through equivalence classes, with two requirements:
The reduced F.S. must not over- or undergenerate
FOMs computed with the reduced F.S. must be equivalent to those computed with the original
The parser calculates the Viterbi parse by taking the maximum of the probabilities of the same non-terminal symbol at each point
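The Viterbi step can be illustrated with a plain CYK chart over atomic symbols. This is a simplified sketch: it uses ordinary PCFG symbols in place of reduced feature structures, and it leaves out the iterative search, edge pruning and N-best tracking. The function name and the input formats are assumptions.

```python
from collections import defaultdict

def viterbi_cyk(words, lexicon, rules):
    """
    lexicon: {word: [(symbol, prob), ...]}
    rules:   {(left, right): [(parent, prob), ...]}  (binary rules only)
    Returns chart[(i, j)] = {symbol: best probability for words[i:j]}.
    """
    n = len(words)
    chart = defaultdict(dict)
    for i, word in enumerate(words):
        for sym, p in lexicon.get(word, []):
            chart[i, i + 1][sym] = max(p, chart[i, i + 1].get(sym, 0.0))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left, p_left in chart[i, k].items():
                    for right, p_right in chart[k, j].items():
                        for parent, p_rule in rules.get((left, right), []):
                            cand = p_rule * p_left * p_right
                            # Viterbi: keep only the maximum for the same
                            # non-terminal over the same span.
                            if cand > chart[i, j].get(parent, 0.0):
                                chart[i, j][parent] = cand
    return chart
```

Keeping a single best score per symbol and span is what avoids enumerating the exponentially many full parses.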
Assessment
The three approaches attempt to achieve a higher efficiency of the parsing process
Distributed Parsing:
Unification and copying are faster
Soundness of the grammar is affected: $L(G) \subset L(G_{syn}) \cap L(G_{sem})$
DO Parsing:
Fragments at the right level of generality
Straightforward probability computation
PCFG Parsing:
Highly efficient CYK parsing implementation through reduced F.S. and edge pruning
References
Pollard, C. and Sag, I. A. (1994). Head-Driven Phrase Structure Grammar . Chicago, IL: University of Chicago Press.
Richter, F. (2004b). A Web-based Course in Grammar Formalisms and Parsing. Textbook, MiLCA project A4, SfS, Universität Tübingen. http://milca.sfs.uni-tuebingen.de/A4/Course/PDF/gramandpars.pdf
Levine, R. and Meurers, D. (2006). Head-Driven Phrase Structure Grammar: Linguistic Approach, Formal Foundations, and Computational Realization. In Keith Brown (Ed.), Encyclopedia of Language and Linguistics, Second Edition. Oxford: Elsevier.
Diagne, A. K., Kasper, W., and Krieger, H.-U. (1995). Distributed Parsing With HPSG Grammars. In Proceedings of the 4th International Workshop on Parsing Technologies, IWPT-95, pages 79–86.
Neumann, G. (2002). HPSG-DOP: Data-oriented parsing with HPSG. Unpublished manuscript, presented at the 9th Int. Conf. on HPSG, HPSG-2002, Seoul, South Korea.
Tsuruoka, Y., Miyao, Y., and Tsujii, J. (2003). Towards efficient probabilistic HPSG parsing: integrating semantic and syntactic preference to guide the parsing. In Proceedings of the IJCNLP-04 Workshop: Beyond shallow analyses - Formalisms and statistical modeling for deep analyses.