XQuery from the Experts Chapter 5 – Introduction to the formal Semantics Νίκος Λούτας.

Post on 20-Dec-2015

XQuery from the Experts

Chapter 5 – Introduction to the formal Semantics

Νίκος Λούτας

Outline

Getting started with the formal semantics
• Dynamic Semantics
• Environments
• Matching Values and Types
• Errors
• Static Semantics
• Type Soundness
• Evaluation order
• Normalization

Outline (cont’d)

Learning more about XQuery
• Values and Types
• Matching and Subtyping
• FLWOR Expressions
• Path Expressions
• Implicit Coercion and Function calls
• Node Identity and Element Constructors

Getting Started…

The XQuery formal semantics describes a processing model that relates four phases:
• Query parsing: takes a query as input and produces a parse tree
• Normalization: transforms the parse tree into an equivalent parse tree in the core language
• Static analysis: produces a parse tree in which each expression has been assigned a type
• Dynamic evaluation: takes a parse tree in the core language and reduces its expressions to the XML value that is the result of the query

Dynamic Semantics

Evaluation takes an expression and returns a value

Expr ⇒ Value

Value ::= Boolean | Integer
Boolean ::= fn:true() | fn:false()
Integer ::= 0 | 1 | -1 | 2 | -2 | …

Dynamic Semantics (cont’d)

Expr ::= Value

| Expr < Expr

| Expr + Expr

| if (Expr) then Expr else Expr

e.g. 5 < 10, 1 + 2, if(1 < 2) then 4 + 5 else 6 + 7

Dynamic Semantics (cont’d)

Evaluation is described by five rules:

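i.
    --------------- (VALUE)
    Value ⇒ Value

ii.
    Expr0 ⇒ Integer0    Expr1 ⇒ Integer1
    -------------------------------------- (LT)
    Expr0 < Expr1 ⇒ Integer0 < Integer1

iii.
    Expr0 ⇒ Integer0    Expr1 ⇒ Integer1
    -------------------------------------- (SUM)
    Expr0 + Expr1 ⇒ Integer0 + Integer1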

Dynamic Semantics (cont’d)

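iv.
    Expr0 ⇒ fn:true()    Expr1 ⇒ Value
    ------------------------------------------ (IF-TRUE)
    if (Expr0) then Expr1 else Expr2 ⇒ Value

v.
    Expr0 ⇒ fn:false()    Expr2 ⇒ Value
    ------------------------------------------ (IF-FALSE)
    if (Expr0) then Expr1 else Expr2 ⇒ Value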

Dynamic Semantics (cont’d)

Example: Build a proof tree to evaluate the expression 1 + 2

    ------- (VALUE)    ------- (VALUE)
    1 ⇒ 1              2 ⇒ 2
    ---------------------------------- (SUM)
    1 + 2 ⇒ 3
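The five rules can be read as a small recursive evaluator. The following is a minimal sketch, not the book's implementation: the tuple encoding of expressions and the function names are assumptions made for illustration. Each branch corresponds to one rule, and an expression for which no rule applies is reported as stuck.

```python
# Assumed encoding (not from the slides): expressions are nested tuples
#   ("lt", e0, e1), ("sum", e0, e1), ("if", e0, e1, e2)
# and values are Python bool and int.

def is_integer(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return type(v) is int

def evaluate(expr):
    if isinstance(expr, (bool, int)):               # (VALUE)
        return expr
    op = expr[0]
    if op in ("lt", "sum"):
        v0, v1 = evaluate(expr[1]), evaluate(expr[2])
        if not (is_integer(v0) and is_integer(v1)):
            raise ValueError("no rule applies: operands must be integers")
        return v0 < v1 if op == "lt" else v0 + v1   # (LT) / (SUM)
    if op == "if":
        cond = evaluate(expr[1])
        if cond is True:                            # (IF-TRUE)
            return evaluate(expr[2])
        if cond is False:                           # (IF-FALSE)
            return evaluate(expr[3])
        raise ValueError("no rule applies: condition must be a boolean")
    raise ValueError(f"unknown expression: {expr!r}")
```

Calling evaluate(("sum", 1, 2)) follows the same three steps as the proof tree: two uses of (VALUE) and one use of (SUM).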

Environment

dynEnv ⊢ Expr ⇒ Value

• dynEnv is the dynamic environment
• An environment may have many components
• varValue is a map from variables to their values
• Binding a variable in an environment overrides any previous binding of that variable

Environment (cont’d)

Notation                                            Meaning
∅                                                   The initial environment, with an empty map
dynEnv.varValue(Var1 ⇒ Value1, …, Varn ⇒ Valuen)    The environment that maps Vari to Valuei
dynEnv + varValue(Var ⇒ Value)                      The environment identical to dynEnv except that it maps Var to Value
dynEnv.varValue(Var)                                The value of Var in dynEnv
dom(dynEnv.varValue)                                The set of variables mapped in dynEnv

Environment (cont’d)

Expr ::= …previous expressions…

| $Var

| let $Var := Expr return Expr

e.g. $x, let $x := 1 return $x + 2

Environment (cont’d)

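The five rules shown before need to be revised to carry the environment, e.g. (LT):

ii.
    dynEnv ⊢ Expr0 ⇒ Integer0    dynEnv ⊢ Expr1 ⇒ Integer1
    ----------------------------------------------------------- (LT)
    dynEnv ⊢ Expr0 < Expr1 ⇒ Integer0 < Integer1

Two more rules are added:

vi.
    dynEnv.varValue(Var) = Value
    ------------------------------ (VAR)
    dynEnv ⊢ $Var ⇒ Value

vii.
    dynEnv ⊢ Expr0 ⇒ Value0
    dynEnv + varValue(Var ⇒ Value0) ⊢ Expr1 ⇒ Value1
    --------------------------------------------------- (LET)
    dynEnv ⊢ let $Var := Expr0 return Expr1 ⇒ Value1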
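A minimal sketch of how (VAR) and (LET) thread the environment, assuming the dynamic environment is a plain Python dict standing in for varValue and using a hypothetical tuple encoding of expressions. Extending the environment builds a new dict, so a binding overrides an earlier one only inside the body of its let.

```python
# Assumed encoding: ("var", name), ("let", name, e0, e1), ("sum", e0, e1), ("lt", e0, e1).
# dyn_env is a dict playing the role of dynEnv.varValue.

def evaluate(dyn_env, expr):
    if isinstance(expr, (bool, int)):        # (VALUE)
        return expr
    op = expr[0]
    if op == "var":                          # (VAR): look the variable up
        return dyn_env[expr[1]]
    if op == "let":                          # (LET)
        _, var, e0, e1 = expr
        v0 = evaluate(dyn_env, e0)
        # dynEnv + varValue(Var => Value0): a new map, the old one is untouched.
        return evaluate({**dyn_env, var: v0}, e1)
    if op == "sum":                          # (SUM), revised to carry dynEnv
        return evaluate(dyn_env, expr[1]) + evaluate(dyn_env, expr[2])
    if op == "lt":                           # (LT), revised to carry dynEnv
        return evaluate(dyn_env, expr[1]) < evaluate(dyn_env, expr[2])
    raise ValueError(f"unknown expression: {expr!r}")
```

For example, let $x := 1 return $x + 2 is encoded as ("let", "x", 1, ("sum", ("var", "x"), 2)) and evaluated in the empty environment {}.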

Matching Values and Types

The value must match the variable's type; otherwise an exception is raised.

Expr ::= …previous expressions… | let $Var as Type := Expr return Expr

Type ::= xs:boolean | xs:integer

Static: type declarations are checked when the expression is analyzed

Dynamic: type declarations are checked when the expression is evaluated

Value matches Type

Matching Values and Types (cont’d)

Three new rules are added:

viii.
    --------------------------- (INT-MATCH)
    Integer matches xs:integer

ix.
    --------------------------- (BOOL-MATCH)
    Boolean matches xs:boolean

x.
    dynEnv ⊢ Expr0 ⇒ Value0
    Value0 matches Type
    dynEnv + varValue(Var ⇒ Value0) ⊢ Expr1 ⇒ Value1
    ----------------------------------------------------------- (LET-DECL)
    dynEnv ⊢ let $Var as Type := Expr0 return Expr1 ⇒ Value1
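The matching judgment and the (LET-DECL) rule can be sketched as follows. This is an illustration under assumptions: the tuple tag "let_as" and the use of Python's TypeError for a failed match are choices made here, not notation from the slides.

```python
def matches(value, type_name):
    if type_name == "xs:integer":            # (INT-MATCH)
        return type(value) is int
    if type_name == "xs:boolean":            # (BOOL-MATCH)
        return type(value) is bool
    return False

def evaluate(dyn_env, expr):
    if isinstance(expr, (bool, int)):
        return expr
    op = expr[0]
    if op == "var":
        return dyn_env[expr[1]]
    if op == "sum":
        return evaluate(dyn_env, expr[1]) + evaluate(dyn_env, expr[2])
    if op == "let_as":                       # (LET-DECL)
        _, var, type_name, e0, e1 = expr
        v0 = evaluate(dyn_env, e0)
        if not matches(v0, type_name):       # premise "Value0 matches Type" fails
            raise TypeError(f"{v0!r} does not match {type_name}")
        return evaluate({**dyn_env, var: v0}, e1)
    raise ValueError(f"unknown expression: {expr!r}")
```

The check happens after Expr0 has been evaluated and before the body is evaluated, mirroring the order of the premises in the rule.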

Errors

dynEnv├ Expr raises Error

Error ::= typeErr | dynErr

Expr ::= …previous expressions… | Expr idiv Expr

Errors (cont’d)

Type errors are triggered if an operand’s value does not match the operator’s required type

not (Value matches Type)

    dynEnv ⊢ Expr0 ⇒ Value0
    not (Value0 matches Type)
    ------------------------------------------------------------------
    dynEnv ⊢ let $Var as Type := Expr0 return Expr1 raises typeErr

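Errors (cont’d) - Dynamic errors

    dynEnv ⊢ Expr0 ⇒ Value0
    dynEnv ⊢ Expr1 ⇒ Value1
    Value1 ≠ 0
    ------------------------------------------------
    dynEnv ⊢ Expr0 idiv Expr1 ⇒ Value0 idiv Value1

    dynEnv ⊢ Expr1 ⇒ 0
    ------------------------------------------
    dynEnv ⊢ Expr0 idiv Expr1 raises dynErr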

Errors (cont’d)

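Example: what errors does the following expression raise?

(1 idiv 0) + (2 < 3)

Either typeErr or dynErr: the left operand raises dynErr (division by zero), and the right operand is a boolean where + requires an integer (typeErr).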
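One way to see why both answers are possible is to compute every outcome the rules allow. The sketch below makes two assumptions that the slides do not spell out: an error raised by either operand propagates to the whole expression, and a type error can be derived by looking at a single operand. The encoding and names are hypothetical.

```python
TYPE_ERR, DYN_ERR = "typeErr", "dynErr"

def is_error(result):
    return isinstance(result, str)

def idiv(a, b):
    # Integer division that truncates toward zero.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def outcomes(expr):
    """Set of all results (values or error names) derivable for expr."""
    if isinstance(expr, (bool, int)):
        return {expr}
    op, e0, e1 = expr                        # ("sum" | "lt" | "idiv", e0, e1)
    left, right = outcomes(e0), outcomes(e1)
    results = set()
    for side in (left, right):               # rules that inspect one operand only
        for r in side:
            if is_error(r):
                results.add(r)               # assumed: errors propagate
            elif type(r) is not int:
                results.add(TYPE_ERR)        # operand does not match xs:integer
    for v0 in left:
        for v1 in right:
            if type(v0) is not int or type(v1) is not int:
                continue
            if op == "idiv":
                results.add(DYN_ERR if v1 == 0 else idiv(v0, v1))
            elif op == "sum":
                results.add(v0 + v1)
            else:
                results.add(v0 < v1)
    return results
```

For the example expression the result set contains both error names and no value, which is the sense in which either error may be reported.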

Static Semantics

How static types are associated with expressions

Static typing takes a static environment and an expression and returns a type:

statEnv ⊢ Expr : Type

statEnv is the static environment, which captures the context available at query-analysis time (variables and their types)

If an expression type checks, there is no need to check for type errors at evaluation time

Static Semantics (cont’d)

Two rules assign types to values:

    ------------------------------- (BOOLEAN-STATIC)
    statEnv ⊢ Boolean : xs:boolean

    ------------------------------- (INTEGER-STATIC)
    statEnv ⊢ Integer : xs:integer

A third rule types the conditional:

    statEnv ⊢ Expr0 : xs:boolean
    statEnv ⊢ Expr1 : Type
    statEnv ⊢ Expr2 : Type
    ---------------------------------------------------- (IF-STATIC)
    statEnv ⊢ if (Expr0) then Expr1 else Expr2 : Type

We do not examine the value of the condition Expr0, because it is not known statically. We examine only its type, which must be boolean. The branches must have the same type.

Static Semantics (cont’d)

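Expression: if (1 < 3) then 3 + 4 else 5 + 6

Condition:
    statEnv ⊢ 1 : integer (INTEGER-STATIC)    statEnv ⊢ 3 : integer (INTEGER-STATIC)
    ------------------------------------------------------------------------------------
    statEnv ⊢ 1 < 3 : boolean

Then branch:
    statEnv ⊢ 3 : integer (INTEGER-STATIC)    statEnv ⊢ 4 : integer (INTEGER-STATIC)
    ------------------------------------------------------------------------------------
    statEnv ⊢ 3 + 4 : integer

Else branch:
    statEnv ⊢ 5 : integer (INTEGER-STATIC)    statEnv ⊢ 6 : integer (INTEGER-STATIC)
    ------------------------------------------------------------------------------------
    statEnv ⊢ 5 + 6 : integer

Conclusion:
    statEnv ⊢ 1 < 3 : boolean    statEnv ⊢ 3 + 4 : integer    statEnv ⊢ 5 + 6 : integer
    --------------------------------------------------------------------------------------- (IF-STATIC)
    statEnv ⊢ if (1 < 3) then 3 + 4 else 5 + 6 : integer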
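The derivation above can be mechanised as a recursive type checker. This is a sketch under assumptions: the slides name only BOOLEAN-STATIC, INTEGER-STATIC and IF-STATIC, so the typing of < and + used here (integer operands, boolean or integer result) is inferred from the example rather than quoted, and the encoding and the StaticTypeError class are hypothetical.

```python
class StaticTypeError(Exception):
    pass

def type_of(stat_env, expr):
    if type(expr) is bool:                           # (BOOLEAN-STATIC)
        return "xs:boolean"
    if type(expr) is int:                            # (INTEGER-STATIC)
        return "xs:integer"
    op = expr[0]
    if op == "var":                                  # look up the declared type
        return stat_env[expr[1]]
    if op in ("lt", "sum"):                          # assumed rules for < and +
        for operand in expr[1:]:
            if type_of(stat_env, operand) != "xs:integer":
                raise StaticTypeError(f"{op} needs integer operands")
        return "xs:boolean" if op == "lt" else "xs:integer"
    if op == "if":                                   # (IF-STATIC)
        _, e0, e1, e2 = expr
        if type_of(stat_env, e0) != "xs:boolean":
            raise StaticTypeError("condition must be xs:boolean")
        t1, t2 = type_of(stat_env, e1), type_of(stat_env, e2)
        if t1 != t2:
            raise StaticTypeError("branches must have the same type")
        return t1
    raise ValueError(f"unknown expression: {expr!r}")
```

Note that the checker never evaluates the condition: both branches are typed regardless of which one would run.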

Type Soundness

Suppose statEnv ⊢ Expr : Type. Then Expr either yields a value of that type or raises a dynamic error.

The judgment dynEnv matches statEnv captures the relationship between dynEnv and statEnv, e.g.

dynEnv1 := varValue(x ⇒ 1, y ⇒ 0, z ⇒ fn:false())
statEnv1 := varType(x : xs:integer, y : xs:integer, z : xs:boolean)

Type Soundness (cont’d)

Theorem for Values

if

dynEnv matches statEnv

dynEnv ⊢ Expr ⇒ Value

statEnv ⊢ Expr : Type

then

Value matches Type

Type Soundness (cont’d)

Example

dynEnv1 matches statEnv1

dynEnv1 ⊢ if ($z) then $x else $y ⇒ 0

statEnv1├ if ($z) then $x else $y : xs: integer

0 matches xs: integer

Type Soundness (cont’d)

Theorem for Errors

if

dynEnv matches statEnv

dynEnv ├ Expr raises Error

statEnv ├ Expr : Type

then

Error ≠ typeErr

Type Soundness (cont’d)

Example

dynEnv1 matches statEnv1

dynEnv1├ $x idiv $y raises dynErr

statEnv1├ $x idiv $y : xs: integer

Type Soundness (cont’d)

Remember that:

If an expression raises a type error, then it cannot type check, e.g.
dynEnv1 ⊢ $x + $z raises typeErr
statEnv1 ⊬ $x + $z : Type

An expression that does not raise a type error may still fail to type check statically, e.g.
dynEnv1 ⊢ if ($x < $y) then $x + $z else $y ⇒ 0
statEnv1 ⊬ if ($x < $y) then $x + $z else $y : Type

Evaluation Order

Expr and Expr

• The two expressions may be tested in either order
• Stop and return false if either one is false
• Raise an error if either one raises an error

Example: (1 idiv 0 < 2) and (4 < 3)

• Two possible results: dynErr or false
• The result depends on which operand is evaluated first
• Both results are correct
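The effect of evaluation order can be reproduced with a small sketch in which each operand is a zero-argument function, so that it is evaluated only when called. The names eval_and, DynErr and the left_first flag are assumptions made for illustration.

```python
class DynErr(Exception):
    """Stands in for the dynamic error dynErr."""

def idiv(a, b):
    if b == 0:
        raise DynErr("division by zero")
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def eval_and(left, right, left_first=True):
    # left and right are thunks; an implementation may pick either order.
    first, second = (left, right) if left_first else (right, left)
    if first() is False:
        return False          # stop early, the other operand is never evaluated
    return second()

# (1 idiv 0 < 2) and (4 < 3)
left = lambda: idiv(1, 0) < 2
right = lambda: 4 < 3
```

With left_first=True the call raises DynErr; with left_first=False it returns False without ever dividing by zero.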

Normalization

Normalization takes an expression in full XQuery and returns an equivalent expression in core XQuery:

[FullExpr]Expr == Expr

FullExpr ::= Expr | let $Var as Type := Expr where Expr return Expr

[Expr0 + Expr1]Expr == [Expr0]Expr + [Expr1]Expr

[$Var]Expr == $Var

[let $Var as Type := Expr0 where Expr1 return Expr2]Expr
    == let $Var as Type := [Expr0]Expr return if ([Expr1]Expr) then [Expr2]Expr else ()
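The normalization rules are a syntax-to-syntax rewrite, which the following sketch expresses as a function from trees to trees. The tuple tags ("let_where", "let_as", "empty") are hypothetical names chosen for this illustration; ("empty",) stands for the empty sequence ().

```python
def normalize(expr):
    if isinstance(expr, (bool, int)):
        return expr
    op = expr[0]
    if op == "var":                          # [$Var]Expr == $Var
        return expr
    if op in ("sum", "lt"):                  # normalize both operands
        return (op, normalize(expr[1]), normalize(expr[2]))
    if op == "let_where":                    # full-XQuery let ... where ... return
        _, var, type_name, e0, e1, e2 = expr
        return ("let_as", var, type_name, normalize(e0),
                ("if", normalize(e1), normalize(e2), ("empty",)))
    raise ValueError(f"unknown expression: {expr!r}")
```

The where clause disappears from the result: it becomes the condition of an if whose else branch is the empty sequence.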

Outline

So far we have covered:
• Dynamic Semantics
• Environments
• Matching Values and Types
• Errors
• Static Semantics
• Type Soundness
• Evaluation order
• Normalization

Outline (cont’d)

Now we will talk in more depth about:
• Values and Types
• Matching and Subtyping
• FLWOR Expressions
• Path Expressions
• Implicit Coercion and Function calls
• Node Identity and Element Constructors

Part II: Values and Types

A value is a sequence of zero or more items

Value ::= () | Item (, Item)*

Item ::= AtomicValue | NodeValue

AtomicValue ::= xs:integer(String) | xs:boolean(String) | xs:string(String) | xs:date(String)

e.g. xs:string("XQuery") (written "XQuery"), xs:boolean("false") (written fn:false())

Values and Types (cont’d)

NodeValue ::= element ElementName TypeAnnotation? { Value }

| text { String }

ElementName ::= QName

TypeAnnotation ::= of type TypeName

TypeName ::= QName

Values and Types (cont’d)

ItemType ::= NodeType | AtomicType

NodeType ::= ElementType | text ()

AtomicType ::= AtomicTypeName

AtomicTypeName ::= xs:string | xs:integer | xs:boolean | xs:date

Values and Types (cont’d)

ElementType ::= element((ElementName (, TypeName)?)?)

element(article)               global declaration
element(article, xs:string)    local declaration

Type ::= none() | empty() | ItemType
       | Type , Type
       | Type | Type
       | Type Occurrence

Occurrence ::= ? | + | *

Values and Types (cont’d)

SimpleType ::= AtomicTypeName
             | SimpleType | SimpleType
             | SimpleType Occurrence

Definition ::= define element ElementName TypeAnnotation
             | define type TypeName TypeDerivation

TypeDerivation ::= restricts AtomicTypeName | restricts TypeName { Type } | { Type }

Values and Types (cont’d)

Example

define element article of type Article
define type Article {
    element(name, xs:string),
    element(reserve_price, PriceList) *
}
define type PriceList restricts xs:anyType { xs:decimal * }

Matching and Subtyping

Matching relates complex XML values to complex types, e.g.

<reserve_price> 10.00 20.00 25.00 </reserve_price>

element reserve_price of type PriceList {10.0, 20.0, 25.0} matches element(reserve_price)

Subtyping checks whether one type is a subtype of another

Matching and Subtyping (cont’d)

Yields

ElementType yields element(ElementName, TypeName)

What an ElementType yields depends on its form:
• A reference to a global element yields the name of the element and the type annotation from the element declaration
• An element name with a type annotation yields the element name and the type name in the type annotation
• A wildcard name followed by a type name yields the wildcard name and the type name
• Neither an element name nor a type name yields the wildcard name and xs:anyType

Matching and Subtyping (cont’d)

Substitutes for

ElementName1 substitutes for ElementName2
• when the two names are equal
• when the second name is the wildcard *

An element name may substitute for itself:

statEnv ⊢ ElementName substitutes for ElementName

Matching and Subtyping (cont’d)

Derives

TypeName1 derives from TypeName2

e.g. PriceList derives from xs:anyType

Every type name derives from the type name that it is declared to restrict

Reflexive and transitive

Matching and Subtyping (cont’d)

Matches

Value matches Type e.g. (10.0, 20.0, 25.0) matches xs:decimal *

The empty sequence matches the empty sequence, e.g. () matches ().

If two values match two types, then their sequence matches the corresponding sequence type.

Matching and Subtyping (cont’d)

Matches

If a value matches a type, then it also matches a choice type in which that type is one of the choices.

    Value matches Type1
    ----------------------------
    Value matches Type1 | Type2

A value matches an optional occurrence of a type if it matches either the type or the empty sequence.

    Value matches empty() | Type
    -----------------------------
    Value matches Type ?
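The matching rules for sequence, choice and occurrence types behave much like regular-expression matching over a sequence of items. The sketch below is one possible way to decide the judgment, assuming atomic items only and a hypothetical tuple encoding of types; it is not the algorithm from the book. The helper ends returns every position at which a match starting at position i can stop.

```python
from decimal import Decimal

# Assumed type encoding: ("none",), ("empty",), ("atomic", name),
# ("seq", T1, T2), ("choice", T1, T2), ("occ", T, "?" | "+" | "*").
ATOMIC = {
    "xs:integer": lambda v: type(v) is int,
    "xs:boolean": lambda v: type(v) is bool,
    "xs:string": lambda v: isinstance(v, str),
    "xs:decimal": lambda v: isinstance(v, Decimal),
}

def ends(t, items, i):
    kind = t[0]
    if kind == "none":
        return set()
    if kind == "empty":
        return {i}
    if kind == "atomic":
        ok = i < len(items) and ATOMIC[t[1]](items[i])
        return {i + 1} if ok else set()
    if kind == "seq":
        return {k for j in ends(t[1], items, i) for k in ends(t[2], items, j)}
    if kind == "choice":
        return ends(t[1], items, i) | ends(t[2], items, i)
    if kind == "occ":
        inner, occ = t[1], t[2]
        result = {i} if occ in ("?", "*") else set()   # zero repetitions allowed
        frontier, seen = {i}, {i}
        while frontier:
            step = {k for j in frontier for k in ends(inner, items, j)}
            result |= step
            if occ == "?":                              # at most one repetition
                break
            frontier = step - seen                      # only explore new positions
            seen |= step
        return result
    raise ValueError(f"unknown type: {t!r}")

def matches(value, t):
    # The whole sequence must be consumed.
    return len(value) in ends(t, value, 0)
```

For instance, matches([Decimal("10.0"), Decimal("20.0"), Decimal("25.0")], ("occ", ("atomic", "xs:decimal"), "*")) corresponds to the earlier example (10.0, 20.0, 25.0) matches xs:decimal *.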

Matching and Subtyping (cont’d)

Subtyping

Type1 subtype Type2

if and only if, for every Value, Value matches Type1 implies Value matches Type2

e.g. element(*, PriceList) subtype element(xs:integer)

FLWOR Expressions

Expr ::= …previous expressions… | FLWRExpr
FLWRExpr ::= Clause+ return Expr
Clause ::= ForExpr | LetExpr | WhereExpr
ForExpr ::= for ForBinding (, ForBinding)*
LetExpr ::= let LetBinding (, LetBinding)*
WhereExpr ::= where Expr
ForBinding ::= $Var TypeDeclaration? PositionVar? in Expr
LetBinding ::= $Var TypeDeclaration? := Expr
TypeDeclaration ::= as SequenceType
PositionVar ::= at $Var
SequenceType ::= ItemType Occurrence

FLWOR Expressions (cont’d)

Normalization

A for/let clause with more than one binding (n > 1) is normalized by turning each binding into a separate nested for/let expression and normalizing the result:

[let LetBinding1, …, LetBindingn return Expr]Expr
    == [let LetBinding1 return … [let LetBindingn return Expr]Expr]Expr

A where clause is normalized into an if expression that returns the empty sequence if the condition is false, and the result is normalized in turn:

[where Expr0 return Expr1]Expr == [if (Expr0) then Expr1 else ()]Expr

FLWOR Expressions (cont’d)

for $i in $I, $j in $J
let $k := $i + $j
where $k >= 5
return ($i, $j)

is normalized to

for $i in $I return
    for $j in $J return
        let $k := $i + $j return
            if ($k >= 5) then ($i, $j)
            else ()
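As a rough analogy, and only an analogy, the two forms of the query can be written in Python: a single comprehension plays the role of the original FLWOR expression and explicit nested loops play the role of the normalized one. The function names are hypothetical; results are flat lists because XQuery sequences do not nest.

```python
def flwor(I, J):
    # for $i in $I, $j in $J let $k := $i + $j where $k >= 5 return ($i, $j)
    return [item
            for i in I
            for j in J
            for k in [i + j]          # "let" as a one-element iteration
            if k >= 5                 # "where"
            for item in (i, j)]       # ($i, $j) is flattened into the result

def normalized(I, J):
    result = []
    for i in I:                       # for $i in $I return
        for j in J:                   #   for $j in $J return
            k = i + j                 #     let $k := $i + $j return
            result.extend((i, j) if k >= 5 else ())   # if ... then ($i,$j) else ()
    return result
```

Both functions return the same list for the same inputs, which is the sense in which normalization preserves meaning.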

FLWOR Expressions (cont’d)

Factored types consist of an item type and an occurrence indicator

Result type = Prime ∙ Quantifier

e.g. ((xs:integer, xs:string) | xs:integer) * subtype (xs:integer | xs:string) *

prime(((xs:integer, xs:string) | xs:integer) *) = xs:integer | xs:string

quant(((xs:integer, xs:string) | xs:integer) *) = *

FLWOR Expressions (cont’d)

Factorization theorem

for all types we have

Type subtype prime (Type) ∙ quant (Type)

further if

Type subtype Prime ∙ Quantifier

then

prime (Type) subtype Prime and

quant (Type) ≤ Quantifier

1 ≤ ?, 1 ≤ +, ? ≤ *, + ≤ *
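The functions prime and quant can be computed by structural recursion on the type. The slides do not give their definitions, so the sketch below rests on assumptions: quantifiers are modelled as (minimum, maximum) pairs, the combination rules for sequence, choice and occurrence are a conservative approximation chosen here, and the type encoding is hypothetical.

```python
INF = float("inf")
QUANT = {"1": (1, 1), "?": (0, 1), "+": (1, INF), "*": (0, INF)}
NAME = {bounds: name for name, bounds in QUANT.items()}

def prime(t):
    """The set of item types occurring in t, read as a choice between them."""
    kind = t[0]
    if kind == "item":
        return frozenset([t[1]])
    if kind in ("empty", "none"):
        return frozenset()
    if kind in ("seq", "choice"):
        return prime(t[1]) | prime(t[2])
    if kind == "occ":
        return prime(t[1])
    raise ValueError(f"unknown type: {t!r}")

def quant(t):
    """A (min, max) bound on how many items a value of type t may contain."""
    kind = t[0]
    if kind in ("item", "none"):
        return QUANT["1"]
    if kind == "empty":
        return QUANT["?"]
    (a, x), (b, y) = quant(t[1]), QUANT[t[2]] if kind == "occ" else quant(t[2])
    if kind == "seq":                 # at least one item if either part needs one
        return (max(a, b), INF)
    if kind == "choice":              # the weaker lower bound, the larger upper bound
        return (min(a, b), max(x, y))
    if kind == "occ":                 # repetition multiplies the bounds
        return (a * b, 1 if x == 1 and y == 1 else INF)
    raise ValueError(f"unknown type: {t!r}")

def quant_le(q1, q2):
    # q1 ≤ q2 when q2 allows at least every count that q1 allows.
    return q2[0] <= q1[0] and q1[1] <= q2[1]
```

Applied to ((xs:integer, xs:string) | xs:integer) *, this gives the prime type xs:integer | xs:string and the quantifier *, in agreement with the example, and quant_le reproduces the ordering 1 ≤ ?, 1 ≤ +, ? ≤ *, + ≤ *.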

Path Expressions

[QName]Path == child::QName

[book/isbn]Path == child::book/child::isbn

[.]Path == self:: node()

[..]Path == parent:: node()

[Expr1//Expr2]Path == [Expr1/descendant-or-self:: node()/Expr2]Path

Path Expressions (cont’d)

Expr ::= …previous expressions… | PathExpr
PathExpr ::= / | / RelativePathExpr | RelativePathExpr
RelativePathExpr ::= RelativePathExpr / StepExpr | StepExpr | RelativePathExpr // StepExpr
StepExpr ::= (ForwardStep | ReverseStep) Predicates
ForwardStep ::= ForwardAxis NodeTest
ReverseStep ::= ReverseAxis NodeTest
ForwardAxis ::= child:: | descendant:: | self:: | descendant-or-self::
ReverseAxis ::= parent::
Predicates ::= ( [ Expr ] )*
NodeTest ::= text() | node() | * | QName

Rule that relates normalization of expressions to normalization of path expressions:

[PathExpr]Expr == fs:distinct-docorder([PathExpr]Path)

Normalization of absolute path expressions:

[/]Path == fn:root($fs:dot)

[/RelativePathExpr]Path == [fn:root($fs:dot)/RelativePathExpr]Path

The built-in variable $fs:dot represents the context node. An absolute path expression refers to the root of the XML tree that contains the context node.

Path Expressions (cont’d) Normalization of “/”

[RelativePathExpr / StepExpr]path ==

let $fs:sequence := fs:distinct-docorder([RelativePathExpr]path) return

let $fs:last := fn:count($fs:sequence) return

for $fs:dot at $fs:position in $fs:sequence return

[StepExpr]path

This rule binds the variables $fs:sequence, $fs:last, $fs:dot and $fs:position to, respectively, the context sequence, the context size, the context node and the position of that node in the context sequence

Path Expressions (cont’d)

Normalization of step expressions:

[ForwardStep Predicates [Expr]]Path ==
    let $fs:sequence := [ForwardStep Predicates]Path return
    let $fs:last := fn:count($fs:sequence) return
    for $fs:dot at $fs:position in $fs:sequence return
    if ([Expr]Predicates) then $fs:dot else ()

A similar rule applies to ReverseStep, except that $fs:position is bound in reverse order.

Example (simplified): child::*[2]

    let $fs:sequence := child::* return
    let $fs:last := fn:count($fs:sequence) return
    for $fs:dot at $fs:position in $fs:sequence return
    if (fn:position() = 2) then $fs:dot else ()

Path Expressions (cont’d)

Predicate mapping

[Expr]Predicates ==

typeswitch([Expr]Expr)

case numeric $v return

op:numeric-equal(fn:round($v), $fs:position)

default $v return

fn:boolean($v)

Finally, axis mapping is straightforward

[ForwardAxis :: NodeTest]Path == ForwardAxis :: NodeTest

[ReverseAxis :: NodeTest]Path == ReverseAxis :: NodeTest
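The predicate mapping and the step normalization can be combined into a short sketch. It is an approximation under stated assumptions: Python's bool() is only a rough stand-in for fn:boolean, rounding half upward stands in for fn:round, and the predicate is passed as a Python function of the context item, its position and the context size (all names are hypothetical).

```python
import math

def predicate_holds(value, position):
    # case numeric: compare the rounded number with the context position
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return math.floor(value + 0.5) == position
    # default: fall back to the boolean value of the result
    return bool(value)

def apply_predicate(sequence, predicate):
    last = len(sequence)                          # plays the role of $fs:last
    return [dot                                   # then $fs:dot else ()
            for position, dot in enumerate(sequence, start=1)
            if predicate_holds(predicate(dot, position, last), position)]
```

A numeric predicate such as [2] therefore selects by position, while any other predicate acts as a filter, which is the distinction the typeswitch draws.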

Path Expressions (cont’d)

The path expression $input//a/b is normalized to

fs:distinct-docorder(
  let $fs:sequence := (
    fs:distinct-docorder(
      let $fs:sequence := $input return
      let $fs:last := fn:count($fs:sequence) return
      for $fs:dot at $fs:position in $fs:sequence return
        fs:distinct-docorder(
          let $fs:sequence := descendant-or-self::node() return
          let $fs:last := fn:count($fs:sequence) return
          for $fs:dot at $fs:position in $fs:sequence return
            child::a))
  ) return
  let $fs:last := fn:count($fs:sequence) return
  for $fs:dot at $fs:position in $fs:sequence return
    child::b)

Implicit Coercion and Function calls

XQuery can represent a Schema containing irregular data in the formal type notation

<xs:element name="article" type="Article"/>
<xs:complexType name="Article">
  <xs:sequence>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="reserve_price" type="PriceList" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
<xs:simpleType name="PriceList">
  <xs:list itemType="xs:decimal"/>
</xs:simpleType>

define element article of type Article
define type Article {
    element(name, xs:string),
    element(reserve_price, PriceList) *
}
define type PriceList { xs:decimal * }

Implicit Coercion and Function calls (cont’d)

• An arithmetic expression is well defined on any item sequence that can be coerced to zero or one atomic value. For example, $article/reserve_price is a sequence of zero or more reserve_price elements.

• A comparison is well defined on any item sequence that can be coerced to a sequence of atomic values. For example, in $article/reserve_price < 100 the typed content of $article/reserve_price is automatically extracted.

• XPath’s predicate expressions are well defined on any item sequence. For example, $article[reserve_price] returns each node in $article that has at least one reserve_price child.

Implicit Coercion and Function calls (cont’d)

First coercion
• Applied to expressions that require a boolean value
• Maps the Expr argument to a core expression and applies fn:boolean to the result

[if (Expr0) then Expr1 else Expr2]Expr ==
    if (fn:boolean([Expr0]Expr)) then [Expr1]Expr
    else [Expr2]Expr

Implicit Coercion and Function calls (cont’d)

Second coercion
• Applied to an expression when it is used in a context that requires a sequence of atomic values
• Maps the Expr argument to a core expression, then applies fn:data to the result

fn:data takes any item sequence, applies the following rules to each item and concatenates the results:

• If the item is an atomic value, it is returned
• Otherwise, the item is a node and its typed value is returned

Implicit Coercion and Function calls (cont’d)

Normalization rule for +

[Expr1 + Expr2]Expr ==
    let $v1 := fn:data([Expr1]Expr) return
    let $v2 := fn:data([Expr2]Expr) return
    fs:plus($v1, $v2)

Normalization rule for <

[Expr1 < Expr2]Expr ==
    some $v1 in fn:data([Expr1]Expr) satisfies
    some $v2 in fn:data([Expr2]Expr) satisfies
    fs:less-than($v1, $v2)

Example: $article/reserve_price + 10.00, where $article/reserve_price is
• <reserve_price/>: returns ()
• <reserve_price>10.00</reserve_price>: returns 20.00
• (<reserve_price>10.00</reserve_price>, <reserve_price>20.00 25.00</reserve_price>): raises a type error, because the atomized value is a sequence of decimals
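The two normalization rules can be mimicked with a small sketch of atomization. Assumptions: a node is reduced to a name plus its typed value, Python's TypeError stands in for the type error, and the Python functions data, plus and less_than are only illustrative counterparts of fn:data, fs:plus and fs:less-than.

```python
from dataclasses import dataclass, field
from decimal import Decimal

@dataclass
class Node:
    name: str
    typed_value: list = field(default_factory=list)

def data(items):
    """Atomize: atomic values are kept, nodes contribute their typed value."""
    atoms = []
    for item in items:
        if isinstance(item, Node):
            atoms.extend(item.typed_value)
        else:
            atoms.append(item)
    return atoms

def plus(seq1, seq2):
    v1, v2 = data(seq1), data(seq2)
    if len(v1) > 1 or len(v2) > 1:
        raise TypeError("arithmetic needs zero or one atomic value per operand")
    if not v1 or not v2:
        return []                      # an empty operand gives the empty sequence
    return [v1[0] + v2[0]]

def less_than(seq1, seq2):
    # Existential: true if some pair of atomized values satisfies <.
    return any(a < b for a in data(seq1) for b in data(seq2))
```

The three cases of the example correspond to an empty typed value, a single decimal, and three decimals spread over two elements.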

Node Identity and Element Constructors

An element constructor creates new nodes with new identities, e.g.

let $name := <name>Red Bicycle</name>
return <article>{$name, $name}</article>

The store is a mapping from node identifiers to node values:

Item ::= NodeId | AtomicValue

dynEnv.store(NodeId ⇒ NodeValue)

Node Identity and Element Constructors (cont’d)

<article>
  <name>Red Bicycle</name>
  <start_date>1999-01-05</start_date>
  <end_date>1999-02-20</end_date>
  <reserve_price>40</reserve_price>
</article>

Store(
  N1 ⇒ element article of type Article {N2, N3, N4, N5},
  N2 ⇒ element name of type xs:string {"Red Bicycle"},
  N3 ⇒ element start_date of type xs:date {"1999-01-05"},
  N4 ⇒ element end_date of type xs:date {"1999-02-20"},
  N5 ⇒ element reserve_price of type xs:decimal {"40"}
)
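A store of this kind can be sketched as a dictionary from node identifiers to node values. The class below is illustrative only: the names Store, NodeId, new_element and construct are assumptions, and annotating freshly constructed elements with xs:anyType stands in for the state before validation. The point it demonstrates is that a constructor copies its content, so every constructed node gets a new identity.

```python
import itertools

class NodeId(int):
    """A node identifier, kept distinct from atomic integer values."""

class Store:
    def __init__(self):
        self.nodes = {}                      # NodeId -> (name, type name, content)
        self._next = itertools.count(1)

    def new_element(self, name, type_name, content):
        node_id = NodeId(next(self._next))
        self.nodes[node_id] = (name, type_name, list(content))
        return node_id

    def _copy_item(self, item):
        # Nodes are deep-copied under fresh identifiers, atomic values are kept.
        if isinstance(item, NodeId):
            name, type_name, content = self.nodes[item]
            return self.new_element(name, type_name,
                                    [self._copy_item(c) for c in content])
        return item

    def construct(self, name, content):
        """Element constructor: copy the content, then allocate the new element."""
        copied = [self._copy_item(item) for item in content]
        return self.new_element(name, "xs:anyType", copied)
```

Constructing an article from the content ($name, $name) therefore yields two name children that are equal in value, but distinct from each other and from the original $name node.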

Node Identity and Element Constructors (cont’d)

Evaluation affects the store:

Store0 ; dynEnv ⊢ Expr ⇒ Value ; Store1

Each new store computed through a given judgment is passed as input to the next judgment

Most rules treat the store implicitly; written out in full, the (LET) rule becomes

    Store0 ; dynEnv ⊢ Expr0 ⇒ Value0 ; Store1
    Store1 ; dynEnv + varValue(Var ⇒ Value0) ⊢ Expr1 ⇒ Value1 ; Store2
    ---------------------------------------------------------------------
    Store0 ; dynEnv ⊢ let $Var := Expr0 return Expr1 ⇒ Value1 ; Store2

Node Identity and Element Constructors (cont’d)

Validation

i. Evaluate the expression to yield a value
ii. Erase all type information in the value
iii. Construct the untyped element node
iv. Validate the node to yield the final typed value

[Figure: a schema and an untyped document are the inputs to validation; in the result, all nodes have an associated type annotation]

Node Identity and Element Constructors (cont’d)

Static semantics and element construction

The static type system performs a conservative analysis that catches errors early, during static analysis rather than dynamic evaluation, e.g. if the element is declared to have type xs:integer, then the type of its content must be xs:integer