Painless OO XML with XML::Pastor

Painless OO <-> XML with XML::Pastor Joel Bernstein - LPW 2008

An introduction to XML::Pastor, comparison with other modules etc

Transcript of Painless OO XML with XML::Pastor

Page 1: Painless OO XML with XML::Pastor

Painless OO <-> XML with XML::Pastor

Joel Bernstein - LPW 2008

Page 2: Painless OO XML with XML::Pastor

It’s all Greek to me

schema (pl. schemata)

σχήμα (skhēma)

shape, plan

Page 3: Painless OO XML with XML::Pastor

I do not like XML

People use it wrong

• Apple Property Lists

• Tag soup

• Data transfer format vs data storage format

Page 4: Painless OO XML with XML::Pastor

How many of you?

• Use XML

• Hate XML

• Like XML

Page 5: Painless OO XML with XML::Pastor

Do you write XML

• By hand?

• Programmatically?

• Schemata?

• Validation?

• Transformation?

Page 6: Painless OO XML with XML::Pastor

XML::Pastor is for all of you.

Page 7: Painless OO XML with XML::Pastor

XML is hard, right?

Some hard things:

• Roundtripping data

• Manipulating XML via DOM API

• Preserving element sibling order, comments, XML entities etc.

Page 8: Painless OO XML with XML::Pastor

Solution

Tools should make both the syntax and the details of the manipulation of XML invisible

Page 9: Painless OO XML with XML::Pastor

XML::Pastor

• I didn’t write it

• Written by Ayhan Ulusoy

• Available on CPAN

• Abstracts away some of the pain of XML

Page 10: Painless OO XML with XML::Pastor

What does it do?

• Generates Perl code from W3C XML Schema (XSD)

• Roundtrip and validate XML to/from Perl without loss of schema information

• Lets you program without caring about XML structure

Page 11: Painless OO XML with XML::Pastor

Parsing with Pastor

• Parse entire XML into XML::LibXML::DOM object

• Convert XML DOM tree into native Perl objects

• Throw away DOM, no longer needed
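
The three steps above can be sketched in a few lines. This is a language-neutral illustration in Python, not XML::Pastor's Perl API: the `Album` class, the `album_from_xml` helper and the sample document are all invented for the example, and Python's standard `xml.etree.ElementTree` stands in for the XML::LibXML DOM.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical album document, invented for this sketch.
ALBUM_XML = """
<album title="Kind of Blue" artist="Miles Davis">
  <track>So What</track>
  <track>Freddie Freeloader</track>
</album>
"""


@dataclass
class Album:
    title: str
    artist: str
    tracks: list = field(default_factory=list)


def album_from_xml(text: str) -> Album:
    # Step 1: parse the whole document into a tree (the "DOM").
    root = ET.fromstring(text.strip())
    # Step 2: copy the data out of the tree into a plain object.
    album = Album(
        title=root.get("title"),
        artist=root.get("artist"),
        tracks=[t.text for t in root.findall("track")],
    )
    # Step 3: the tree is not stored anywhere, so it can be garbage
    # collected once this function returns.
    return album


album = album_from_xml(ALBUM_XML)
print(album.title, len(album.tracks))
```

After the call, the caller only ever sees `album.title` or `album.tracks`; nothing in the calling code refers to elements or attributes.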

Page 12: Painless OO XML with XML::Pastor

Reasons to not use XML::Pastor

• When you have no XML Schema

• Although several tools can infer XML schemata from documents

• It’s a code-generator

• No stream parsing

Page 13: Painless OO XML with XML::Pastor

XML::Pastor Code Generation

• Write out static code to tree of .pm files

• Write out static code to single .pm file

• Create code in a scalar in memory

• Create code and eval() it for use
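
To make the difference between these output modes concrete, here is a minimal sketch of a toy code generator in Python. It is an analogy only: the `MODEL` dictionary, `generate_source` and the output file name are assumptions made up for the example, and the real generator works from a parsed XSD rather than a hand-written dictionary.

```python
import os
import tempfile

# Hypothetical "schema model": class name -> field names.
MODEL = {"Album": ["title", "artist"], "Track": ["name", "length"]}


def generate_source(class_name, fields):
    """Return Python source text for one generated class."""
    args = ", ".join(fields)
    lines = [f"class {class_name}:", f"    def __init__(self, {args}):"]
    lines += [f"        self.{name} = {name}" for name in fields]
    return "\n".join(lines) + "\n"


# Keep the generated code in memory as a single string.
source = "\n".join(generate_source(n, f) for n, f in MODEL.items())

# Evaluate the generated code so the classes are usable immediately.
namespace = {}
exec(source, namespace)
album = namespace["Album"]("Kind of Blue", "Miles Davis")

# Write the same code to a file so it can be imported later.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "generated.py")
with open(out_path, "w") as handle:
    handle.write(source)

print(album.title, os.path.exists(out_path))
```

Writing files suits a build step that runs once per schema change; evaluating in memory suits scripts that would rather pay the generation cost at startup than manage generated files.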

Page 14: Painless OO XML with XML::Pastor

Warning, boring bit

Page 15: Painless OO XML with XML::Pastor

How Pastor works

Code generation

• Parse schemata into schema model

• Perl data structures containing all the global elements, types, attributes, ...

• “Resolve” Model - determine class names, resolve references, etc

• Create boilerplate code, write out / eval

Page 16: Painless OO XML with XML::Pastor

How Pastor works

Code Generation pt. 2

Page 17: Painless OO XML with XML::Pastor

How Pastor works

Generated classes

• Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model

• If the class isa SimpleType it may contain restriction facets

• If the class isa ComplexType it will contain info about child elements and attributes
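
The idea of class data carrying the schema model can be sketched as follows. This is a Python analogy under stated assumptions, not the generated Perl: the `schema_type` attribute, the `CountryCode` type and the `is_valid` method are hypothetical names, and only two XSD facets (`maxLength` and `pattern`) are handled.

```python
import re


class SimpleType:
    # Class-level schema metadata, shared by all instances.
    # Subclasses override it with their own restriction facets.
    schema_type = {"facets": {}}

    def __init__(self, value):
        self.value = value

    def is_valid(self):
        facets = self.schema_type["facets"]
        if "maxLength" in facets and len(self.value) > facets["maxLength"]:
            return False
        if "pattern" in facets and not re.fullmatch(facets["pattern"], self.value):
            return False
        return True


class CountryCode(SimpleType):
    # Hypothetical type: exactly two lowercase letters.
    schema_type = {"facets": {"maxLength": 2, "pattern": "[a-z]{2}"}}


print(CountryCode("fr").is_valid(), CountryCode("FRA").is_valid())
```

Because the facets live on the class rather than on each object, an object can be checked against its schema type at any time without going back to the XSD file.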

Page 18: Painless OO XML with XML::Pastor

How Pastor works

In use

• If the classes were generated offline, “use” them; if they were generated online, they are already loaded

• These classes have methods to create, retrieve, save object to/from XML

• Manipulate/query data using OO API to complexType fields

• Validate modified objects against schema

Page 19: Painless OO XML with XML::Pastor

Very simple Album XML demo

Page 20: Painless OO XML with XML::Pastor

Album XML document

Page 21: Painless OO XML with XML::Pastor

Album XML schema

Page 22: Painless OO XML with XML::Pastor

Pastorize creates Perl classes from Album XML schema:

Resulting code tree like:

Page 23: Painless OO XML with XML::Pastor

Roundtrip and modify XML data using Pastor:

Page 24: Painless OO XML with XML::Pastor

The result!

Page 25: Painless OO XML with XML::Pastor

Real world Pastor

Page 26: Painless OO XML with XML::Pastor

Moose::Role for Pastor

Page 27: Painless OO XML with XML::Pastor

Country XML

Page 28: Painless OO XML with XML::Pastor

Dynamic XML::Pastor usage

Page 29: Painless OO XML with XML::Pastor

Query the Country object

Page 30: Painless OO XML with XML::Pastor

Modify elements and attributes with uniform syntax

Page 31: Painless OO XML with XML::Pastor

NodeArray syntax

Page 32: Painless OO XML with XML::Pastor

Create new City data and combine with existing Country object

Page 33: Painless OO XML with XML::Pastor

Validate modified data against the stored schema

Page 34: Painless OO XML with XML::Pastor

Turn Pastor objects back into XML, or transform to XML::LibXML DOM

Page 35: Painless OO XML with XML::Pastor

Simple D::HA object

Page 36: Painless OO XML with XML::Pastor

Rekeying data

Page 37: Painless OO XML with XML::Pastor

Rekeying data deeper

Page 38: Painless OO XML with XML::Pastor

XML::Pastor Scope

• Good for “data XML”

• Unsuitable for “mixed markup”

• e.g. XHTML

• Unsuitable for “huge” documents

Page 39: Painless OO XML with XML::Pastor

XML::Pastor Supported XML Schema Features

• Simple and Complex Types

• Global Elements

• Groups, Attributes, AttributeGroups

• Derive simpleTypes by extension

• Derive complexTypes by restriction

• W3C built-in Types, Unions, Lists

• (Most) Restriction Facets for Simple types

• External Schema import, include, redefine

Page 40: Painless OO XML with XML::Pastor

XML::Pastor known limitations

• Mixed elements unsupported

• Substitution groups unsupported

• ‘any’ and ‘anyAttribute’ elements unsupported

• Encodings (only UTF-8 officially supported)

• Default values for attributes - help needed

Page 41: Painless OO XML with XML::Pastor

XML Data Binding

• Binding XML documents to objects specifically designed for the data in those documents

• Allows e.g. data-centric applications to manipulate data more naturally than by using DOM API
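
A small contrast may help here. The sketch below computes the same order total twice, once by walking the parsed tree and once through bound objects. It is written in Python as a language-neutral illustration: the order document, the `Order` and `Line` classes and their methods are invented for the example and are not what XML::Pastor generates.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sales order, invented for this sketch.
ORDER_XML = """<order number="1234">
  <customer>Gallagher Industries</customer>
  <line quantity="12" price="16.50">Widget</line>
  <line quantity="3" price="2.00">Gadget</line>
</order>"""

# Tree-style access: the calling code has to know the document structure.
root = ET.fromstring(ORDER_XML)
dom_total = sum(
    int(line.get("quantity")) * float(line.get("price"))
    for line in root.findall("line")
)


# Bound objects: the structure is handled once, in the binding layer.
@dataclass
class Line:
    description: str
    quantity: int
    price: float


@dataclass
class Order:
    number: str
    customer: str
    lines: list

    @classmethod
    def from_xml(cls, text):
        node = ET.fromstring(text)
        lines = [
            Line(item.text, int(item.get("quantity")), float(item.get("price")))
            for item in node.findall("line")
        ]
        return cls(node.get("number"), node.findtext("customer"), lines)

    def total(self):
        return sum(item.quantity * item.price for item in self.lines)


order = Order.from_xml(ORDER_XML)
print(order.customer, order.total(), dom_total)
```

Both totals agree, but only the first version breaks if an attribute later becomes a child element; with binding, that change is absorbed inside `from_xml`.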

Page 42: Painless OO XML with XML::Pastor

Sales Order XML

Page 43: Painless OO XML with XML::Pastor

Sales Order XML Logical data model

XML DOM

Page 44: Painless OO XML with XML::Pastor

XML DOM

Page 45: Painless OO XML with XML::Pastor

How this makes me feel:

Page 46: Painless OO XML with XML::Pastor

Other XML modules

• XML::Twig

• XML::Compile

• XML::Simple

• XML::Smart

Page 47: Painless OO XML with XML::Pastor

XML::Twig

• Manipulates XML directly

• Calling code is closely coupled to document structure

• Optimised for processing huge documents as trees

• No schemata, no validation

Page 48: Painless OO XML with XML::Pastor

XML::Compile

• Original design rationale is to deal with SOAP envelopes and WSDL documents

• Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures

• More like XML::Simple with Schema support

Page 49: Painless OO XML with XML::Pastor

XML::Compile pt. 2

• Schema support incomplete

• Shaky support for imports, includes

• Include restriction on targetNamespace

• I haven’t used it yet but it looks good

Page 50: Painless OO XML with XML::Pastor

XML::Simple

• Working roundtrip binding for simple cases

• e.g. XMLout(XMLin($file)) works

• Simple API

• Produces single deep data structure

• Gotchas with element multiplicity
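
The multiplicity gotcha is easy to reproduce with any schema-less converter. The sketch below is a deliberately naive Python converter, not XML::Simple itself; `naive_to_dict` is a hypothetical helper that folds repeated child elements into a list only when it actually sees a repeat.

```python
import xml.etree.ElementTree as ET


def naive_to_dict(element):
    """Convert an element to nested dicts, folding repeats into lists."""
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = naive_to_dict(child)
        if child.tag in result:
            # Second and later occurrences: promote the entry to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


one = naive_to_dict(ET.fromstring("<album><track>A</track></album>"))
two = naive_to_dict(ET.fromstring("<album><track>A</track><track>B</track></album>"))

print(type(one["track"]).__name__, type(two["track"]).__name__)
```

The same key holds a string for a one-track album and a list for a two-track album, so calling code has to check the type every time. XML::Simple's `ForceArray` option exists to work around this; a schema-driven tool knows the multiplicity up front.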

Page 51: Painless OO XML with XML::Pastor

XML::Simple pt. 2

• No schemata, no validation

• Can be teamed with a SAX parser

• More suitable for configuration files?

Page 52: Painless OO XML with XML::Pastor

XML::Smart

• Similar implementation to XML::Pastor

• Uses tie() and lots of crac^H^H^H^Hmagic

• Gathers structure information from XML instance, rather than schema

• No code generation!

Page 53: Painless OO XML with XML::Pastor

XML::Smart pt. 2

• No schemata, so no schema validation

• Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB

• Like Pastor, overloads array/hashref access to the data - promotes decoupling

• Reasonable docs, some community growing

Page 54: Painless OO XML with XML::Pastor

Any questions?

Page 55: Painless OO XML with XML::Pastor

Thanks for coming

See you next year

Page 56: Painless OO XML with XML::Pastor

Bonus Material

If we have enough time

Page 57: Painless OO XML with XML::Pastor

XML Schema Inference

• Create an XML schema from an XML document instance

• Every document has an (implicit) schema

• Tools like Relaxer and Trang, as well as System.Xml.Serializer in the .NET Framework, can all infer XML Schemata from document instances
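
As a rough illustration of what "inferring" means, the sketch below records, for each element name in a document, which attributes were seen and how many times each child appeared under one parent. It is a toy in Python and makes no claim about how Relaxer, Trang or the .NET tooling work; `infer_structure` and the sample country document are assumptions for the example, and the output is a plain dictionary rather than real XSD.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_structure(root):
    """Map each element name to its observed attributes and child counts."""
    schema = defaultdict(lambda: {"attributes": set(), "children": {}})

    def visit(element):
        entry = schema[element.tag]
        entry["attributes"].update(element.attrib)
        counts = defaultdict(int)
        for child in element:
            counts[child.tag] += 1
            visit(child)
        for tag, count in counts.items():
            # Remember the largest number of repeats seen under one parent.
            entry["children"][tag] = max(entry["children"].get(tag, 0), count)

    visit(root)
    return dict(schema)


# Hypothetical document instance, invented for this sketch.
doc = ET.fromstring(
    '<country code="fr"><name>France</name>'
    '<city code="par"/><city code="lyo"/></country>'
)
inferred = infer_structure(doc)
print(sorted(inferred["country"]["children"].items()))
```

A child count above one suggests an unbounded `maxOccurs`, and a count of one suggests a single element, but a single instance can only ever give a lower bound on what the real schema allows.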

Page 58: Painless OO XML with XML::Pastor

Schema diff