Painless OO XML with XML::Pastor - 2009 Remix

Painless OO <-> XML with XML::Pastor (2009 remix) Joel Bernstein YAPC::EU 2009

description

How to build Perl classes with roundtrip data binding to XML, painlessly, using W3C XML Schema and XML::Pastor.

Slides from a previous revision of this talk are online at: http://www.slideshare.net/joelbernstein/painless-oo-xml-with-xmlpastorq-presentation/

I will be presenting an expanded, more practical, 2009 version of this talk. Now with more code and less theory!

- XML is hard, right? Some things which are hard.
- XML data binding
- Comparisons of modules
- XML::Twig
- XML::Smart
- XML::Simple
- XML::Pastor
- Pastor howto
- XML schema inference
- Trang, Relaxer
- Relaxer howto
- The future?

For more information on XML::Pastor see: http://search.cpan.org/~aulusoy/XML-Pastor/

Relaxer download: http://www.relaxer.jp/download/relaxer-1.0.zip

Relaxer book (Japanese...): http://www.amazon.co.jp/exec/obidos/ASIN/4894715279/

Trang: http://www.thaiopensource.com/download/trang-20030619.zip

Transcript of Painless OO XML with XML::Pastor - 2009 Remix

Page 1: Painless OO XML with XML::Pastor - 2009 Remix

Painless OO <-> XML with XML::Pastor

(2009 remix)

Joel Bernstein

YAPC::EU 2009

Page 2: Painless OO XML with XML::Pastor - 2009 Remix

It’s all Greek to me

schema

σχήμα (skhēma)

shape, plan

Page 3: Painless OO XML with XML::Pastor - 2009 Remix

How many of you?

Page 4: Painless OO XML with XML::Pastor - 2009 Remix

How many of you?

• Use XML

Page 5: Painless OO XML with XML::Pastor - 2009 Remix

How many of you?

• Use XML

• Hate XML

Page 6: Painless OO XML with XML::Pastor - 2009 Remix

How many of you?

• Use XML

• Hate XML

• Like XML

Page 7: Painless OO XML with XML::Pastor - 2009 Remix

A Confession

• I do not like XML

• People use it wrong

Page 8: Painless OO XML with XML::Pastor - 2009 Remix

XML Data Binding

• Binding XML documents to objects specifically designed for the data in those documents.

• I often have to do this.

Page 9: Painless OO XML with XML::Pastor - 2009 Remix

XML is hard, right?

Some hard things:

• Roundtripping data

• Manipulating XML via DOM API

• Preserving element sibling order, comments, XML entities etc.

Page 10: Painless OO XML with XML::Pastor - 2009 Remix

Typical horrendous XML document

Page 11: Painless OO XML with XML::Pastor - 2009 Remix

Sales Order XML Logical data model

XML DOM

Page 12: Painless OO XML with XML::Pastor - 2009 Remix

I shouldn’t need to care about this

Page 13: Painless OO XML with XML::Pastor - 2009 Remix

How this makes me feel:

Page 14: Painless OO XML with XML::Pastor - 2009 Remix

Fundamental problem

• I don’t think in elements and attributes

• I think about my data, not how it’s stored

• This is Perl. DWIM.

Page 15: Painless OO XML with XML::Pastor - 2009 Remix

Solution

Tools should make both the syntax and the details of the manipulation of XML invisible

Page 16: Painless OO XML with XML::Pastor - 2009 Remix

Do you write XML

Page 17: Painless OO XML with XML::Pastor - 2009 Remix

Do you write XML

• By hand?

Page 18: Painless OO XML with XML::Pastor - 2009 Remix

Do you write XML

• By hand?

• Programmatically?

Page 19: Painless OO XML with XML::Pastor - 2009 Remix

Do you write XML

• By hand?

• Programmatically?

• Schemata?

Page 20: Painless OO XML with XML::Pastor - 2009 Remix

Do you write XML

• By hand?

• Programmatically?

• Schemata?

• Validation?

Page 21: Painless OO XML with XML::Pastor - 2009 Remix

Do you write XML

• By hand?

• Programmatically?

• Schemata?

• Validation?

• Transformation?

Page 22: Painless OO XML with XML::Pastor - 2009 Remix

XML::Pastor is for all of you.

Page 23: Painless OO XML with XML::Pastor - 2009 Remix

XML::Pastor

• Available on CPAN

• Abstracts away some of the pain of XML

• Ayhan Ulusoy is the author

• I am just a user

Page 24: Painless OO XML with XML::Pastor - 2009 Remix

What does it do?

• Generates Perl code from W3C XML Schema (XSD)

• Roundtrip and validate XML to/from Perl without loss of schema information

• Lets you program without caring about XML structure

Page 25: Painless OO XML with XML::Pastor - 2009 Remix

pastorize

• Automates codegen process

• Conceptually similar to DBIx::Class::Schema::Loader

• TMTOWTDI - offline or runtime

• Works on multiple XSDs (caveat, collisions)

Page 26: Painless OO XML with XML::Pastor - 2009 Remix

pastorize in use

pastorize --mode offline --style multiple \
    --destination /tmp/lib/perl \
    --class_prefix MyApp::Data \
    /some/path/to/schema.xsd

Page 27: Painless OO XML with XML::Pastor - 2009 Remix

Very simple contrived Album XML demo

Page 28: Painless OO XML with XML::Pastor - 2009 Remix

Album XML document

Page 29: Painless OO XML with XML::Pastor - 2009 Remix

Album XML schema

Page 30: Painless OO XML with XML::Pastor - 2009 Remix

Pastorize the Album XML schema:

Resulting code tree like:

Page 31: Painless OO XML with XML::Pastor - 2009 Remix

Modify some XML

Page 32: Painless OO XML with XML::Pastor - 2009 Remix

Roundtrip and modify XML data using Pastor:

# Load XML

# Accessors

# Modify

# Write XML
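
The code from this slide did not survive in the transcript. The four steps can be sketched roughly as follows. This is a minimal sketch and not the speaker's code: the Album schema, its namespace and the `MyApp::Data::` prefix are made up for illustration, and the calls (`generate` in `eval` mode, `from_xml_string`, `to_xml_string`) are the ones described in the XML::Pastor documentation as I understand it.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::Pastor;

# Hypothetical Album schema, invented for this sketch (not the one on the slide).
my $xsd = <<'XSD';
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://www.example.com/album"
           targetNamespace="http://www.example.com/album"
           elementFormDefault="qualified">
  <xs:element name="album" type="Album"/>
  <xs:complexType name="Album">
    <xs:sequence>
      <xs:element name="artist" type="xs:string"/>
      <xs:element name="title"  type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="year" type="xs:int"/>
  </xs:complexType>
</xs:schema>
XSD

# Write the schema to a temporary file so generate() has a path to read.
my $dir      = tempdir(CLEANUP => 1);
my $xsd_file = "$dir/album.xsd";
open my $out, '>', $xsd_file or die "Cannot write $xsd_file: $!";
print {$out} $xsd;
close $out or die "Cannot close $xsd_file: $!";

# Runtime generation: classes are created and eval()ed, nothing is written to disk.
XML::Pastor->new->generate(
    mode         => 'eval',
    schema       => $xsd_file,
    class_prefix => 'MyApp::Data::',
);

my $xml_in = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<album xmlns="http://www.example.com/album" year="1977">
  <artist>Television</artist>
  <title>Marquee Moon</title>
</album>
XML

# Load XML
my $album = MyApp::Data::album->from_xml_string($xml_in);

# Accessors
my $old_title = $album->title;

# Modify (element and attribute use the same accessor syntax)
$album->title('Adventure');
$album->year(1978);

# Write XML
my $xml_out = $album->to_xml_string;
print $xml_out;
```

With classes generated offline by pastorize, the `generate` call would be replaced by a plain `use` of the generated modules.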

Page 33: Painless OO XML with XML::Pastor - 2009 Remix

The result!

Page 34: Painless OO XML with XML::Pastor - 2009 Remix

Real world Pastor

Page 35: Painless OO XML with XML::Pastor - 2009 Remix

Real world Pastor

$HASH1 = { 1 => 'Vodafone UK', 2 => 'O2 UK', 3 => 'Orange UK', 4 => 'T-Mobile UK', 8 => 'Hutchinson 3 UK'};

Page 36: Painless OO XML with XML::Pastor - 2009 Remix

Country XML

Page 37: Painless OO XML with XML::Pastor - 2009 Remix

Dynamic schema parsing of Country XML

Page 38: Painless OO XML with XML::Pastor - 2009 Remix

Query the Country object

Page 39: Painless OO XML with XML::Pastor - 2009 Remix

Modify elements and attributes with uniform syntax

Page 40: Painless OO XML with XML::Pastor - 2009 Remix

Manipulate array-like data

Page 41: Painless OO XML with XML::Pastor - 2009 Remix

Create new City data and combine with existing Country object

Page 42: Painless OO XML with XML::Pastor - 2009 Remix

Validate modified data against the stored schema

Page 43: Painless OO XML with XML::Pastor - 2009 Remix

Turn Pastor objects back into XML, or transform to XML::LibXML DOM
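
The Country slides were images and their code is not in the transcript. A rough sketch of the same sequence (runtime schema parsing, query, modify, validate, serialise) might look like this. The schema, namespace and class prefix are assumptions made up for the example, and the method names follow the XML::Pastor documentation as I understand it rather than the slides.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::Pastor;

# Hypothetical Country schema, invented for this sketch.
my $xsd = <<'XSD';
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://www.example.com/country"
           targetNamespace="http://www.example.com/country"
           elementFormDefault="qualified">
  <xs:element name="country" type="Country"/>
  <xs:complexType name="Country">
    <xs:sequence>
      <xs:element name="city" type="City" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="code" type="xs:string" use="required"/>
    <xs:attribute name="name" type="xs:string" use="required"/>
  </xs:complexType>
  <xs:complexType name="City">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="code" type="xs:string" use="required"/>
  </xs:complexType>
</xs:schema>
XSD

my $dir      = tempdir(CLEANUP => 1);
my $xsd_file = "$dir/country.xsd";
open my $out, '>', $xsd_file or die "Cannot write $xsd_file: $!";
print {$out} $xsd;
close $out or die "Cannot close $xsd_file: $!";

# Dynamic schema parsing: build the classes at runtime.
XML::Pastor->new->generate(
    mode         => 'eval',
    schema       => $xsd_file,
    class_prefix => 'MyApp::Data::',
);

my $country = MyApp::Data::country->from_xml_string(<<'XML');
<?xml version="1.0" encoding="UTF-8"?>
<country xmlns="http://www.example.com/country" code="FR" name="France">
  <city code="PAR"><name>Paris</name></city>
  <city code="LYO"><name>Lyon</name></city>
</country>
XML

# Query: a repeated element comes back as an array-like object.
my @cities = @{ $country->city };
print $country->name, ' has ', scalar(@cities), " cities\n";

# Modify: an attribute (country name) and an element (city name), same syntax.
$country->name('French Republic');
$cities[1]->name('Lyons');

# Validate the modified object against the schema stored in the classes.
print "still valid\n" if $country->is_xml_valid;

# Serialise: either to an XML::LibXML DOM node or straight to a string.
my $dom = $country->to_xml_dom;
my $xml = $country->to_xml_string;
print $xml;
```

`is_xml_valid` returns a boolean; the documentation also describes `xml_validate`, which dies on failure instead.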

Page 44: Painless OO XML with XML::Pastor - 2009 Remix

Parsing with Pastor

• Parse entire XML into XML::LibXML::DOM object

• Convert XML DOM tree into native Perl objects

• Throw away DOM, no longer needed
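
The three steps above can be illustrated by hand with XML::LibXML and an ordinary Perl class. This is only a sketch of the idea, not Pastor's implementation: the `Album` class, the `album_from_xml` helper and the document layout are hypothetical.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical native Perl class that the data ends up in.
package Album {
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub artist { return $_[0]{artist} }
    sub tracks { return @{ $_[0]{tracks} } }
}

sub album_from_xml {
    my ($xml) = @_;

    # Step 1: parse the entire document into a DOM tree.
    my $dom  = XML::LibXML->load_xml(string => $xml);
    my $root = $dom->documentElement;

    # Step 2: copy the data out of the DOM into a native Perl object.
    my $album = Album->new(
        artist => $root->getAttribute('artist'),
        tracks => [ map { $_->textContent } $root->getChildrenByTagName('track') ],
    );

    # Step 3: $dom goes out of scope on return, so the DOM is discarded.
    return $album;
}

my $album = album_from_xml(
    '<album artist="Television"><track>See No Evil</track><track>Venus</track></album>'
);
print $album->artist, ': ', join(', ', $album->tracks), "\n";
```

Because the whole document is parsed before conversion, memory use grows with document size, which is why stream parsing and "huge" documents are listed as poor fits.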

Page 45: Painless OO XML with XML::Pastor - 2009 Remix

Reasons to not use XML::Pastor

• When you have no XML Schema

• Although several tools can infer XML schemata from documents

• It’s a code-generator

• No stream parsing

Page 46: Painless OO XML with XML::Pastor - 2009 Remix

XML::Pastor Scope

• Good for “data XML”

• Unsuitable for “mixed markup”

• e.g. XHTML

• Unsuitable for “huge” documents

Page 47: Painless OO XML with XML::Pastor - 2009 Remix

XML::Pastor known limitations

• Mixed elements unsupported

• Substitution groups unsupported

• ‘any’ and ‘anyAttribute’ elements unsupported

• Encodings (only UTF-8 officially supported)

• Default values for attributes - help needed

Page 48: Painless OO XML with XML::Pastor - 2009 Remix

Other XML modules

• XML::Twig

• XML::Compile

• XML::Simple

• XML::Smart

Page 49: Painless OO XML with XML::Pastor - 2009 Remix

XML::Twig

• Manipulates XML directly

• Using code is coupled closely to document structure

• Optimised for processing huge documents as trees

• No schemata, no validation

Page 50: Painless OO XML with XML::Pastor - 2009 Remix

XML::Compile

• Original design rationale is to deal with SOAP envelopes and WSDL documents

• Different approach but similar goals to Pastor - processes XML based on XSD into Perl data structures

• More like XML::Simple with Schema support

Page 51: Painless OO XML with XML::Pastor - 2009 Remix

XML::Compile pt. 2

• Schema support incomplete

• Shaky support for imports, includes

• Include restriction on targetNamespace

• I haven’t used it yet but it looks good

Page 52: Painless OO XML with XML::Pastor - 2009 Remix

XML::Simple

• Working roundtrip binding for simple cases

• e.g. XMLout(XMLin($file)) works

• Simple API

• Produces single deep data structure

• Gotchas with element multiplicity
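
The multiplicity gotcha is easy to reproduce. In this small sketch the `album` and `track` element names are invented; `XMLin` and the `ForceArray` option are standard XML::Simple.

```perl
use strict;
use warnings;
use XML::Simple;

my $one_track  = '<album><track>Intro</track></album>';
my $two_tracks = '<album><track>Intro</track><track>Outro</track></album>';

# Default behaviour: the shape of the data depends on how many
# <track> elements the document happens to contain.
my $single = XMLin($one_track);
my $double = XMLin($two_tracks);

printf "one track:   %s\n", ref($single->{track}) ? 'array ref' : 'plain string';
printf "two tracks:  %s\n", ref($double->{track}) ? 'array ref' : 'plain string';

# ForceArray makes the named element an array ref regardless of count.
my $forced = XMLin($one_track, ForceArray => ['track']);
printf "forced:      %s\n", ref($forced->{track}) ? 'array ref' : 'plain string';

# Prints:
#   one track:   plain string
#   two tracks:  array ref
#   forced:      array ref
```

Code that walks `$data->{track}` has to handle both shapes unless `ForceArray` is set. A schema-driven tool knows the multiplicity up front and avoids the problem.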

Page 53: Painless OO XML with XML::Pastor - 2009 Remix

XML::Simple pt. 2

• No schemata, no validation

• Can be teamed with a SAX parser

• More suitable for configuration files?

Page 54: Painless OO XML with XML::Pastor - 2009 Remix

XML::Smart

• Similar implementation to XML::Pastor

• Uses tie() and lots of crac^H^H^H^Hmagic

• Gathers structure information from XML instance, rather than schema

• No code generation!

Page 55: Painless OO XML with XML::Pastor - 2009 Remix

XML::Smart pt. 2

• No schemata, so no schema validation

• Based on Object::MultiType - overloaded objects as HASH, ARRAY, SCALAR, CODE & GLOB

• Like Pastor, overloads array/hashref access to the data - promotes decoupling

• Reasonable docs, some community growing

Page 56: Painless OO XML with XML::Pastor - 2009 Remix

Any questions?

Page 57: Painless OO XML with XML::Pastor - 2009 Remix

Thanks for coming

See you next year

http://search.cpan.org/dist/XML-Pastor/

Page 58: Painless OO XML with XML::Pastor - 2009 Remix

Bonus Material

If we have enough time

Page 59: Painless OO XML with XML::Pastor - 2009 Remix

XML::Pastor Supported XML Schema Features

• Simple and Complex Types

• Global Elements

• Groups, Attributes, AttributeGroups

• Derive simpleTypes by extension

• Derive complexTypes by restriction

• W3C built-in Types, Unions, Lists

• (Most) Restriction Facets for Simple types

• External Schema import, include, redefine

Page 60: Painless OO XML with XML::Pastor - 2009 Remix

XML Schema Inference

• Create an XML schema from an XML document instance

• Every document has an (implicit) schema

• Tools like Relaxer and Trang, as well as System.Xml.Serializer in the .NET Framework, can all infer XML Schemata from document instances

Page 61: Painless OO XML with XML::Pastor - 2009 Remix

Simple Data::HashArray (D::HA) object

Page 62: Painless OO XML with XML::Pastor - 2009 Remix

Rekeying data

Page 63: Painless OO XML with XML::Pastor - 2009 Remix

Rekeying data deeper

Page 64: Painless OO XML with XML::Pastor - 2009 Remix

Warning, boring bit

Page 65: Painless OO XML with XML::Pastor - 2009 Remix

XML::Pastor Code Generation

• Write out static code to tree of .pm files

• Write out static code to single .pm file

• Create code in a scalar in memory

• Create code and eval() it for use

Page 66: Painless OO XML with XML::Pastor - 2009 Remix

How Pastor works

Code generation

• Parse schemata into schema model

• Perl data structures containing all the global elements, types, attributes, ...

• “Resolve” Model - determine class names, resolve references, etc

• Create boilerplate code, write out / eval
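
The "model in, boilerplate out, eval" pipeline can be imitated in a few lines of core Perl. This is a toy illustration of the technique and not Pastor's actual generator: the `%model` structure, the `class_source` helper and the `MyApp::Toy::Album` class name are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical, hand-written "schema model": class name => field names.
# Pastor builds its (much richer) model by parsing the XSD.
my %model = (
    'MyApp::Toy::Album' => [qw(artist title year)],
);

# Turn one model entry into Perl source code held in a scalar.
sub class_source {
    my ($class, $fields) = @_;
    my $src = "package $class;\n";
    $src .= 'sub new { my ($class, %args) = @_; return bless {%args}, $class }' . "\n";
    for my $field (@$fields) {
        # One combined getter/setter per field.
        $src .= sprintf(
            'sub %1$s { my $self = shift; $self->{%1$s} = shift if @_; return $self->{%1$s} }' . "\n",
            $field,
        );
    }
    return $src . "1;\n";
}

# "Write out / eval": here the source is eval()ed; it could equally be
# printed to a .pm file for offline use.
for my $class (sort keys %model) {
    my $src = class_source($class, $model{$class});
    eval $src or die "Code generation failed for $class: $@";
}

my $album = MyApp::Toy::Album->new(artist => 'Television', title => 'Marquee Moon');
$album->year(1977);
print join(' / ', $album->artist, $album->title, $album->year), "\n";
# Prints: Television / Marquee Moon / 1977
```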

Page 67: Painless OO XML with XML::Pastor - 2009 Remix

How Pastor works

Generated classes

• Each generated class (i.e. type) has classdata “XmlSchemaType” containing schema model

• If the class isa SimpleType it may contain restriction facets

• If the class isa ComplexType it will contain info about child elements and attributes
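
To make the idea of class data concrete, here is a toy simple-type class that carries its own restriction facets and validates against them. It is an assumption-laden sketch: the `XmlSchemaType` name comes from the slide, but the hash layout, the `MyApp::Toy::Type::Year` class and the `is_valid` method are invented here and are much simpler than what Pastor generates.

```perl
use strict;
use warnings;

package MyApp::Toy::Type::Year {
    # Class data: one copy per class, shared by every instance.
    # (Hypothetical layout; Pastor stores a schema-model object instead.)
    my %schema_type = (
        base   => 'int',
        facets => { minInclusive => 1900, maxInclusive => 2100 },
    );

    sub XmlSchemaType { return \%schema_type }

    sub new   { my ($class, $value) = @_; return bless { value => $value }, $class }
    sub value { return $_[0]{value} }

    # Check the held value against the facets kept in the class data.
    sub is_valid {
        my ($self) = @_;
        my $facets = $self->XmlSchemaType->{facets};
        my $value  = $self->value;
        return 0 unless defined $value && $value =~ /\A-?\d+\z/;
        return 0 if $value < $facets->{minInclusive};
        return 0 if $value > $facets->{maxInclusive};
        return 1;
    }
}

for my $candidate (1977, 1500, 'soon') {
    my $year = MyApp::Toy::Type::Year->new($candidate);
    printf "%-5s %s\n", $candidate, $year->is_valid ? 'valid' : 'invalid';
}
# Prints:
#   1977  valid
#   1500  invalid
#   soon  invalid
```

Keeping the schema information on the class is what lets an object be validated later without the original XSD file at hand.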

Page 68: Painless OO XML with XML::Pastor - 2009 Remix

How Pastor works

In use

• If classes generated offline, then “use” them, if online then they are already loaded

• These classes have methods to create, retrieve, save object to/from XML

• Manipulate/query data using OO API to complexType fields

• Validate modified objects against schema

Page 69: Painless OO XML with XML::Pastor - 2009 Remix

Thanks for coming

See you next year

http://search.cpan.org/dist/XML-Pastor/