Comparing XML Files with a DOGMA Ontology to Generate Ω-RIDL Annotations
Nadejda Alkhaldi and Christophe Debruyne
16/10/11
Introduction
An ontology is a [formal,] explicit specification of a [shared] conceptualization (Gruber)
Autonomously developed and maintained information systems commit to the ontology, which is a mostly manual activity.
How can we automate (a part of) this process?
Method: overview
First we need an ontology.
– We used the DOGMA method for ontology engineering
– The development of the ontology is reported elsewhere in Debruyne et al. (WEBIST 2011)
Semi-automatically annotate the data
– Match concepts in the (structure of) the data to the ontology
– Generate an Ω-RIDL commitment file
– Review of the mappings by a representative of the information system
Method: DOGMA
DOGMA Ontology Descriptions <Λ, ci, K>
– Λ: a lexon base, a finite set of plausible binary fact types called lexons <γ, t1, r1, r2, t2>, e.g., <Vendor Community, Offer, has, is of, Title>
– ci: a partial function mapping context-identifiers and terms to concepts
– K: a finite set of ontological commitments containing
  – A selection of lexons
  – A mapping from application symbols to ontology terms
  – Predicates over those terms and roles to express constraints
γ ∈ Γ: context-identifiers, pointers to a community
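A lexon <γ, t1, r1, r2, t2> can be pictured as a small immutable record. The following is only an illustrative sketch: the class name `Lexon`, its field names and the helper `terms` are assumptions made for this example and are not part of DOGMA or of any DOGMA tool.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Lexon:
    """A binary fact type <context, head term, role, co-role, tail term>."""
    context: str   # γ, a context-identifier pointing to a community
    head: str      # t1
    role: str      # r1
    co_role: str   # r2
    tail: str      # t2

    def read_forward(self) -> str:
        """Read the fact type from head term to tail term."""
        return f"{self.head} {self.role} {self.tail}"

    def read_backward(self) -> str:
        """Read the fact type from tail term to head term."""
        return f"{self.tail} {self.co_role} {self.head}"


def terms(lexon_base: FrozenSet[Lexon]) -> Set[str]:
    """Collect every term that occurs in the lexon base (hypothetical helper)."""
    return {t for lx in lexon_base for t in (lx.head, lx.tail)}


# The example lexon from the slide.
offer_title = Lexon("Vendor Community", "Offer", "has", "is of", "Title")
lexon_base = frozenset({offer_title})

print(offer_title.read_forward())   # Offer has Title
print(offer_title.read_backward())  # Title is of Offer
```

Reading the lexon in both directions shows why two roles are stored: each lexon is a fact type that can be verbalized from either term.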
Method: DOGMA
Example of a commitment
Ω-RIDL: Verheyden et al. (SWDB 2004), Trog et al. (RuleML 2007)
Method: (Semi)-Automatic Annotation
First … related work?
– Annotation techniques: AeroDAML, SHOE Knowledge Annotator, S-CREAM, MnM, Armadillo, KIM, SemTag, Ontea.
– Ontology and schema matching techniques: CUPID, iMAP, oMAP, H-Match
– Looking at different aspects and reusing ideas that might be usable
Method: (Semi)-Automatic Annotation
Some considerations
– The ontology contains explicit relations between concepts, the XML does not
– XML tags can be matched to concepts of the ontology, but the content of a tag can also represent a concept. E.g., <facility type="bar"> should be mapped onto the concept Bar and not onto Facility, of which Bar is a subtype.
– No XML Schema to rely on!
– Spelling mistakes/language variations
Method: (Semi)-Automatic Annotation
1) Element match
– Match tag and attribute names using string metrics
2) Linguistic match
– Match tag and attribute names using an external thesaurus (e.g., WordNet or a domain-specific thesaurus)
3) Content match
– Match the content of a tag (with respect to the tag) to identify the concept represented by the content
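The element and content matches can be illustrated with a generic string metric. This is a minimal sketch, not the tool's implementation: it assumes the similarity ratio from Python's `difflib` as the string metric, a hypothetical list `ONTOLOGY_TERMS`, and the `<facility type="bar">` example from the considerations. The linguistic match is left out because it needs an external thesaurus.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Hypothetical ontology terms, chosen only for illustration.
ONTOLOGY_TERMS = ["Facility", "Bar", "Region", "City"]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_term(label: str, terms=ONTOLOGY_TERMS):
    """Return the (term, score) pair with the highest similarity to label."""
    return max(((t, similarity(label, t)) for t in terms), key=lambda p: p[1])


element = ET.fromstring('<facility type="bar">Example bar</facility>')

# Element match: compare the tag name with the ontology terms.
tag_term, tag_score = best_term(element.tag)

# Content match: compare the attribute value with the ontology terms.
content_term, content_score = best_term(element.get("type"))

print(tag_term, content_term)  # Facility Bar
```

The tag alone points to Facility, while the attribute value points to the more specific Bar, which is the case the content match is meant to catch.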
Method: (Semi)-Automatic Annotation
4) Structural match
– Adjust the previously computed weighted means by looking at the structure of both the ontology graph and the XML tree.
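The slides do not spell out the adjustment rule, so the sketch below uses an assumed one purely for illustration: a candidate mapping is boosted when an ancestor element is already confidently mapped to a term that shares a lexon with the candidate term. The names `LEXON_PAIRS`, `refine`, `boost` and `min_anchor`, as well as all scores, are hypothetical.

```python
# Hypothetical inputs: lexons reduced to (head term, tail term) pairs, and
# weighted-mean scores keyed by (XML path, ontology term).
LEXON_PAIRS = {("Region", "City"), ("Holiday Package", "Ski Area")}

CITY = "/countries/country/regions/region/cities/city"
REGION = "/countries/country/regions/region"

scores = {
    (REGION, "Region"): 0.9,
    (CITY, "City"): 0.5,
    (CITY, "Code"): 0.5,
}


def related(a: str, b: str) -> bool:
    """True if some lexon connects the two terms, in either direction."""
    return (a, b) in LEXON_PAIRS or (b, a) in LEXON_PAIRS


def ancestors(path: str):
    """Yield the proper ancestor paths of an XML path, nearest first."""
    parts = path.strip("/").split("/")
    for i in range(len(parts) - 1, 0, -1):
        yield "/" + "/".join(parts[:i])


def refine(scores: dict, boost: float = 0.25, min_anchor: float = 0.8) -> dict:
    """Raise a score when a confidently mapped ancestor maps to a related term."""
    refined = {}
    for (path, term), score in scores.items():
        supported = any(
            anc == other_path
            and other_score >= min_anchor
            and related(term, other_term)
            for anc in ancestors(path)
            for (other_path, other_term), other_score in scores.items()
        )
        refined[(path, term)] = min(1.0, score + boost) if supported else score
    return refined


refined = refine(scores)
print(refined[(CITY, "City")], refined[(CITY, "Code")])  # 0.75 0.5
```

Under this assumed rule, the two candidates for the city element start with equal scores, and only the one whose term is related to the parent's term (City, related to Region) is raised.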
Method: (Semi)-Automatic Annotation
To summarize:
– Using an XML file and a DOGMA ontology
– A series of mapping scores is calculated based on element, linguistic and content match
– Those scores are then refined using the structural match
– The refined scores are then compared against a threshold to produce the Ω-RIDL mappings.
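The scoring and thresholding steps summarized above can be sketched as follows. The weights, the threshold and the input scores are illustrative assumptions, not the parameters used in the experiment, and the structural refinement is left out to keep the example short. The output format imitates the `map … on …` lines shown in the generated mappings.

```python
def weighted_mean(element: float, linguistic: float, content: float,
                  weights=(0.4, 0.3, 0.3)) -> float:
    """Combine the three match scores; the weights are illustrative only."""
    total = sum(weights)
    combined = (weights[0] * element
                + weights[1] * linguistic
                + weights[2] * content)
    return combined / total


def emit_mappings(scores: dict, threshold: float = 0.7) -> list:
    """Turn (path, term) -> score entries at or above the threshold into mapping lines."""
    return [
        f'map "{path}" on {term}.'
        for (path, term), score in sorted(scores.items())
        if score >= threshold
    ]


# Hypothetical candidate mappings with made-up match scores.
candidates = {
    ("/countries/country/regions/region", "Region"): weighted_mean(1.0, 1.0, 0.5),
    ("/countries/country/regions/region", "Code"): weighted_mean(0.2, 0.0, 0.1),
}

for line in emit_mappings(candidates):
    print(line)
```

Lowering the threshold lets weaker candidates through, which is one way the "nonsense" mappings mentioned in the conclusions can arise.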
Method: Summary
– The user can then use the generated mappings to get an idea of how their application can commit to the ontology, and then decide how to do so.
Tool
Experiment
Data of the COMDRIVE RFP project
– Holiday packages in the winter sports domain
Ontology developed in several iterations in the project
– Bootstrapping of the ontology
– Meeting with vendor experts
– Meeting with consumer experts
Experiment
Some generated mappings
– map "/countries/country/sumary/code" on Code identifies / identified by Commodity.
– map "/countries/country/regions/region" on Region.
– map "/countries/country/regions/region" on Ski Area destination of / with destination Holiday Package.
– map "/countries/country/regions/region/cities/city" on City.
– …
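The left-hand side of each mapping is a slash-separated element path. Since there is no XML Schema to rely on, such paths have to be collected from the data itself. A minimal sketch of one way to do this with Python's standard `xml.etree.ElementTree` follows; the sample document is made up for the example (only the tag names are taken from the paths above) and the function name `element_paths` is an assumption.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; only the tag names come from the mappings.
SAMPLE = """
<countries>
  <country>
    <regions>
      <region>
        <cities><city>Example City</city><city>Another City</city></cities>
      </region>
    </regions>
  </country>
</countries>
"""


def element_paths(xml_text: str) -> list:
    """Return the distinct slash-separated element paths, in document order."""
    seen = []

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        if path not in seen:
            seen.append(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return seen


paths = element_paths(SAMPLE)
for p in paths:
    print(p)
```

Repeated elements such as the two `city` entries collapse into a single path, so each path is matched against the ontology once.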
Conclusions
The four heuristics were able to tackle the considerations mentioned.
The algorithm depends on a good choice of parameters; otherwise a lot of "nonsense" mappings are generated.
The structural match needs to be revisited to cope with more complicated cases such as:
– map "/countries/country/regions/region/summary/description" on Description of / has RFP.
Appropriate for suggesting mappings to the user (needs testing)
Future work
Revision of the structural match
Integration with tool suite (e.g., Business Semantics Studio)
Additional testing
Questions?