Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Comparing XML Files with a DOGMA Ontology to Generate Ω-RIDL Annotations. Nadejda Alkhaldi and Christophe Debruyne, 16/10/11

description

Presentation of Alkhaldi, N., Debruyne, C. (2011) Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations. In Proc. of On the Move to Meaningful Internet Systems 2011: OTM Workshops - Semantic and Decision Support (SeDeS 2011), LNCS, Springer - October 2011

Abstract: To facilitate the process of annotating data in the DOGMA ontology-engineering framework, we present a method and tool for semi-automatic annotation of XML data using an ontology. XML elements are compared against concepts and their interrelations in the ontology using various metrics at different levels (lexical level, semantic level, structural level, etc.). The results of these metrics are then used to propose to the user a series of annotations from XML elements to concepts in the ontology, which are then validated by that user. Those annotations - expressed in Ω-RIDL - are then used to transform data from one format into another format. In this paper, we demonstrate our approach on XML data containing vendor offers in the tourism domain, more precisely holiday packages.

Transcript of Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Page 1: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Comparing XML Files with a DOGMA Ontology to Generate Ω-RIDL Annotations

Nadejda Alkhaldi and Christophe Debruyne

16/10/11

Page 2: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Introduction

Page 3: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Introduction

  An ontology is a [formal,] explicit specification of a [shared] conceptualization (Gruber)

  Autonomously developed and maintained information systems commit to the ontology, a mostly manual activity.

  How can we automate (a part of) this process?

Page 4: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: overview

  First we need an ontology.
–  We used the DOGMA method for ontology engineering
–  The development of the ontology is reported elsewhere in Debruyne et al. (WEBIST 2011)

  Semi-automatically annotate the data
–  Match concepts in the (structure of the) data to the ontology
–  Generate an Ω-RIDL commitment file
–  Review of the mappings by a representative of the information system

Page 5: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: DOGMA

  DOGMA Ontology Descriptions <Λ, ci, K>

–  Λ: a lexon base, a finite set of plausible binary fact types called lexons <γ, t1, r1, r2, t2>, e.g., <Vendor Community, Offer, has, is of, Title>

–  ci: a partial function mapping context-identifiers and terms to concepts

–  K: a finite set of ontological commitments, each containing
   – a selection of lexons
   – a mapping from application symbols to ontology terms
   – predicates over those terms and roles to express constraints

γ ∈ Γ: context-identifiers, pointers to a community
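
The definitions on this slide can be made concrete with a small data-structure sketch. It only illustrates the shape of a lexon and of the triple <Λ, ci, K>; the class names, field names and the `concept:Offer` identifier are assumptions made for illustration and are not part of the DOGMA tooling.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Lexon:
    """A binary fact type <γ, t1, r1, r2, t2>."""
    context: str   # γ, a context-identifier pointing to a community
    term1: str     # t1
    role: str      # r1
    co_role: str   # r2
    term2: str     # t2


@dataclass
class OntologyDescription:
    """Illustrative container for <Λ, ci, K> (hypothetical, not a DOGMA API)."""
    lexon_base: FrozenSet[Lexon]                                   # Λ
    concept_of: Dict[Tuple[str, str], str] = field(default_factory=dict)
    commitments: List[object] = field(default_factory=list)        # K, left opaque

    def ci(self, context: str, term: str) -> Optional[str]:
        # ci is a partial function: unknown (context, term) pairs give None.
        return self.concept_of.get((context, term))


# The lexon used as the example on the slide.
offer_title = Lexon("Vendor Community", "Offer", "has", "is of", "Title")

ontology = OntologyDescription(
    lexon_base=frozenset({offer_title}),
    concept_of={("Vendor Community", "Offer"): "concept:Offer"},  # made-up id
)
```

Calling `ontology.ci("Vendor Community", "Title")` returns `None` here, which mirrors the partiality of ci: not every term in every context has to be mapped to a concept.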

Page 6: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: DOGMA

  Example of a commitment

Ω-RIDL: Verheyden et al. (SWDB 2004), Trog et al. (RuleML 2007)

Page 7: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: (Semi)-Automatic Annotation

  First … related work?
–  Annotation techniques: AeroDAML, SHOE Knowledge Annotator, S-CREAM, MnM, Armadillo, KIM, SemTag, Ontea.
–  Ontology and schema matching techniques: CUPID, iMAP, oMAP, H-Match
–  Looking at different aspects and reusing ideas that might be usable

Page 8: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: (Semi)-Automatic Annotation

Page 9: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: (Semi)-Automatic Annotation

  Some considerations
–  The ontology contains explicit relations between concepts, the XML does not
–  XML tags can be matched to concepts of the ontology, but the content of a tag can also represent a concept. E.g., <facility type="bar"> should be mapped onto the concept of Bar and not onto Facility, of which Bar is a subtype.
–  No XML Schema to rely on!
–  Spelling mistakes/language variations
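
The facility example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's implementation: the two-concept ontology fragment, the `SUBTYPE_OF` table and the function name are hypothetical, and matching is by exact lower-cased label only.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical ontology fragment: Bar is a subtype of Facility.
CONCEPTS = {"Facility", "Bar"}
SUBTYPE_OF = {"Bar": "Facility"}


def most_specific_concept(element: ET.Element) -> Optional[str]:
    """Prefer a concept named by the element's content over its tag name."""
    by_label = {c.lower(): c for c in CONCEPTS}
    tag_concept = by_label.get(element.tag.lower())

    # Content candidates: attribute values first, then the text, if any.
    values = list(element.attrib.values())
    if element.text and element.text.strip():
        values.append(element.text)

    for value in values:
        content_concept = by_label.get(value.strip().lower())
        # Accept the content only if it specialises the tag's concept.
        if (content_concept and tag_concept
                and SUBTYPE_OF.get(content_concept) == tag_concept):
            return content_concept
    return tag_concept


facility = ET.fromstring('<facility type="bar"/>')
chosen = most_specific_concept(facility)
```

With no schema available, the attribute value is the only hint that the element denotes a Bar, which is why the content is inspected before falling back to the tag name.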

Page 10: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: (Semi)-Automatic Annotation

  1) Element match
–  Match tag and attribute names using string metrics

  2) Linguistic match
–  Match tag and attribute names using an external thesaurus (e.g., WordNet or a domain-specific thesaurus)

  3) Content match
–  Match the content of a tag (with respect to the tag) to identify the concept represented by the content
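
A rough sketch of the first two matches follows. The slides do not say which string metrics or thesaurus interface the tool uses, so both are assumptions here: the standard library's `difflib.SequenceMatcher` stands in for the string metric, and a tiny hand-written dictionary stands in for WordNet or a domain thesaurus.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for WordNet or a domain-specific thesaurus.
THESAURUS = {"town": {"city"}, "area": {"region"}}


def element_match(name: str, term: str) -> float:
    """String similarity in [0, 1] between an XML name and an ontology term."""
    return SequenceMatcher(None, name.lower(), term.lower()).ratio()


def linguistic_match(name: str, term: str) -> float:
    """1.0 if the words are equal or listed as synonyms, else 0.0."""
    a, b = name.lower(), term.lower()
    if a == b or b in THESAURUS.get(a, set()) or a in THESAURUS.get(b, set()):
        return 1.0
    return 0.0


# A misspelled tag still scores high on the string metric,
# while a synonym is only caught by the thesaurus lookup.
misspelled = element_match("regoin", "Region")
synonym = linguistic_match("town", "City")
```

The two scores complement each other: the string metric tolerates spelling mistakes, whereas the thesaurus covers language variations that share no characters at all.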

Page 11: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: (Semi)-Automatic Annotation

  4) Structural match
–  Adjust the previously computed weighted means by looking at the structure of both the ontology graph and the XML tree.
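
The slide does not spell out how the adjustment is computed, so the following is only one plausible reading, written as a sketch: a candidate's score is raised when the concept matched to the parent XML element is connected to the candidate concept in the ontology. The `bonus` parameter, the `related` set of concept pairs and the function names are all assumptions.

```python
from typing import Dict, Set, Tuple

Scores = Dict[Tuple[str, str], float]  # (xml path, concept) -> score


def parent_path(path: str) -> str:
    """'/a/b/c' -> '/a/b'; the parent of a root path is the empty string."""
    return path.rsplit("/", 1)[0]


def structural_adjust(scores: Scores,
                      related: Set[Tuple[str, str]],
                      bonus: float = 0.1) -> Scores:
    """Raise a score when the parent element's best concept is linked to
    the candidate concept in the ontology (an assumed heuristic)."""
    # Best-scoring concept for every XML path.
    best: Dict[str, str] = {}
    for (path, concept), score in scores.items():
        if path not in best or score > scores[(path, best[path])]:
            best[path] = concept

    adjusted: Scores = {}
    for (path, concept), score in scores.items():
        parent_concept = best.get(parent_path(path))
        linked = ((parent_concept, concept) in related
                  or (concept, parent_concept) in related)
        adjusted[(path, concept)] = min(1.0, score + bonus) if linked else score
    return adjusted


# Made-up scores: two concepts tie for the same element until structure is used.
scores = {
    ("/countries/country", "Country"): 0.9,
    ("/countries/country/regions", "Region"): 0.6,
    ("/countries/country/regions", "Religion"): 0.6,
}
refined = structural_adjust(scores, related={("Region", "Country")})
```

In this toy input the tie between Region and Religion is broken in favour of Region, because only Region is related to the concept matched for the parent element.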

Page 12: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: (Semi)-Automatic Annotation

  To summarize:
–  Using an XML file and a DOGMA ontology
–  A series of mapping scores is calculated based on the element, linguistic and content matches
–  Those scores are then refined using the structural match
–  The refined scores are then compared against a threshold to produce the Ω-RIDL mappings.
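
The scoring and thresholding steps can be sketched as follows. The weights, the threshold value and the output string format are assumptions chosen for illustration; the output only imitates the "map … on …" lines shown on the experiment slide and is not claimed to be valid Ω-RIDL.

```python
from typing import Dict, List, Tuple

# Assumed weights for the element, linguistic and content scores.
WEIGHTS = {"element": 0.5, "linguistic": 0.3, "content": 0.2}


def weighted_mean(parts: Dict[str, float]) -> float:
    """Weighted mean over whichever of the three scores are available."""
    total = sum(WEIGHTS[name] for name in parts)
    return sum(WEIGHTS[name] * value for name, value in parts.items()) / total


def propose_mappings(refined: Dict[Tuple[str, str], float],
                     threshold: float = 0.75) -> List[str]:
    """Keep candidates whose refined score reaches the threshold."""
    kept = sorted((path, concept)
                  for (path, concept), score in refined.items()
                  if score >= threshold)
    return [f'map "{path}" on {concept}.' for path, concept in kept]


# Made-up refined scores for one XML path and two candidate concepts.
refined_scores = {
    ("/countries/country/regions/region", "Region"): 0.9,
    ("/countries/country/regions/region", "Religion"): 0.4,
}
proposals = propose_mappings(refined_scores)
```

The threshold is the parameter that decides how many candidates reach the user: set too low, many implausible mappings pass; set too high, correct ones are dropped before review.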

Page 13: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Method: Summary

–  The user can then use the generated mappings to get an idea of how their application can commit to the ontology, and then decide how to do so.

Page 14: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Tool

Page 15: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Experiment

  Data of the COMDRIVE RFP project
–  Holiday Packages in the winter sports domain

Page 16: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Experiment

  Ontology developed in several iterations in the project
–  Bootstrapping of the ontology
–  Meeting with vendor experts
–  Meeting with consumer experts

Page 17: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Experiment

Page 18: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Experiment

  Some generated mappings
–  map “/countries/country/sumary/code” on Code identifies / identified by Commodity.
–  map “/countries/country/regions/region” on Region.
–  map “/countries/country/regions/region” on Ski Area destination of / with destination Holiday Package.
–  map “/countries/country/regions/region/cities/city” on City.
–  …
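
The left-hand sides of these mappings are slash-separated element paths. As a sketch of how such paths could be collected from a document when no XML Schema is available, the snippet below walks the tree with the standard library. The sample document is invented for illustration and is not the project's data; the function name is likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def element_paths(xml_text: str) -> List[str]:
    """Return the distinct slash-separated tag paths, in document order."""
    root = ET.fromstring(xml_text)
    seen: List[str] = []

    def walk(node: ET.Element, prefix: str) -> None:
        path = f"{prefix}/{node.tag}"
        if path not in seen:
            seen.append(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return seen


# Invented sample with the same nesting as the paths on the slide.
SAMPLE = """
<countries>
  <country>
    <regions>
      <region><cities><city>Example</city></cities></region>
      <region><cities><city>Another</city></cities></region>
    </regions>
  </country>
</countries>
"""

paths = element_paths(SAMPLE)
```

Repeated elements such as the two `region` entries collapse into a single path, so each path stands for a kind of element rather than an individual instance.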

Page 19: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Conclusions

  The four heuristics were able to tackle the considerations mentioned.

  The algorithm depends on a good choice of parameters; otherwise, a lot of “nonsense” mappings are generated.

  The structural match needs to be revisited to cope with more complicated cases such as:
–  map “/countries/country/regions/region/summary/description” on Description of / has RFP.

  Appropriate for suggesting mappings to the user (needs testing)

Page 20: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Future work

  Revision of the structural match

  Integration with tool suite (e.g., Business Semantics Studio)

  Additional testing

Page 21: Comparing XML Files with a DOGMA Ontology to Generate Omega-RIDL Annotations.

Questions?
