Ontology Learning Μπαλάφα Κάσσυ Πλασταρά Κατερίνα.

Date posted: 21-Dec-2015

Ontology Learning

Μπαλάφα Κάσσυ

Πλασταρά Κατερίνα

Contents

Introduction – Ontologies, Ontology learning
Technical description
Ontology learning in the Semantic Information description
Ontology Learning – Process
Ontology Learning – Architecture
Ontology Learning data sources
Methods used in ontology learning
Tools of ontology learning
Uses of ontology learning

Ontologies

Provide a formal, explicit specification of a shared conceptualization of a domain that can be communicated between people and heterogeneous, widely spread application systems.

They have been developed in Artificial Intelligence and Machine Learning to facilitate knowledge sharing and reuse.

Unlike knowledge bases, ontologies have “all in one”:

formal or machine-readable representation
full and explicitly described vocabulary
full model of some domain
consensus knowledge: common understanding of a domain
easy to share and reuse

Ontology learning - General

Machine learning of ontologies

Main task: to automatically learn complicated domain ontologies

Explores techniques for applying knowledge discovery to different data sources (HTML documents, dictionaries, free text, legacy ontologies etc.) in order to support the task of engineering and maintaining ontologies


Ontology learning – Technical description

The manual building of ontologies is a tedious task, which can easily result in a knowledge acquisition bottleneck. In addition, human expert modeling by hand is biased, error-prone and expensive

Fully automatic machine knowledge acquisition remains in the distant future

Most systems are semi-automatic and require human (expert) intervention and balanced cooperative modeling for constructing ontologies


Semantic Information Integration

Ontology Engineering


Ontology learning – Process (1/2)

Ontology learning – Process (2/2)

Stages analysis:

Merging existing structures or defining mapping rules between these structures allows importing and reusing existing ontologies

Ontology extraction models major parts of the target ontology, with learning support fed from various input sources

The target ontology’s rough outline, which results from import, reuse and extraction is pruned to better fit the ontology to its primary purpose

Ontology refinement profits from the pruned ontology but completes the ontology at a fine granularity (in contrast to extraction)

The target application serves as a measure for validating the resulting ontology

The ontology engineer can begin this cycle again, for example to include new domains in the constructed ontology or to maintain and update its scope


Ontology learning – Architecture (1/5)

Ontology learning – Architecture (2/5)

Ontology Engineering Workbench: A sophisticated means for manual modeling and refining of the final ontology. The ontology engineer can browse the resulting ontology from the ontology learning process and decide to follow, delete or modify the proposals as the task requires.

Ontology learning – Architecture (3/5)

Management component: The ontology engineer uses the management component to select input data, that is, relevant resources such as HTML and XML documents, DTDs, databases or existing ontologies that the discovery process can further exploit. Then, using the management component, the engineer chooses from a set of resource-processing methods available in the resource-processing component and from a set of algorithms available in the algorithm library.

Ontology learning – Architecture (4/5)

Resource processing component: Depending on the available data, the engineer can choose various strategies for resource processing:

Index and reduce HTML documents to free text

Transform semi-structured documents such as dictionaries into a predefined relational structure

Handle semi-structured and structured schema data by following different strategies for import

Process free natural text

After first preprocessing the data according to one of these or similar strategies, the resource processing module transforms the data into an algorithm-specific relational representation.
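The first strategy, reducing HTML documents to free text, could look roughly like the following. This is only a minimal sketch using Python's standard `html.parser` module; the names `TextExtractor` and `html_to_text` are made up for illustration and do not come from any ontology learning tool.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the content of script/style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce an HTML document to a single free-text string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

The resulting string can then be handed to whatever free-text processing the engineer has selected.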

Ontology learning – Architecture (5/5)

Algorithm library: A collection of various algorithms that work on the ontology definition and the preprocessed input data. Although specific algorithms can vary greatly from one type of input to the next, a considerable overlap exists for underlying learning approaches such as association rules, formal concept analysis or clustering.


Ontology Learning from Natural Language

Natural language texts exhibit morphological, syntactic, semantic, pragmatic and conceptual constraints that interact in order to convey a particular meaning to the reader. Thus, the text transports information to the reader and the reader embeds this information into his background knowledge

Through the understanding of the text, data is associated with conceptual structures and new conceptual structures are learned from the interacting constraints given through language

Tools that learn ontologies from natural language exploit the interacting constraints on the various language levels (from morphology to pragmatics and background knowledge) in order to discover new concepts and stipulate relationships between concepts

Ontology Learning from Semi-structured Data

HTML data, XML data, XML DTDs, XML Schemata and the like add more or less expressive semantic information to documents

A number of approaches understand ontologies as a common generalizing level that may communicate between the various data types and data descriptions. Ontologies play a major role for allowing semantic access to these vast resources of semi-structured data

Learning of ontologies from these data and data descriptions may considerably advance the application of ontologies and, thus, facilitate access to these data

Ontology Learning from Structured Data

The learning of ontologies from metadata, such as database schemata, in order to derive a common high-level abstraction of underlying data descriptions can be an important precondition for data warehousing or intelligent information agents


Methods for learning ontologies (1/8)

Clustering

The elaboration of any clustering method involves the definition of two main elements: a distance metric and a classification algorithm

A workbench that supports the development of conceptual clustering methods for the (semi-) automatic construction of ontologies of a conceptual hierarchy type from parsed corpora is the Mo’K workbench

Methods for learning ontologies (2/8)

Clustering

Ontologies are organized as multiple hierarchies that form an acyclic graph where nodes are term categories described by intension and links represent inclusion.

Learning through hierarchical classification of a set of objects can be performed in two main ways: top-down, by incremental specialization of classes, and bottom-up, by incremental generalization
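The two elements named above, a distance metric and a classification algorithm, can be illustrated with a small bottom-up example. The sketch below is not the Mo’K method: it assumes each term is described by the set of contexts it occurs in, uses the Jaccard distance as the metric, and merges the two closest clusters at each step by comparing their pooled context sets. All function and variable names are illustrative.

```python
def jaccard_distance(a, b):
    """Distance metric: 1 - |a & b| / |a | b| for two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def bottom_up_clusters(features, max_distance):
    """Repeatedly merge the two closest clusters (incremental generalization).

    features: dict mapping a term to the set of contexts it occurs in.
    Returns the list of merges, each as (members_a, members_b, distance).
    """
    # Each cluster is (frozenset of terms, pooled features of those terms).
    clusters = [(frozenset([t]), set(f)) for t, f in sorted(features.items())]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = jaccard_distance(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # nothing left that is close enough to generalize
        (terms_i, feats_i), (terms_j, feats_j) = clusters[i], clusters[j]
        merges.append((terms_i, terms_j, d))
        merged = (terms_i | terms_j, feats_i | feats_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


features = {
    "car": {"drive", "park", "repair"},
    "truck": {"drive", "park", "load"},
    "apple": {"eat", "peel"},
    "pear": {"eat", "peel", "pick"},
}
merges = bottom_up_clusters(features, max_distance=0.6)
```

With this toy input, "apple" and "pear" are merged first, then "car" and "truck"; the two resulting groups share no contexts and stay separate. The merge history read from first to last gives the hierarchy from the leaves upward.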

Methods for learning ontologies (3/8)

Information Extraction Rules

Methods for learning ontologies (4/8)

Information Extraction Rules

We start with:

An initial hand-crafted seed ontology of reasonable quality which already contains the relevant types of relationships between ontology concepts in the given domain

An initial set of documents which exemplarily represent (informally) substantial parts of the knowledge represented in the seed ontology

Methods for learning ontologies (5/8)

Information Extraction Rules

Compared to other ontology learning approaches, this technique is not restricted to learning taxonomy relationships; it can learn arbitrary relationships in an application domain.

A project that uses this technique is the FRODO project.

Methods for learning ontologies (6/8)

Association Rules

Association-rule-learning algorithms are used for prototypical applications of data mining and for finding associations that occur between items in order to construct ontologies (extraction stage)

‘Classes’ are expressed by the expert as a free text conclusion to a rule. Relations between these ‘classes’ may be discovered from existing knowledge bases and a model of the classes is constructed (ontology) based on user-selected patterns in the class relations

This approach is useful for solving classification problems by creating classification taxonomies (ontologies) from rules
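As a rough illustration of how associations between items are found, the sketch below computes the two standard measures, support and confidence, for every ordered pair of items. It assumes each "transaction" is the set of concepts found together in one document or sentence; that data model and the function name are assumptions made for the example, not something the slides specify.

```python
from itertools import permutations


def association_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    rules = []
    for a, b in permutations(items, 2):
        count_both = sum(1 for t in transactions if a in t and b in t)
        count_a = sum(1 for t in transactions if a in t)
        support = count_both / n           # how often a and b occur together
        confidence = count_both / count_a  # count_a >= 1: a occurs somewhere
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


transactions = [
    {"hotel", "accommodation"},
    {"hotel", "accommodation", "city"},
    {"hotel", "accommodation"},
    {"city", "museum"},
]
rules = association_rules(transactions, min_support=0.5, min_confidence=0.8)
```

Here only the pair "hotel" and "accommodation" passes both thresholds, in both directions, which would make it a candidate relation to propose to the ontology engineer.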

Methods for learning ontologies (7/8)

Association Rules – Example

A classification knowledge-based system with experimental results based on medical data (Suryanto & Compton – Australia)

Ripple Down Rules (RDR) were used to describe classes and their attributes:

Satisfactory lipid profile previous raised LDL noted:
((LDL <= 3.4) AND (Triglyceride is NORMAL) AND (Max(LDL) > 3.4)) OR ((LDL is NORMAL) AND (Triglyceride is NORMAL) AND (Max(LDL) is HIGH))

Experts were allowed to modify or add conclusions in order to correct errors

The conclusions of the rules formed the classes of the classification ontology
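To make the shape of such a rule concrete, the lipid-profile rule quoted above can be written as a predicate over one patient case. This sketch assumes the rule reads as two AND-groups joined by OR. The thresholds behind NORMAL and HIGH are not given in the slides, so the qualitative values are taken as precomputed fields; every key name in the `case` dictionary is hypothetical.

```python
def satisfactory_lipid_profile_previous_raised_ldl(case):
    """True when the rule's condition holds for one patient case (a dict)."""
    # First group: numeric tests on the current and the maximum LDL value.
    numeric_branch = (
        case["ldl"] <= 3.4
        and case["triglyceride_level"] == "NORMAL"
        and case["max_ldl"] > 3.4
    )
    # Second group: the same idea expressed with qualitative levels.
    qualitative_branch = (
        case["ldl_level"] == "NORMAL"
        and case["triglyceride_level"] == "NORMAL"
        and case["max_ldl_level"] == "HIGH"
    )
    return numeric_branch or qualitative_branch


case = {
    "ldl": 3.1,
    "ldl_level": "NORMAL",
    "triglyceride_level": "NORMAL",
    "max_ldl": 3.9,
    "max_ldl_level": "HIGH",
}
matches = satisfactory_lipid_profile_previous_raised_ldl(case)
```

The name of the function plays the role of the rule's conclusion, which is what becomes a class in the classification ontology.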

Methods for learning ontologies (8/8)

Association Rules – Example

Ontology learning methodology used:

Firstly, class relations between rules were discovered. There were three basic relations: subsumption/intersection, mutual exclusivity and similarity

Secondly, more compound relations that appeared interesting were specified using the three basic relations

Finally, instances of these compound relations or patterns were extracted and the class model was assembled

Problems that occurred:

Very similar conclusions were sometimes identified as mutually exclusive in cases where there were different values for the same attribute

The method did not consider any other information about the classes themselves
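One simplified way to picture the three basic relations, and the first problem listed above, is to describe each class by the attribute values its rule requires. The sketch below is an assumption-laden toy and not Suryanto and Compton's procedure: the `basic_relation` function, the dictionary representation and the extra "unrelated" outcome are all invented for illustration.

```python
def basic_relation(conditions_a, conditions_b):
    """Classify the relation between two classes.

    Each argument maps an attribute name to the value its rule requires.
    """
    shared = set(conditions_a) & set(conditions_b)
    # Same attribute, different required value: no case satisfies both.
    if any(conditions_a[k] != conditions_b[k] for k in shared):
        return "mutually exclusive"
    items_a, items_b = set(conditions_a.items()), set(conditions_b.items())
    # One rule's conditions contain the other's: the more general subsumes.
    if items_a <= items_b or items_b <= items_a:
        return "subsumption"
    if shared:
        return "similar"
    return "unrelated"


normal = {"LDL": "NORMAL", "Triglyceride": "NORMAL"}
normal_low_hdl = {"LDL": "NORMAL", "Triglyceride": "NORMAL", "HDL": "LOW"}
raised_ldl = {"LDL": "HIGH", "Triglyceride": "NORMAL"}
```

With these toy classes, `basic_relation(normal, raised_ldl)` returns "mutually exclusive" even though the two classes agree on everything except one attribute value, which is the kind of outcome the first problem describes.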


Ontology learning tools – ASIUM (1/8)

Acronym for "Acquisition of Semantic knowledge Using Machine learning method"

The main aim of Asium is to help the expert in the acquisition of semantic knowledge from texts and to generalize the knowledge of the corpus

Asium provides the expert with an interface which will first help him or her to explore the texts and then to learn knowledge which is not in the texts

During the learning step, Asium helps the expert to acquire semantic knowledge from the texts, like subcategorization frames and an ontology. The ontology represents an acyclic graph of the concepts of the studied domain. The subcategorization frames represent the use of the verbs in these texts

Ontology learning tools – ASIUM (2/8)

Methodology: The input for Asium is syntactically parsed text from a specific domain. It then extracts these triplets: verb, preposition/function (if there is no preposition), lemmatized head noun of the complement. Next, using factorization, Asium groups together all the head nouns occurring with the same (verb, preposition/function) couple. These lists of nouns are called basic clusters. They are linked with the (verb, preposition/function) couples they come from.
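The factorization step described above amounts to grouping head nouns by their (verb, preposition/function) couple. A minimal sketch, assuming the triplets are already available as Python tuples; the function name and the sample triplets are illustrative, and occurrence counts are simply dropped here.

```python
from collections import defaultdict


def basic_clusters(triplets):
    """Group head nouns by their (verb, preposition_or_function) couple."""
    clusters = defaultdict(set)
    for verb, prep_or_function, head_noun in triplets:
        clusters[(verb, prep_or_function)].add(head_noun)
    return dict(clusters)


triplets = [
    ("travel", "by", "car"),
    ("travel", "by", "train"),
    ("drive", "object", "car"),
    ("drive", "object", "truck"),
    ("travel", "by", "car"),  # a repeated occurrence adds nothing new
]
clusters = basic_clusters(triplets)
```

Each key of the result is a couple, and each value is the basic cluster linked to it.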

Ontology learning tools – ASIUM (3/8)

Methodology: Asium then computes the similarity between all the basic clusters. The nearest ones are aggregated, and this aggregation is suggested to the expert for creating a new concept. The expert defines a minimum threshold for gathering clusters into concepts. Learned concepts can contain noise (e.g. mistakes in the parsing), may contain sub-concepts the expert wants to identify, or may be over-generalized because of the aggregations, so the expert’s contribution is necessary.

Ontology learning tools – ASIUM (4/8)

Methodology: After this, Asium will have learned the first level of the ontology. Asium computes similarity again, but among all the clusters, the old and the new ones, in order to learn the next level of the ontology. The cooperative process runs until there are no more possible aggregations. The output of the learning process is an ontology and subcategorization frames. The ontology represents an acyclic graph of the concepts of the studied domain. The subcategorization frames represent the use of the verbs in these texts.
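The level-by-level loop can be sketched as follows. Asium's own similarity measure is not described in these slides, so a plain set-overlap (Jaccard) similarity stands in for it, and the expert is modelled as a callback that accepts or rejects each proposed aggregation. All names are hypothetical.

```python
def overlap_similarity(a, b):
    """Stand-in similarity: shared nouns divided by all nouns (Jaccard)."""
    return len(a & b) / len(a | b) if (a | b) else 0.0


def learn_level(clusters, threshold, expert_accepts):
    """Propose one level of concepts from a list of noun sets.

    Every pair whose similarity reaches the threshold is shown to
    expert_accepts(a, b); accepted pairs become new concepts (set unions).
    """
    concepts = []
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            a, b = clusters[i], clusters[j]
            if overlap_similarity(a, b) >= threshold and expert_accepts(a, b):
                merged = a | b
                # Keep only concepts that are genuinely new.
                if merged not in concepts and merged not in clusters:
                    concepts.append(merged)
    return concepts


def learn_ontology(basic, threshold, expert_accepts):
    """Repeat level learning over old and new clusters until nothing is added."""
    levels = []
    clusters = list(basic)
    while True:
        new = learn_level(clusters, threshold, expert_accepts)
        if not new:
            return levels
        levels.append(new)
        clusters = clusters + new


basic = [{"car", "train"}, {"car", "truck"}, {"apple", "pear"}]
levels = learn_ontology(basic, threshold=0.3, expert_accepts=lambda a, b: True)
```

With an expert who accepts everything, the first level proposes a single concept that groups "car", "train" and "truck", and the second pass finds no further aggregation, so the process stops.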

Ontology learning tools – ASIUM (5/8)

Methodology: The advantages of this method are twofold:

First, the similarity measure identifies all concepts of the domain and the expert can validate or split them. Next, the learning process is based in part on these new concepts and suggests more relevant and more general concepts.

Second, the similarity measure will offer the expert aggregations between already validated concepts and new basic clusters in order to get more knowledge from the corpus.

Ontology learning tools – ASIUM (6/8)

The interface: This window allows the expert to validate the concepts learned by Asium.

Ontology learning tools – ASIUM (7/8)

The interface: This window displays the list of all the examples covered for the learned concept. This display allows the expert to visualize all the sentences which will be allowed if this class is validated.

Ontology learning tools – ASIUM (8/8)

The interface: This window displays the ontology as it currently is in memory, i.e. learned concepts and concepts to be proposed for a level (each blue circle represents a class).

Ontology learning tools – TEXT-TO-ONTO (1/8)

It provides semi-automatic ontology learning from text

It tries to overcome the knowledge acquisition bottleneck

It is based on a general architecture for discovering conceptual structures and engineering ontologies from text

Ontology learning tools – TEXT-TO-ONTO (2/8)

Ontology learning tools – TEXT-TO-ONTO (3/8) Architecture

Ontology learning tools – TEXT-TO-ONTO (4/8)

Architecture – Main components

Text & Processing Management Component

The ontology engineer uses this component to select the domain texts exploited in the further discovery process. The engineer can choose among a set of text (pre-)processing methods available on the Text Processing Server and among a set of algorithms available in the Learning & Discovering component. The Text Processing Server returns text annotated with XML, and this XML-tagged text is fed to the Learning & Discovering component.

Ontology learning tools – TEXT-TO-ONTO (5/8)

Architecture – Main components

Text Processing Server

It contains a shallow text processor based on the core system SMES. SMES is a system that performs syntactic analysis on natural language documents

It is organized in modules, such as a tokenizer, morphological and lexical processing, and chunk parsing, that use lexical resources to produce mixed syntactic/semantic information

The results are stored in annotations using XML-tagged text

Ontology learning tools – TEXT-TO-ONTO (6/8)

Architecture – Main components

Lexical DB & Domain Lexicon

SMES accesses a lexical database with more than 120,000 stem entries and more than 12,000 subcategorization frames that are used for lexical analysis and chunk parsing

The domain-specific part of the lexicon associates word stems with concepts available in the concept taxonomy and links syntactic information with semantic knowledge that may be further refined in the ontology

Ontology learning tools – TEXT-TO-ONTO (7/8)

Architecture – Main components

Learning & Discovering component

Uses various discovering methods on the annotated texts, e.g. term extraction methods for concept acquisition.

Ontology learning tools – TEXT-TO-ONTO (8/8)

Architecture – Main components

Ontology Engineering Environment – ONTOEDIT

Supports the ontology engineer in semi-automatically adding newly discovered conceptual structures to the ontology

Internally stores modeled ontologies using an XML serialization


Uses of ontology learning – Knowledge sharing (1/2)

Identifying candidate relations between expressive, diverse ontologies using concept cluster integration in multi-agent systems

Agents with diverse ontologies should be able to share knowledge by automated learning methods and agent communication strategies

Agents that do not know the relationships of their concepts to each other need to be able to teach each other these relationships (ontology learning)

Uses of ontology learning – Knowledge sharing (2/2)

Concept representation and learning on each agent:

Process: an agent sends a query to another agent and receives a response with new concepts. A new category is created from these concepts. The agent re-learns the ontology rules and if the new concept relation rules are verified, they are stored in the agent.

Uses of ontology learning – Interest matching (1/2)

Designing a general algorithm for interest matching is a major challenge in building online community and agent-based communication networks.

These algorithms can be applied in user categorization for an online community. Users’ behavior can be analyzed and matched against other users to provide collaborative categorization and recommendation services to tailor and enhance the online experience.

The process of finding similar users based on data from logged behavior is called interest matching.

Uses of ontology learning – Interest matching (2/2)

User interests can be described by ontologies as weighted tree hierarchies of concepts

Each node has a weight attribute to represent the importance of the concept

These weights can be used to calculate similarities between users

Learning process: a standard ontology is used and the websites the user visits can be classified and entered into the standard ontology to personalize it – if a user frequents websites of a category (instance of a class) it is likely he is interested in other instances of the class
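A small sketch of how node weights might be used to compare users. It assumes each visited website has already been classified to a path in the standard ontology, that a visit adds weight to the concept and to all of its ancestors, and that cosine similarity is the comparison measure. None of these choices is stated in the slides; they are illustrative assumptions, as are the function names.

```python
import math


def record_visit(weights, concept_path, amount=1.0):
    """Add weight to a visited concept and to each of its ancestors."""
    for depth in range(1, len(concept_path) + 1):
        node = "/".join(concept_path[:depth])
        weights[node] = weights.get(node, 0.0) + amount


def interest_similarity(weights_a, weights_b):
    """Cosine similarity between two users' concept-weight dictionaries."""
    shared = set(weights_a) & set(weights_b)
    dot = sum(weights_a[c] * weights_b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


alice, bob, carol = {}, {}, {}
record_visit(alice, ["Sports", "Football"])
record_visit(alice, ["Sports", "Football"])
record_visit(alice, ["Sports", "Tennis"])
record_visit(bob, ["Sports", "Tennis"])
record_visit(carol, ["Arts", "Cinema"])
```

Because weight also flows to the parent concept, Alice and Bob come out as similar through "Sports" even though their favourite instances differ, while Alice and Carol share no concepts and score zero.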

Uses of ontology learning – Web Directory Classification

Ontologies and ontology learning can be used to create information extraction tools for collecting general information from the free text of web pages and classifying them in categories

The goal is to collect indicator terms from the web pages that may assist the classification process. These terms can be derived from the directory headings of a web page as well as its content.

The indicator terms along with a collection of interpretation rules can result in a hierarchy (ontology) of web pages.
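The idea of indicator terms plus interpretation rules can be pictured with a toy classifier. In this sketch a "rule" is just a mapping from a category path to its indicator terms, and heading matches count double; both are assumptions made for the example, as are the function name and the sample categories.

```python
import re


def classify_page(headings, content, rules):
    """Pick the category whose indicator terms occur most often.

    rules maps a category path to a set of lowercase indicator terms.
    Returns None when no indicator term is found at all.
    """
    heading_words = re.findall(r"[a-z]+", " ".join(headings).lower())
    content_words = re.findall(r"[a-z]+", content.lower())
    scores = {}
    for category, terms in rules.items():
        # Assumption: a term in a heading is stronger evidence than in the body.
        score = 2 * sum(w in terms for w in heading_words)
        score += sum(w in terms for w in content_words)
        if score:
            scores[category] = score
    if not scores:
        return None
    # Ties are broken alphabetically by category path.
    return max(sorted(scores), key=scores.get)


rules = {
    "Travel/Hotels": {"hotel", "room", "booking"},
    "Sports/Football": {"football", "league", "goal"},
}
category = classify_page(
    ["Hotel booking"],
    "Find a room in our hotel near the football stadium.",
    rules,
)
```

Because the category paths already encode parent and child, classifying many pages this way yields a hierarchy of web pages.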

Uses of ontology learning – E-mail classification (1/2)

KMi Planet

A web-based news server for communication of stories between members of the Knowledge Media Institute

Main goal: To classify an incoming story, obtain the relevant objects within the story, deduce the relationships between them and to populate the ontology

Integrate a template-driven information extraction engine with an ontology engine to supply the necessary semantic content

Uses of ontology learning – E-mail classification (2/2)

KMi Planet

There are three tools:

PlanetOnto

MyPlanet

an IE tool

PlanetOnto supports several activities. One of them is ontology editing, which is where ontology learning comes in.

A tool called WebOnto provides Web-based visualisation, browsing and editing support for the ontology. The “Operational Conceptual Modelling Language”, OCML, is a language designed for knowledge modeling. WebOnto uses OCML and allows the creation of classes and instances in the ontology, along with easier development and maintenance of the knowledge models

Bibliography

M. Sintek, M. Junker, L. van Est, A. Abecker, Using Information Extraction Rules for Extending Domain Ontologies, German Research Center for Artificial Intelligence (DFKI)

M. Vargas-Vera, J. Domingue, Y. Kalfoglou, E. Motta, S. Buckingham Shum, Template-Driven Information Extraction for Populating Ontologies, Knowledge Media Institute (UK)

G. Bisson, C. Nedellec, Designing clustering methods for ontology building, University of Paris

A. Maedche, S. Staab, The TEXT-TO-ONTO Ontology Learning Environment, University of Karlsruhe

A. Maedche, S. Staab, Ontology Learning for the Semantic Web, University of Karlsruhe

H. Suryanto, P. Compton, Learning classification taxonomies from a classification knowledge based system, University of New South Wales (Australia)

Proceedings of the First Workshop on Ontology Learning OL'2000, Berlin, Germany, August 25, 2000

Proceedings of the Second Workshop on Ontology Learning OL'2001, Seattle, USA, August 4, 2001

ASIUM web page http://www.lri.fr/~faure/Demonstration.UK/Presentation_Demo.html