
ASSIGNMENT FOR THE COURSE HY566: KNOWLEDGE MANAGEMENT IN INTRANETS AND THE INTERNET

«PRODUCTS AND SERVICES OF THE COMPANY ONTOPIA»

Kyriakos E. Kritikos, PhD Candidate

Miltos Stratakis, Postgraduate Student

Thursday, 3 April 2003

TABLE OF CONTENTS

INTRODUCTION

TOPIC MAPS
TOPIC MAPS CONCEPT
INTRODUCTION
HISTORY
TOPIC MAP MODEL INSPIRATION
INDEXES
GLOSSARIES AND THESAURI
SEMANTIC NETWORKS
BASIC CONCEPTS
TOPICS
TOPIC TYPES
TOPIC NAMES
OCCURRENCES
OCCURRENCE ROLES
ASSOCIATIONS
ASSOCIATION
ASSOCIATION TYPES
ASSOCIATION ROLES
MERGING
ADDITIONAL CONCEPTS
SUBJECT IDENTITY AND PUBLISHED SUBJECTS
FACETS
SCOPE
POWER OF THE TOPIC MAP MODEL
TOPIC MAP MODEL AMBIGUITIES
COMPARISON WITH RDF
DIFFERENT ROOTS AND PERSPECTIVES
DIALECTICALLY OPPOSED POINTS OF VIEW
THINGS
RELATIONSHIPS
ATTRIBUTES
KINDS OF THINGS
CONTEXT
REIFICATION
ASSOCIATION OCCURRENCE
PORTABILITY
INTEROPERABILITY
ABBREVIATED SYNTAX
NO USEFUL GENERAL MAPPINGS
USEFUL SCHEMA LEVEL MAPPINGS

PRODUCTS
Ontopia Topic Map Engine
Features
Areas of application
Architecture
Ontopia Topic Map Engine complement tools
The Ontopia RDBMS Backend Connector
The Ontopia Full-text Search Integration
The Ontopia Topic Map Query Engine
The Ontopia Schema Tools
Ontopia Navigator Framework
Features
Ontopia Navigator Framework complement tools
The Ontopia Topic Map Web Editor Framework
The Omnigator
How does it work?
Browsing the Omnigator
First Page: The Welcome Page
Selection of a specific topic map: The Topic Map Page
The Topic Page of a topic type
The Topic Page of an individual topic
The Topic Page of an occurrence type
The Topic Page of an association type
Advanced Omnigator Topics
The Manage Page
The Plug-ins Page
The Full-text Indexing Plug-in
The Customise Page
The Plug-ins
The Statistics plug-in
The Filter plug-in
The Merge plug-in
The Export plug-in
The Reload plug-in
The Query plug-in
The Validate Plug-in
The Full-text Search Plug-in

REFERENCES

TABLE OF FIGURES

Figure 1 - Topic Types
Figure 2 - Topic Names
Figure 3 - Occurrences
Figure 4 - Occurrence Roles
Figure 5 - Topic Associations
Figure 6 - Association Types
Figure 7 - Topic maps as portable semantic networks
Figure 8 - Applying facets for filtering
Figure 9 - Scoping topic names, occurrences and associations
Figure 10 - OKS Overview
Figure 11 - Ontopia Topic Map Engine architecture
Figure 12 - Omnigator’s Welcome Page
Figure 13 - The Topic Map Page of opera.xtm (Ontology view)
Figure 14 - The Topic Page of ‘composer’ topic
Figure 15 - The Topic Page of ‘Puccini’ topic
Figure 16 - The Topic Page of the ‘AZ opera synopsis’ occurrence type
Figure 17 - The Topic Page of the ‘death of character’ association type
Figure 18 - The Manage Page
Figure 19 - The Full-text Indexing Plug-in
Figure 20 - The Customise Page
Figure 21 - The Statistics plug-in
Figure 22 - The Set Context Page
Figure 23 - The results of the above tolog query
Figure 24 - Full-text search results of text ‘tosca’

INTRODUCTION

ONTOPIA [6] is a company based in Oslo, Norway. It offers a comprehensive suite of topic map solutions, comprising software products as well as consultancy and training services.

Its software consists of a modular topic map management system and a series of SDKs to enable its customers and partners to rapidly create a broad range of solutions across all kinds of organisations.

The Ontopia Topic Map Engine provides a standards-based high-level schema for modelling the enterprise business systems and their interactions, and the pluggable backend allows application programmers to work with a single API regardless of the nature of the business data. By providing a thin translation layer that maps data entities to the topic map model of the business, developers can enable:

- Standards-based information interchange between business systems.
- A consistent interface to all business systems which enables the development of intelligent agent applications.
- Associative connections between data entities to allow a truly integrated enterprise information portal.
- Development of knowledge management systems which allow users to share their interpretations of business data.

Its consultancy and training are delivered by a team of highly knowledgeable and experienced staff. The training products range from briefings and one-day seminars to detailed, customised workshops. Its consultancy can address business process and knowledge management issues as well as the practical tasks of project management and implementation.

The main partners of ONTOPIA are the following:

ISOGEN International (USA and Europe)
http://www.isogen.com/
ISOGEN International is a leading service provider of application development, systems integration, consulting, and training. ISOGEN solves diverse business problems and delivers exceptional economic value with various XML, Topic Maps, and other standards-based methods and technologies for content management, knowledge management, document management, and other hypermedia processing. ISOGEN's parent company, Innodata Corp, is a world leader in content outsourcing services employing large-scale content refining techniques to provide value within the same XML systems.
Contact person: Dan Dube (North America)
Contact person: Marit Mobedjina (Europe)

Synergy Incubate (Japan)
http://web.synergy.co.jp/
Synergy Incubate is one of the leading providers of information management tools and services in Japan. Synergy Incubate, with its XML expertise, provides platforms for sharing information and knowledge; network security features, such as document authentication and notarization; and multilingual support for software. It has provided products and services based on Topic Maps since 2000.
Contact person: Motomu Naito

Innodigital (Korea)
http://www.innodigital.co.kr/
INNODIGITAL CO., LTD. has specialized in developing and providing solutions in the KMS and EDMS fields since its establishment in 1995. Innodigital has accumulated the largest record of business accomplishments in the Korean KMS/EDMS industry by successfully providing integrated information management systems to organizations and companies, including leading companies in Korea.
Contact person: Jong. Kim

Eurostep (Sweden, UK, Finland, Germany, USA)
http://www.eurostep.com/
Eurostep is a consulting and software company specialising in information management. The vision of Eurostep is to be the leading competence provider in Strategic Planning, Design and Implementation Of Open Solutions, to support Open People in Open Organisations.
Contact person: Helena Lindström

Diderot Track (Netherlands)
http://www.diderottrack.nl/
Diderot Track BV specializes in advising and supporting organizations that wish to establish or enhance their computer information processes and systems. The emphasis cannot be on the technical issues alone. An open eye and ear for organisational adaptation is equally important. Diderot Track is a member of OASIS.
Contact person: Aad Kamsteeg

Docufy (Germany)
http://www.docufy.de/
Docufy is a vendor-independent professional services company specialized in providing its customers with tailored solutions to address their needs in the areas of collaborative content management and single source publishing, the optimization of editorial processes and the creation of topic maps. Our custom developed systems combine document management, collaboration, workflow and business process automation in a single integrated solution using standards like XML, XSLT, and Java.
Contact person: Uwe Reissenweber

Techquila (UK)
http://www.techquila.com/
Techquila provides high-quality consultancy, training and systems development services which make use of standards-based technology such as XML, XSLT and topic maps.
Contact person: Kal Ahmed

XMLmethods.com (USA)
http://www.xmlmethods.com/
XMLmethods provides a wide range of XML-related consultancy, including: determination of department or corporate XML strategy, DTD/schema/topic map analysis and design, conversion strategies and script development, product evaluation, authoring environment development, formatting specification development.
Contact person: Pamela L. Gennusa

Ligent (USA)
http://www.ligent.net/
Ligent is a management consulting and systems integration firm dedicated to helping people work smarter. Our goal is to create measurable improvements in our client's business through our focus on people, process and information technology. Ligent's expertise includes the business intelligence disciplines of content management, knowledge management and competitive intelligence. From offices located in the Chicago metropolitan area, we serve clients throughout the United States and Canada.
Contact person: Kevin Trainor

In this report we present the main Semantic Web technology used by ONTOPIA: Topic Maps. We begin by presenting the Topic Map model and then compare it with the RDF model. We then describe the main Semantic Web products of Ontopia in detail, covering their capabilities, their architecture and their application areas.


TOPIC MAPS

TOPIC MAPS CONCEPT

INTRODUCTION

Someone once said that “a book without an index is like a country without a map”. (This introduction is taken from paper [3].)

However interesting and worthwhile the experience of driving from A to B without a map might be in its own right, there can be no doubt that when the goal is to arrive at one's destination as quickly as possible (or at least without undue delay), some kind of a map is indispensable.

Similarly, if you are looking for a particular piece of information in a book (as opposed to enjoying the experience of reading it from cover to cover), a good index is an immense asset. The traditional back-of-book index can be likened to a carefully researched and hand-crafted map, and the task of the indexer, as Larry Bonura puts it [Bonura 1994], “to chart[ing] the topics of the document and [presenting] a concise and accurate map for readers.” In Troilus and Cressida Shakespeare used a different metaphor:

And in such indexes (although small pricks
To their subsequent volumes) there is seen
The baby figure of the giant mass
Of things to come at large

but also here there is the same sense of the index replicating, in miniature, the structures of its subject, in order to provide a more manageable view of the whole. Perhaps it isn't surprising that Shakespeare chose not to use the map metaphor. After all, the art of cartography was still in its infancy in his time ... and so too were communications. Today the situation is quite different; the sheer speed of modern communications makes accurate and advanced mapping techniques of major importance. One answer to this problem in the realm of transportation is the GPS (Global Positioning System). The answer in the realm of publishing and information management is the new international standard, Topic Maps [ISO 13250].

Up until now there has been no equivalent of the traditional back-of-book index in the world of electronic information. True enough, people have marked up keywords in their word processing documents and used these to generate indexes “automatically”, but the resulting indexes have remained firmly within the paradigm of single documents destined to be published on paper. The world of electronic information is quite different, as the World Wide Web has taught us. Here the distinction between individual documents vanishes and the requirement is for indexes to span multiple documents, and in some cases, to cover vast pools of information, which in turn calls for the ability to merge indexes and to create user-defined views of information. In this situation, old-fashioned indexing techniques are pitifully inadequate.

The problem has been recognized for several decades in the realm of document processing, but the methodology used to address it — full text indexing — has only solved part of the problem, as anyone who has used search engines on the internet knows only too well.

The main problem with full text indexes is their lack of discrimination. They index everything: Imagine creating a traditional back-of-book index by taking every single word in the book, removing a couple of hundred of the most obviously useless ones, and then including every single usage of those that remain. Even with some intelligence to allow for inflected forms the result would be of no practical use whatsoever. Mechanical indexing cannot cope with the fact that the same subject may be referred to by multiple names (the “synonym problem”), nor that the same name may refer to multiple subjects (the “homonym problem”). And yet this is basically how a web search engine works (no wonder you always get thousands of irrelevant hits and still manage to miss the thing you are looking for!).
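The synonym and homonym problems can be made concrete with a minimal sketch of a mechanical full-text index. The three-document corpus, the document ids and the function name below are invented for illustration; real search engines are far more elaborate, but the underlying limitation is the same.

```python
from collections import defaultdict

# Hypothetical three-document corpus, invented for illustration.
DOCS = {
    "doc1": "Tosca is an opera by Puccini",
    "doc2": "Tosca is the heroine who kills Scarpia",
    "doc3": "The composer of Madame Butterfly was born in Lucca",
}

def build_full_text_index(docs):
    """Map every lower-cased word to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

index = build_full_text_index(DOCS)

# Homonym problem: one name, two subjects (the opera and the character),
# yet the index returns both documents without telling them apart.
tosca_hits = sorted(index["tosca"])

# Synonym problem: doc3 is about Puccini but never uses that name,
# so a search for "puccini" misses it.
puccini_hits = sorted(index["puccini"])
```

Here `tosca_hits` mixes two different subjects, while `puccini_hits` omits a relevant document; both failures stem from indexing words rather than subjects.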

That is why new methodologies are called for. Topic maps provide an approach that marries the best of several worlds, including those of traditional indexing, library science and knowledge representation, with advanced techniques of linking and addressing. It is expected that they will become as indispensable for tomorrow's information providers as maps for the traveller. And once topic maps have become ubiquitous, they will indeed constitute the GPS of the information universe.

HISTORY

Topic maps have a long and complicated history, beginning with the Davenport group, which in 1991 started a process to create a standard SGML DTD for software documentation. This group quite quickly spun off an offshoot called Conventions for the Application of HyTime (CApH), one of whose tasks was to design an application for computerized back-of-book indexes. These indexes were intended to have one novel feature: it should be possible to merge them automatically. The ideas behind this application were what eventually became topic maps.

CApH worked on the concept for a long time before topic maps were accepted by ISO's SGML working group as a new work item in 1996. ISO then spent another four years working on the standard before it was approved as ISO/IEC 13250:2000 in January 2000 ([ISO13250]). Topic maps then had the form of an SGML architecture based on HyTime. Work was later done by an informal organization known as TopicMaps.Org, which produced the XML Topic Maps (XTM) syntax for topic maps ([XTM1.0]). This was a reformulation of topic maps in XML syntax based on XLink. This syntax has since been accepted by ISO into ISO 13250 as an annex. The XTM syntax is used by nearly all topic map software today, while use of the original SGML syntax is rare.

Topic maps have many applications, but one of the main ones is solving the findability problem of information, that is: how to find the information you are looking for in a large body of information. Topic maps can also be used for knowledge management, web portal development, content management, and enterprise application integration (EAI). Topic maps are also being described as an enabling technology for the semantic web. (This history section is taken from [5].)


TOPIC MAP MODEL INSPIRATION

A Topic Map is functionally equivalent to multi-document indexes, glossaries, and thesauri. Its model is therefore based on the key concepts of these navigational aids. Additionally, it bridges the gap between knowledge representation and information management by combining two approaches: mapping the knowledge structures that exist in information resources, and representing knowledge in order to enable communication between people and machines. (From paper [3].)

INDEXES

A traditional index is in fact a map of the knowledge contained in a book; it lists the topics covered, by whatever name users might be expected to want to look them up, and includes salient (and only salient) references to those topics. The main constituents of this and any index are:

1. an (alphabetical) list of names of topics, and
2. references to occurrences of those topics

Some possible additional features of an index can be:

- typographical conventions used to distinguish between different types of topic;
- similarly, typographical conventions used to distinguish between different types of occurrence (e.g. references to synopses shown in bold);
- the use of see references handles synonyms by allowing multiple points of entry (by different names) to the same topic;
- see also references point to associated topics;
- subentries provide an alternative mechanism for pointing out associations between different topics (e.g. between a composer and his works, or between supertypes and subtypes).

A book may contain multiple indexes, for example an index of names, an index of places, and an index of subjects. This mechanism provides an alternative to the use of typographic conventions for distinguishing between topics of different types in one and the same index.

Homonyms can be distinguished through the use of explanatory labels following the names, e.g. “Tosca (opera)” and “Tosca (character)”.

The locators (page numbers) may contain modifiers that help distinguish between different types of occurrence, for example “54n” for a footnote on page 54. Again, this is an alternative to the use of different typefaces.

The nature of an occurrence (i.e., the way in which the information is pertinent to its subject) might also be shown using a subentry mechanism ([Goldfarb 1990], for example, makes heavy use of subentries for typing occurrences as “clause”, “defined in”, “defined in glossary”, “used in production”, etc.).


The key features of a typical index are thus: topics (identified by their names, of which there may be more than one); associations between topics; and occurrences of topics (pointed to via locators). For each of these constructs it is useful to be able to say something about the type, in order to convey more information to the user.

Topics, associations and occurrences are also the key constructs of the topic map model. It can therefore be deduced that topic maps are functionally equivalent to multiple indexes.
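The index features described in this section can be sketched as a small data structure. The entries, page numbers and function names below are invented for illustration; the sketch only assumes the conventions mentioned above (see references for synonyms, see also references for associations, labels for homonyms, and the “n” modifier for footnotes).

```python
import re

# Hypothetical back-of-book index fragment, invented for illustration.
INDEX = {
    "Puccini, Giacomo": {"locators": ["23", "54n"], "see_also": ["Tosca (opera)"]},
    "Tosca (opera)": {"locators": ["31"], "see_also": []},
    "Tosca (character)": {"locators": ["33"], "see_also": []},
}

# "see" references: an alternative name pointing to the main entry.
SEE = {"Puccini": "Puccini, Giacomo"}

def look_up(name):
    """Resolve a name through the see references and return (entry name, entry)."""
    entry_name = SEE.get(name, name)
    return entry_name, INDEX[entry_name]

def parse_locator(locator):
    """Split a locator such as '54n' into (page number, kind of occurrence)."""
    match = re.fullmatch(r"(\d+)(n?)", locator)
    if match is None:
        raise ValueError(f"unrecognised locator: {locator!r}")
    page, modifier = match.groups()
    return int(page), "footnote" if modifier == "n" else "text"

entry_name, entry = look_up("Puccini")
occurrences = [parse_locator(loc) for loc in entry["locators"]]
```

The two “Tosca” entries show how explanatory labels keep homonyms apart, while the `SEE` table gives the same topic more than one point of entry.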

GLOSSARIES AND THESAURI

A glossary is basically a list of terms and definitions. It can be thought of as a kind of index in which only one type of occurrence is of interest (the one that provides the “definition”), and which therefore includes the occurrence inline (instead of pointing to it via a locator). Like an index, a glossary may also contain see and see also references to associated topics. It can also contain additional information relevant to the term itself, such as its language or pronunciation, but the key elements are the topic names and their definitions.

A thesaurus, on the other hand, emphasizes other aspects of an index. It is basically a network of interrelated terms within a particular domain, and although it will often contain other information (such as definitions, examples of usage, etc.), the key feature of a thesaurus is the relationships, or associations, between terms. Given a particular term, a thesaurus will indicate which other terms mean the same, which terms denote a broader category of the same kind of thing, which denote a narrower category, and which are related in some other way. The special thing about associations in a thesaurus (as compared to associations found in a typical index or glossary) is that they are typed. This is important because it makes it possible not only to say that two terms are related, but also how or why they are related. It also makes it possible to group together terms that are associated in the same way, thus making navigation much easier. Commonly used association types like “broader term”, “narrower term”, “used for” and “related term” are defined in standards for thesauri such as [Z39.19], [ISO 5964] and [ISO 2788].
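A minimal sketch of what typed associations buy us: because each relationship carries a type, related terms can be grouped by how they are related. The relation data and the function name are hypothetical; only the association type names come from the text above.

```python
from collections import defaultdict

# Hypothetical thesaurus relations: (term, association type, related term).
RELATIONS = [
    ("opera", "broader term", "stage work"),
    ("opera", "narrower term", "opera buffa"),
    ("opera", "narrower term", "opera seria"),
    ("opera", "related term", "libretto"),
]

def related_by_type(term, relations):
    """Group the terms related to `term` by the type of the association."""
    groups = defaultdict(list)
    for source, assoc_type, target in relations:
        if source == term:
            groups[assoc_type].append(target)
    return dict(groups)

groups = related_by_type("opera", RELATIONS)
```

With untyped associations all four related terms would end up in one undifferentiated list; the type is what makes navigation by kind of relationship possible.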

SEMANTIC NETWORKS

Indexes, glossaries and thesauri are all ways of mapping the knowledge structures that exist implicitly in books and other sources of information. In the field of AI (Artificial Intelligence) there also exists the need to be able to represent knowledge (and meaning), in order to support communication between people and machines. One widely used knowledge representation formalism is that of conceptual graphs, whose building blocks are concepts and conceptual relations.

Similar graph structures have been implemented in various forms under names such as “semantic nets”, “associative nets”, “partitioned nets” and “knowledge” (or “conceptual”) “maps” in many AI systems. The earliest forms, called existential graphs, were invented by the philosopher Charles Sanders Peirce at the end of the 19th century as a graphical notation for symbolic logic. One of the most completely worked out schemes, the conceptual graphs developed by John Sowa and his collaborators ([Sowa 2000]), is claimed to be completely isomorphic with first order logic.

Since the basic model of semantic networks is very similar to that of the topics and associations found in indexes, combining the two approaches should provide great benefits in both information management and knowledge management, and this is precisely what the new topic map standard achieves. By adding the topic/occurrence axis to the topic/association model, topic maps provide a means of “bridging the gap”, as it were, between knowledge representation and the field of information management.

“Knowledge management” is of course one of today's buzzwords and a term that often involves not a little marketing hype. For the big consulting companies, knowledge management is essentially about new business management techniques designed to address the fact that people (and the expertise they possess) are the primary assets in an increasingly knowledge-based economy. Others equate knowledge management with information management (especially some vendors of information management tools, who are only too happy to slap a new label on their boxes).

But knowledge is fundamentally different from information: the difference is that between knowing a thing versus simply having information about it. And if, as one writer claims ([Ruggles 1997]), “knowledge management covers three main knowledge activities: generation, codification, and transfer”, then topic maps can be regarded as the standard for codification that is the necessary prerequisite for the development of tools that assist in the generation and transfer of knowledge.

BASIC CONCEPTS

The purpose of a topic map is to convey knowledge about resources through a superimposed layer, or map, of the resources. A topic map captures the subjects of which resources speak, and the relationships between subjects, in a way that is implementation-independent. (From paper [4].)

The key concepts in topic maps are topics, associations, and occurrences.

A topic is a resource within the computer that stands in for (or “reifies”) some real-world subject. Examples of such subjects might be the play Hamlet, the playwright William Shakespeare, or the “authorship” relationship.

Topics can have names. They can also have occurrences, that is, information resources that are considered to be relevant in some way to their subject. Finally, topics can participate in relationships, called associations, in which they play roles as members.

Thus, topics have three kinds of characteristics: names, occurrences, and roles played as members of associations. The assignment of such characteristics is considered to be valid within a certain scope, or context.
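The three kinds of characteristics can be sketched as a small in-memory model. This is an illustrative sketch only: the class and field names are assumptions made here, not the API of any topic map engine, and scope is simplified to a set of strings.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Name:
    value: str
    scope: frozenset = frozenset()  # assumption: empty set means no scope restriction

@dataclass(frozen=True)
class Occurrence:
    locator: str                    # address of an information resource
    scope: frozenset = frozenset()

@dataclass
class Topic:
    topic_id: str
    names: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)

@dataclass
class Association:
    assoc_type: str
    members: dict                   # role name -> topic id
    scope: frozenset = frozenset()

def roles_played(topic, associations):
    """Return (association type, role) pairs in which the topic is a member."""
    return [
        (assoc.assoc_type, role)
        for assoc in associations
        for role, member_id in assoc.members.items()
        if member_id == topic.topic_id
    ]

# The example subjects from the text; the locator is a placeholder URL.
hamlet = Topic("hamlet", names=[Name("Hamlet")],
               occurrences=[Occurrence("http://example.org/hamlet.txt")])
shakespeare = Topic("shakespeare", names=[Name("William Shakespeare")])
authorship = Association("authorship", {"author": "shakespeare", "work": "hamlet"})

shakespeare_roles = roles_played(shakespeare, [authorship])
```

Names and occurrences are stored on the topic, whereas roles are derived from the associations in which the topic takes part; every characteristic carries its own scope.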

Topic maps can be merged. Merging can take place at the discretion of the user or application (at runtime), or may be indicated by the topic map's author at the time of its creation.


TOPICS

What then is a topic? A topic, in its most generic sense, can be any “thing” whatsoever — a person, an entity, a concept, really anything — regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever. You can't get much more general than that!

In fact, this is almost word for word how the topic map standard defines subject, the term used for the real world “thing” that the topic itself stands in for. We might think of a “subject” as corresponding to what Plato called an idea. A topic, on the other hand, is like the shadow that the idea casts on the wall of Plato's cave: It is an object within a topic map that represents a subject. In the words of the standard: “The invisible heart of every topic link is the subject that its author had in mind when it was created. In some sense, a topic reifies a subject...”

Strictly speaking, the term “topic” refers to the object or node in the topic map that represents the subject being referred to. However, there is (or should be) a one-to-one relationship between topics and subjects, with every topic representing a single subject and every subject being represented by just one topic. To a certain degree, therefore, the two terms can be used interchangeably.[3]

So, in the context of a dictionary of opera, topics might represent subjects such as “Tosca”, “Madame Butterfly”, “Rome”, “Italy”, the composer “Giacomo Puccini”, or his birthplace, “Lucca”: that is, anything that might have an entry in the dictionary — but also much else besides.

TOPIC TYPES

Topics can be categorized according to their kind. In a topic map, any given topic is an instance of zero or more topic types. This corresponds to the categorization inherent in the use of multiple indexes in a book (index of names, index of works, index of places, etc.), and to the use of typographic and other conventions to distinguish different types of topics.

Thus, Puccini would be a topic of type “composer”, Tosca and Madame Butterfly topics of type “opera”, Rome and Lucca topics of type “city”, Italy a topic of type “country”, etc. In other words, the relationship between a topic and its type is a typical class-instance relationship.

Exactly what one chooses to regard as topics in any particular application will vary according to the needs of the application, the nature of the information, and the uses to which the topic map will be put: In a thesaurus, topics would represent terms, meanings, and domains; in software documentation they might be functions, variables, objects, and methods; in legal publishing, laws, cases, courts, concepts, and commentators; in technical documentation, components, suppliers, procedures, error conditions, etc.

Topic types are themselves defined as topics by the standard. You must explicitly declare “composer”, “opera”, “city”, etc. as topics in your topic map if you want to use them as types (in which case you will be able to say more about them using the topic map model itself).
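A minimal sketch of the rule that types are themselves topics: every type id used below is also a key of the same table. The table layout and the helper names are assumptions made for illustration; the example topics are those of the opera dictionary above.

```python
# topic id -> set of type ids; the types are declared as topics in the same table.
TOPIC_TYPES = {
    "composer": set(),
    "opera": set(),
    "city": set(),
    "country": set(),
    "puccini": {"composer"},
    "tosca": {"opera"},
    "madame-butterfly": {"opera"},
    "rome": {"city"},
    "lucca": {"city"},
    "italy": {"country"},
}

def instances_of(type_id, topic_types):
    """Return the ids of all topics that are instances of the given type."""
    return sorted(t for t, types in topic_types.items() if type_id in types)

def undeclared_types(topic_types):
    """Return type ids that are used but not declared as topics themselves."""
    used = {type_id for types in topic_types.values() for type_id in types}
    return sorted(used - set(topic_types))

operas = instances_of("opera", TOPIC_TYPES)
missing = undeclared_types(TOPIC_TYPES)
```

A topic with an empty set of types illustrates that a topic may be an instance of zero types, and `undeclared_types` checks the requirement that every type be declared explicitly.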

Topics have three kinds of characteristics: names, occurrences, and roles in associations.


Figure 1 - Topic Types

TOPIC NAMES

Normally topics have explicit names, since that makes them easier to talk about.[4] However, topics don't always have names: A simple cross reference, such as “see page 97”, is considered to be a link to a topic that has no (explicit) name.

Names exist in all shapes and forms: as formal names, symbolic names, nicknames, pet names, everyday names, login names, etc. The topic map standard doesn't pretend to try to enumerate and cover them all. Instead, it recognizes the need for some forms of name (that have particularly important and universally understood semantics) to be defined in a standardized way, in order for applications to be able to do something meaningful with them, and at the same time the need for complete freedom and extensibility to be able to define application-specific name types.

The standard therefore provides the facility to assign multiple base names to a single topic, and to provide variants of each base name for use in specific processing contexts. In the original ISO standard variants were limited to display name and sort name. XTM offers a more general variant name mechanism.

The ability to specify more than one topic name can be used to indicate the applicability of different names in different contexts or scopes (about which more later), such as language, style, domain, geographical area, historical period, etc. A corollary of this feature is the topic naming constraint, which states that no two subjects can have exactly the same base name in the same scope.
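The topic naming constraint can be checked mechanically. The sketch below uses invented topic ids and scopes and simply reports pairs of topics that share a base name in the same scope; how a processor should react to such a conflict is deliberately left out.

```python
from collections import defaultdict

# Hypothetical data: (topic id, base name, scope as a frozenset of themes).
BASE_NAMES = [
    ("tosca-opera", "Tosca", frozenset({"opera"})),
    ("tosca-character", "Tosca", frozenset({"character"})),
    ("puccini", "Puccini", frozenset()),
    ("giacomo-puccini", "Puccini", frozenset()),
]

def naming_conflicts(base_names):
    """Return {(name, scope): topic ids} wherever several topics share both."""
    by_key = defaultdict(set)
    for topic_id, name, scope in base_names:
        by_key[(name, scope)].add(topic_id)
    return {key: sorted(ids) for key, ids in by_key.items() if len(ids) > 1}

conflicts = naming_conflicts(BASE_NAMES)
```

The two “Tosca” topics do not conflict because their names are valid in different scopes, whereas the two “Puccini” topics share a name in the same scope and therefore violate the constraint as stated.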



Figure 2 - Topic Names

OCCURRENCES

A topic may be linked to one or more information resources that are deemed to be relevant to the topic in some way. Such resources are called occurrences of the topic.

An occurrence could be a monograph devoted to a particular topic, for example, or an article about the topic in an encyclopaedia; it could be a picture or video depicting the topic, a simple mention of the topic in the context of something else, a commentary on the topic (if the topic were a law, say), or any of a host of other forms in which an information resource might have some relevance to the subject in question.

Such occurrences are generally external to the topic map document itself (although they may also be inside it), and they are “pointed at” using whatever mechanisms the system supports, typically URIs (in XTM) or HyTime addressing (in HyTM). Today, most systems for creating hand-crafted indexes (as opposed to full text indexes) use some form of embedded markup in the document to be indexed. One of the advantages to using topic maps is that the documents themselves do not have to be touched.

Figure 3 - Occurrences

An important point to note here is the separation into two layers of the topics and their occurrences. This separation is one of the clues to the power of topic maps and we shall return to it later.

OCCURRENCE ROLES

Occurrences, as we have already seen, may be of any number of different types (we gave the examples of “monograph”, “article”, “illustration”, “mention” and “commentary” above). Such distinctions are supported in the standard by the concepts of occurrence role and occurrence role type.

Figure 4 - Occurrence Roles

The distinction between an occurrence role and its type is subtle but important (at least in HyTM). In general terms they are both “about” the same thing, namely the way in which the occurrence contributes information to the subject in question (e.g. through being a portrait, an example or a definition). However, the role (indicated syntactically in HyTM by the role attribute) is simply a mnemonic; the type (indicated syntactically by the type attribute), on the other hand, is a reference to a topic which further characterizes the nature of the occurrence's relevance to its subject. In general it makes sense to specify the type of the occurrence role, since then the power of topic maps can be used to convey more information about the relevance of the occurrence.

Note: The concept of occurrence role is not present in XTM, since it was regarded as being an artifact of the use of HyTime in the original standard. The concept of “occurrence role type” was retained, but the term itself was changed to “occurrence type” in order to reduce the possibility of confusion with association roles (see below).

ASSOCIATIONS

Up to now, all the constructs that have been discussed have had to do with topics as the basic organizing principle for information. The concepts of “topic”, “topic type”, “name”, “occurrence” and “occurrence role” allow us to organize our information resources according to topic (or subject), and to create simple indexes, but not much more.[5]

The really interesting thing, however, is to be able to describe relationships between topics, and for this the topic map standard provides a construct called the topic association.

ASSOCIATION

A topic association asserts a relationship between two or more topics. Examples might be as follows:

• “Tosca was written by Puccini”

• “Tosca takes place in Rome”

• “Puccini was born in Lucca”

• “Lucca is in Italy”

• “Puccini was influenced by Verdi”

Figure 5 - Topic Associations

ASSOCIATION TYPES

Just as topics and occurrences can be grouped according to type (e.g., composer/opera/country and mention/article/commentary, respectively), so too can associations between topics be grouped according to their type. The association types for the relationships mentioned above are written_by, takes_place_in, born_in, is_in (or geographical containment), and influenced_by. As with most other constructs in the topic map standard, association types are themselves defined in terms of topics.

The ability to type topic associations greatly increases the expressive power of the topic map, making it possible to group together the set of topics that have the same relationship to any given topic. This is of great importance in providing intuitive and user-friendly interfaces for navigating large pools of information.

It should be noted that topic types are regarded as a special (i.e. syntactically privileged) kind of association type; the semantics of a topic having a type (for example, of Tosca being an opera) could equally well be expressed through an association (of type “type-instance”) between the topic “opera” and the topic “Tosca”. The reason for having a special construct for this kind of association is the same as the reason for having special constructs for certain kinds of names (indeed, for having a special construct for names at all): The semantics are so general and universal that it is useful to standardize them in order to maximize interoperability between systems that support topic maps.

Figure 6 - Association Types

It is also important to note that while both topic associations and normal cross references are hyperlinks, they are very different creatures: In a cross reference, the anchors (or end points) of the hyperlink occur within the information resources (although the link itself might be outside them); with topic associations, we are talking about links (between topics) that are completely independent of whatever information resources may or may not exist or be considered as occurrences of those topics.

Why is this important? Because it means that topic maps are information assets in their own right, irrespective of whether they are actually connected to any information resources or not. The knowledge that Rome is in Italy, that Tosca was written by Puccini and is set in Rome, etc. is useful and valuable, whether or not we have information resources that actually pertain to any of these topics.

Also, because of the separation between the information resources and the topic map, the same topic map can be overlaid on different pools of information, just as different topic maps can be overlaid on the same pool of information to provide different “views” to different users. Furthermore, this separation makes it possible to interchange topic maps among publishers and to merge two or more topic maps.[6]

Figure 7 - Topic maps as portable semantic networks

ASSOCIATION ROLES

Each topic that participates in an association plays a role in that association called the association role. In the case of the relationship “Puccini was born in Lucca”, expressed by the association between Puccini and Lucca, those roles might be “person” and “place”; for “Tosca was composed by Puccini” they might be “opera” and “composer”. It will come as no surprise now to learn that association roles can also be typed and that the type of an association role is also a topic!

Unlike relations in mathematics, associations are inherently multidirectional. In topic maps it doesn't make sense to say that A is related to B but that B isn't related to A: If A is related to B, then B must, by definition, be related to A. Given this fact, the notion of association roles assumes even greater importance. It is not enough to know that Puccini and Verdi participate in an “influenced-by” association; we need to know who was influenced by whom, i.e. who played the role of “influencer” and who played the role of “influencee”.

This is another way of warning against believing that the names assigned to association types (such as “was influenced by”) imply any kind of directionality. They do not! This particular association type could equally well (under the appropriate circumstances) be characterized by the name “influenced” (as in “Verdi influenced Puccini”).
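The point about roles can be illustrated with a minimal Python sketch. The `Association` class and the role names used here are hypothetical illustrations and do not correspond to any construct of the standard or of a particular topic map engine. The sketch only shows that an association carries no direction of its own and that the roles alone tell us who influenced whom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A relationship without direction: a type plus (role, topic) pairs."""
    assoc_type: str
    members: frozenset  # frozenset of (role, topic) tuples

    def players(self, role):
        """Return the topics that play the given role, sorted for stable output."""
        return sorted(topic for r, topic in self.members if r == role)


influence = Association(
    assoc_type="influenced-by",
    members=frozenset({("influencer", "Verdi"), ("influencee", "Puccini")}),
)

# Listing the members in the opposite order yields an equal association:
# nothing in the structure itself encodes a direction.
same = Association(
    assoc_type="influenced-by",
    members=frozenset({("influencee", "Puccini"), ("influencer", "Verdi")}),
)

print(influence == same)                # True
print(influence.players("influencer"))  # ['Verdi']
print(influence.players("influencee"))  # ['Puccini']
```

Renaming the association type to “influenced” would change nothing in this structure, which is why the name of the type cannot be relied upon to convey directionality.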

MERGING

Topic maps can be merged. Merging can take place at the discretion of the user or application (at runtime), or may be indicated by the topic map's author at the time of its creation.

The term merging covers two distinct processes:

1. The process of merging two topic maps, either as a result of explicit <mergeMap> directives, or for any application-specific reasons.

2. The process of merging two topics.

The rules governing all forms of merging and the determination of subject identity are given in full in Annex F: XTM Processing Requirements of the XTM 1.0 specification. They can be briefly (and incompletely) stated as follows:

1. When two topic maps are merged, any topics that the application, by whatever means, determines to have the same subject are merged, and any duplicate associations are removed.

2. When two topics are merged, the result is a single topic whose characteristics are the union of the characteristics of the original topics, with duplicates removed.

Two topics are always deemed to have the same subject if:

1. they have one or more subject indicators in common,

2. they reify the same addressable subject, or

3. they have the same base name in the same scope.
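The merging rules summarised above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the normative XTM processing model: the `Topic` class, its field names and the `http://example.org/psi/italy` subject indicator are all hypothetical, and associations are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topic:
    subject_indicators: set = field(default_factory=set)
    subject_address: Optional[str] = None           # only for addressable subjects
    base_names: set = field(default_factory=set)    # (name, scope) pairs
    occurrences: set = field(default_factory=set)   # resource addresses


def same_subject(a, b):
    """Apply the three conditions for subject equality listed above."""
    if a.subject_indicators & b.subject_indicators:
        return True
    if a.subject_address is not None and a.subject_address == b.subject_address:
        return True
    # A shared (name, scope) pair means the same base name in the same scope.
    return bool(a.base_names & b.base_names)


def merge(a, b):
    """Union of the characteristics; sets drop duplicates automatically."""
    return Topic(
        subject_indicators=a.subject_indicators | b.subject_indicators,
        subject_address=a.subject_address or b.subject_address,
        base_names=a.base_names | b.base_names,
        occurrences=a.occurrences | b.occurrences,
    )


ITALY_PSI = "http://example.org/psi/italy"  # hypothetical subject indicator
norwegian = Topic({ITALY_PSI}, base_names={("Italia", frozenset({"norwegian"}))})
french = Topic({ITALY_PSI}, base_names={("l'Italie", frozenset({"french"}))})

if same_subject(norwegian, french):
    merged = merge(norwegian, french)
    print(sorted(name for name, _ in merged.base_names))  # ['Italia', "l'Italie"]
```

Representing every characteristic as a set keeps the “union with duplicates removed” rule trivial to express; a real engine would also have to rewrite the associations in which the merged topics participate.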

ADDITIONAL CONCEPTS

The additional concepts of topic maps are identity, facets and scope, which are usually referred to collectively as IFS.

SUBJECT IDENTITY AND PUBLISHED SUBJECTS

The goal with topic maps is to achieve a one-to-one relationship between topics and the subjects that they represent, in order to ensure that all knowledge about a particular subject can be accessed via a single topic. However, sometimes the same subject is represented by more than one topic, especially when two topic maps are being merged. In such a situation it is necessary to have some way of establishing the identity between seemingly disparate topics. For example, if reference works publishers from Norway, France and Germany were to merge their topic maps, there would be a need to be able to assert that the topics “Italia”, “l'Italie” and “Italien” all refer to the same subject.

The concept that enables this is that of subject identity. When the subject is an addressable information resource (an “addressable subject”), its identity may be established directly through its address. However most subjects, such as Puccini, Italy, or the concept of opera, are not directly addressable. This problem is solved through the use of subject indicators (originally called “subject descriptors” in ISO 13250). A subject indicator is “a resource that is intended ... to provide a positive, unambiguous indication of the identity of a subject.” Because it is a resource, a subject indicator has an address (usually a URI) that can be used as a “subject identifier”.

Any two topics that share one or more subject indicators (or that have the same subject address, in the case of addressable subjects) are considered to be semantically equivalent to a single topic that has the union of the characteristics (the names, occurrences and associations) of both topics. In a processed topic map a single topic node results from combining the characteristics of the two topics.[7]

A subject indicator could be an official, publicly available document (for example, the ISO standard that defines 2- and 3-letter country codes), or it could simply be a definitional description within (or outside) one of the topic maps. A published subject indicator (PSI; originally called “public subject descriptor”) is a subject indicator that is published and maintained at an advertised address for the purpose of facilitating knowledge interchange and mergeability, either through topic maps or by other means.

Published subjects are a necessary precondition for the widespread use of portable topic maps, since there is no point in offering a topic map to others if it is not guaranteed to “match up” with relevant occurrences in the receiver's pool of information resources. Activities are therefore underway, under the aegis of OASIS and others, to develop recommendations for the documentation and use of published subjects.[8]

FACETS

Sometimes it is convenient to be able to assign metadata to the information resources that constitute the occurrences of a topic from within the topic map. To provide this capability, the standard includes the concept of the facet.

Facets basically provide a mechanism for assigning property-value pairs to information resources. A facet is simply a property; its values are called facet values. Facets are typically used for supplying the kind of metadata that might otherwise have been provided by SGML or XML attributes, or by a document management system. This could include properties such as “language”, “security”, “applicability”, “user level”, “online/offline”, etc.

Once such properties have been assigned, they can be used to create query filters producing restricted subsets of resources, for example those whose language is “Italian” and user level is “secondary school student”.
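A query filter of this kind can be sketched as follows in Python. The resource addresses and facet values are hypothetical, and the dictionary layout is only one possible way of holding facet assignments; it is not taken from the standard.

```python
# Facet values assigned to information resources. The addresses and values
# are hypothetical examples.
facets_by_resource = {
    "http://example.org/tosca-synopsis": {
        "language": "Italian",
        "user level": "secondary school student",
    },
    "http://example.org/tosca-libretto": {
        "language": "Italian",
        "user level": "musicologist",
    },
    "http://example.org/tosca-review": {
        "language": "English",
        "user level": "secondary school student",
    },
}


def filter_by_facets(facets_by_resource, wanted):
    """Return the addresses whose facet values match every wanted pair."""
    return sorted(
        address
        for address, facets in facets_by_resource.items()
        if all(facets.get(facet) == value for facet, value in wanted.items())
    )


selection = filter_by_facets(
    facets_by_resource,
    {"language": "Italian", "user level": "secondary school student"},
)
print(selection)  # ['http://example.org/tosca-synopsis']
```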

It is important not to confuse facets with scope (about which more in the next section). Facets are generally speaking not used to qualify the objects in the “topic domain” part of the topic map (i.e. the topics, topic names and associations). Their purpose is simply to add attributes to information resources. In a sense, facets are orthogonal to the topic map model itself (except to the extent that both facet types and facet value types, like most other things in the topic map standard, are regarded as topics). Despite this, they provide a useful mechanism that complements and significantly extends the power of topic maps.

Note: Once the distinction between addressable and non-addressable subjects had been clarified in XTM, it became clear that information resources could also be subjects (and hence topics). This rendered the concept of facets superfluous since metadata properties can now be assigned to a resource as characteristics of the topic that represents that resource. As a consequence, facets are not part of XML Topic Maps.

Figure 8 - Applying facets for filtering

SCOPE

The topic map model allows three things to be said about any particular topic: What names it has, what associations it partakes in, and what its occurrences are. These three kinds of assertions are known collectively as topic characteristics.

Assignments of topic characteristics are always made within a specific context, which may or may not be explicit. For example, if I (yet again) mention “tosca”, I should expect my readers to think of the opera by Puccini (or its principal character), because of the context that has been set by the examples used so far in this paper. For an audience of bakers, however, the name “tosca” has quite other and sweeter connotations: it denotes another topic altogether.

Although we seldom notice it in everyday life, the problem of context is with us all the time. According to [Sowa 1984] a sentence is derived from six different kinds of information, four of which (tense and modality; presupposition; focus; and emotional connotations) are in one way or another related to context.

Humans are remarkably good at dealing with context. It is that ability that enables them to make sense of two such similar statements as “John Smith to marry Mary Jones” on the one hand, and “Retired priest to marry Bruce Springsteen” on the other, or to parse and interpret the two sentences “Time flies like an arrow” and “Fruit flies like an apple”.[9]

Computers, however, are not yet that smart. Given two such simple statements as “Tosca takes place in Rome” and “Tosca kills Scarpia”, most of today's computers would not be able to infer which of the topics named “Tosca” was involved. In order to avoid this kind of problem, topic maps consider any assignment of a characteristic to a topic, be it a name, an occurrence or a role, to be valid within certain limits, which may or may not be specified explicitly. The limit of validity of such an assignment is called its scope.

Scope is defined in terms of themes, and a theme is defined as “a member of the set of topics used to specify a scope”. In other words, a theme is a topic that is used to limit the validity of a set of assignments. Thus, the name “tosca” might be assigned to three different topics in scopes defined by the themes “opera”, “opera”+“character”, and “baking” respectively, thereby removing any ambiguity and reducing the chance of errors, for example when merging topic maps.
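The three scoped assignments of the name “tosca” can be written down as a small Python sketch. The topic labels are hypothetical identifiers, and the matching rule used here (an assignment applies when all themes of its scope are present in the current context) is an assumption made for illustration; how an application interprets scope is not prescribed by this sketch.

```python
# (name, topic, scope) assignments; each scope is a set of themes.
# The topic labels are hypothetical identifiers for the three subjects.
name_assignments = [
    ("tosca", "Tosca (opera)", frozenset({"opera"})),
    ("tosca", "Tosca (character)", frozenset({"opera", "character"})),
    ("tosca", "tosca (baking)", frozenset({"baking"})),
]


def topics_named(name, context):
    """Topics whose assignment of `name` is valid in the given context.

    Assumed rule: an assignment applies when every theme of its scope
    is present in the context.
    """
    return [
        topic
        for assigned, topic, scope in name_assignments
        if assigned == name and scope <= context
    ]


print(topics_named("tosca", {"baking"}))              # ['tosca (baking)']
print(topics_named("tosca", {"opera"}))               # ['Tosca (opera)']
print(topics_named("tosca", {"opera", "character"}))  # ['Tosca (opera)', 'Tosca (character)']
```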

In fact, the well-designed, consistent and imaginative use of scope in topic maps does much more than simply remove ambiguity. It can also aid navigation, for example by dynamically altering the view on a topic map based on the user profile and the way in which the map is used. For example, any user that declares a specific interest in opera (or a specific lack of interest in baking!) can have the various toscas ranked accordingly.

Similarly, anything that is known about the general background of the user might be regarded as presuppositions that can affect the behaviour of the map. For example, in a topic map devoted to presenting the tourist attractions of a country, scope might be used to qualify topics such that different views on the information were presented to prospective visitors and professional tour operators.

Scope can also be used to dynamically determine which name to use for a topic based on how the topic was arrived at. For example, the association between Tosca (the opera) and Rome might be labelled “takes place in” in the scope of the association role type “action” (the opera) and “setting for” in the scope of the association role type “location” (the city).

Figure 9 - Scoping topic names, occurrences and associations

As mentioned above, scope should not be confused with facets. The two mechanisms are different and complementary. Whereas scope can be seen as a filtering mechanism that is based on properties of the topics, facets provide for filtering based on properties of the information resources themselves.[10]

POWER OF THE TOPIC MAP MODEL

It is sometimes claimed that “everything in a topic map is a topic”. This is almost true, but not quite. Specifically, all types (topic types, association types, occurrence role types, facet types and facet value types) are defined as topics. In addition, scope is defined in terms of themes which are themselves topics.[11]

This design gives tremendous power to the model, allowing among other things for the topic map to be self-documenting. Since the ontology of a topic map (the kinds of things it consists of) is defined in terms of topics in the same map, the map can be used to describe its own ontology and provide more functionality and flexibility when used for navigation or querying.

It also turns out that because of the power of the topic map model, topic maps can also be used to define the control information used for much topic map processing. The committee that developed topic maps has already coined the term “topic map template” for the declarative part of a topic map (consisting mainly of typing topics), and this is itself a topic map. Current research also indicates that queries on topic maps, schemas for constraining classes of topic maps, and user profiles for interacting with topic maps all can be expressed as topic maps. Finally, interesting work is being done on the use of topic maps for even more esoteric purposes, from the standardized representation of other graph structures, such as social networks, to the management of multiple schema languages.

TOPIC MAP MODEL AMBIGUITIES

The basic assumption behind the Topic Map paradigm seems to be that there are things in the world which are concrete and instrumental or object-like in nature, serving as a complement to some abstract notion manifest in them, and which only exceptionally become themselves a notion to be thought or talked about, the subject of a statement. This conviction then justifies that occurrences are separated from topics. (From paper [1])

Most authors claim that the distinction between topics and occurrences is fundamental and indispensable to Topic Maps and thus to XTM. This distinction is the most prominent exception to the design principle "in topic maps, most things are topics" ([TMRDF]: "3 Comparable modeling power"). Pros, such as the conviction that only this distinction makes it possible to model ontological data properly (as cited in "Approach") or that it makes metadata more portable to other pools of instance data (see discussion in "Portability"), face a couple of cons, which are listed below. Most of them argue that occurrences lack features that topics have, and that these defects result only from the distinction between the two concepts.

occurrence characteristics. In the section "occurrence" we have seen an example of how to add features to an occurrence, such as a name or other occurrences. A TM mechanism called facets, devised to avoid the somewhat cumbersome reification of the occurrence, has been dropped in XTM. But even if it had not, the most natural way to achieve a further qualification of occurrences (besides their type) would probably be using occurrence characteristics, as one uses topic characteristics for topics.

associations between occurrences. Since associations only take members which are topics, there is no means to specify that any relationship holds between two occurrences except reification.

occurrence references to non-resources. The content model of the <occurrence> element provides two mechanisms for associating a resource with it: it is either referenced (since resources are machine-addressable by definition) or included inline. However, there is no way to reference a non-addressable subject as an occurrence. To indicate that a citation occurred during a certain speech (for which no audio file is available) or that the concept of "egoism" is manifest in an action, one would probably want to use a subject indicator just like with topics. However, there must not be any <subjectIndicatorRef> element inside an <occurrence> element. In this case not even reification helps, as it does in some of the other examples.

inline topic data. On the other hand, one might want to include inline resource data in a topic just like with occurrences. There is a way to reify resources by reference, after all, so why require this asymmetry here? To allow for inline resource data in topics, a <resourceData> element would be needed, just like with occurrences.

occurrence merging. One way to merge two topics is to create a third one which uses the other two as subject indicators in its <topicRef> elements. This works because the subject-based merge operation prescribes that any two topics sharing a subject must be merged. [XTM10] does not include the idea of merging occurrences. Still, it might be useful to specify that two occurrences with different addresses (e.g. different names for the same server or different aliases for the same file) are identical. This could be done using the <resourceRef> element, in analogy to the <topicRef> element used with topics.

COMPARISON WITH RDF

Topic maps and RDF have a number of similarities. They both attempt to alleviate the same general problem of infoglut by applying knowledge representation techniques to information management. They both define abstract models and interchange syntaxes based on XML and both have models that are simple and elegant at one level but extremely powerful at another: In topic maps, most things are topics (not just the topics themselves); in RDF, the value of a resource's property may itself be a resource which in turn has properties of its own.

The RDF and topic map models are compared below with respect to a number of concepts. (The comparison draws on papers [1], [2] and [5].)

DIFFERENT ROOTS AND PERSPECTIVES

Topic mapping has its roots in traditional finding aids such as back-of-book indexes, glossaries and thesauri. RDF has its roots in formal logic and mathematical graph theory. Topic mapping is knowledge representation applied to information management from the perspective of humans. RDF is knowledge representation applied to information management from the perspective of machines. This accounts for some of the critical differences between the two.

DIALECTICALLY OPPOSED POINTS OF VIEW

RDF is resource-centric, whereas topic maps are subject-centric. In RDF one starts with information resources and attaches metadata structures to them; in topic maps, the primary focus is the subjects that the information is "about". So in one sense RDF and topic maps have diametrically opposed points of view. (To some extent, this difference in focus parallels that between document languages, such as the Anglo-American Cataloguing Rules, AACR, and subject languages, such as the Library of Congress Subject Headings, LCSH, in the domain of library science.) However, "resource" in RDF and "subject" in topic maps can be regarded as synonyms, since information resources can (also) be "subjects" in topic maps and "resources" in RDF do not have to be addressable information resources, so the difference is dialectical rather than diametrical.

THINGS

The central notion in both RDF and topic maps is that there are things that we wish to make assertions about. Examples of such things may be the person 'Lars Marius Garshol', the company 'Ontopia', and this paper. In topic maps such things are represented by constructs called topics, in RDF by constructs called resources. At heart, these constructs are the same: digital symbols representing some well-defined thing.

In RDF each resource is represented by a single Uniform Resource Identifier (URI), which makes it clear exactly what the resource is. If the RDF model is making assertions about documents and files the URI will be that of the document or file, but many RDF models discuss abstract things, in which case the URI is just a symbolic identifier for the resource.

In topic maps, the things in the real world that topics represent are called their subjects. Topics may identify their subjects in several ways, for example by specifying the resource that is the subject of the topic by means of a URI. This corresponds exactly to an RDF resource representing a document or file. Topics can also have subject indicators, however, which are URIs referring to resources that indicate (to a human) what the subject of the topic is. This corresponds exactly to an RDF resource that represents an abstract concept.

Whether one used RDF or topic maps, one would generally assign the same URIs to the three example things. mailto:[email protected] would be a reasonable URI for the resource 'Lars Marius Garshol'. In a topic map this would be a subject indicator, and not a subject address. For Ontopia, http://www.ontopia.net would be the URI, and in a topic map it would be a subject indicator. For this paper http://www.ontopia.net/topicmaps/materials/tmrdfoildaml.html would be the URI, and in a topic map this would be a subject address.

In an RDF model there is no application-independent way of telling whether a resource is abstract or concrete. In topic maps, that can be seen by whether the topic has a subject address or a subject indicator.

RELATIONSHIPS

When we create models of the world which contain things it is because we wish to make assertions about these things, and the main kind of assertion we wish to make is what relationships these things have with each other. RDF and topic maps both provide mechanisms for doing this, but these mechanisms are quite different.

In RDF resources can be assigned properties through the use of statements. These are simple triples, consisting of the resource being assigned a property, the property type (represented by a URI), and the property value, which can be a literal or a URI. The use of URIs as property values allows statements to express the relationships between things.

For example, to say that I am employed by Ontopia we could use this simple RDF statement: (mailto:[email protected], employed-by, http://www.ontopia.net).

In topic maps the relationships between things can be expressed using associations. Associations are typed, like RDF statements, and the types are themselves topics. Any number of topics can play roles in an association, and their involvement in the association is defined by their association role type (which is also a topic). The relationship between me and Ontopia is perhaps best represented by an association of the type 'employed-by', where I play the role 'employee', and Ontopia the role 'employer'.

There are three major differences between how RDF and topic maps represent relationships:

• The most obvious is the difference in the structure of the representation. RDF relates one thing to another, while topic maps can relate any number of things, and make it clear what involvement each has in the relationship. It is possible to achieve something similar with RDF, but that requires extra work, both in conceptualization and in implementation.

• Another difference is that in topic maps relationships are inherently two-way. That is, you cannot say that I work for Ontopia without at the same time saying that Ontopia employs me. It is possible to traverse relationships backwards in RDF, and it is also possible to specify inverse properties, but this is not inherent in the way relationships are represented.

• A third, and much more subtle, difference is that there is no way of knowing when an RDF statement is asserting a relationship between two abstract things and when it is saying that the one thing is really a resource that has information about the other, which is an abstract thing. Some statements also assign attributes to things, but it is possible to tell these apart, as they will have literals as objects instead of URIs.

In short, in topic maps you know what is a relationship and what is not, all relationships are two-way, and it is easier to represent complex relationships.
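The structural difference can be made concrete with a small Python sketch. It uses plain tuples and dictionaries rather than any real RDF or topic map library, and the `PERSON` identifier is a hypothetical stand-in. The sketch contrasts a single subject-property-value statement with an association whose participants are listed with their roles.

```python
PERSON = "http://example.org/people/employee"  # hypothetical identifier
COMPANY = "http://www.ontopia.net"

# RDF: one statement relates exactly two things, subject first.
statements = [(PERSON, "employed-by", COMPANY)]

# Topic map: one association; each participant is listed with its role.
association = {
    "type": "employed-by",
    "roles": {"employee": PERSON, "employer": COMPANY},
}


def rdf_values(statements, subject, prop):
    """Follow a property forwards from a subject."""
    return [o for s, p, o in statements if s == subject and p == prop]


def other_players(association, topic):
    """Return the other participants and their roles, from either end."""
    return {
        role: player
        for role, player in association["roles"].items()
        if player != topic
    }


print(rdf_values(statements, PERSON, "employed-by"))   # forward lookup finds the company
print(rdf_values(statements, COMPANY, "employed-by"))  # empty: no inverse was stated
print(other_players(association, PERSON))              # the employer, with its role
print(other_players(association, COMPANY))             # the employee, with its role
```

Traversing the RDF statement backwards is of course possible by scanning for matching values, but it has to be written as a separate operation, whereas the association can be entered from any of its participants in the same way.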

ATTRIBUTES

Another common wish in information modelling is to capture the attributes of the things being modelled. Attributes are pieces of information that may be attached to things, but which are not sufficiently important to be considered things in their own right.

Some examples of attributes of the thing 'Lars Marius Garshol' are: my name, my home page, and my birth date. In RDF, these things are simply properties of the resource 'Lars Marius Garshol' and are encoded using three statements: (mailto:[email protected], name, "Lars Marius Garshol"), (mailto:[email protected], homepage, http://www.garshol.priv.no), and (mailto:[email protected], birthdate, "1973-12-25").

Topic maps, by contrast, have a concept of occurrences, which are pieces of information relevant to a topic. Occurrences can either be resources external to the topic map, which are then represented by the URI of the resource, or they can be strings internal to the topic map. Occurrences are typed, the types being topics.

The natural way to represent my home page would be to give me an occurrence of type "home page", and to set the URI to "http://www.garshol.priv.no". My birth date would become an internal occurrence, of type "birth date", where the value was a string representing the date. My name, however, would not be an occurrence. Topic maps have a concept of topic names (which are really privileged occurrences), and so my name would be represented as a name.

There are three classes of attributes that are worth discussing separately:

• Names can be represented in both topic maps and RDF, but only in topic maps is it possible for software with no knowledge of the schema to know which properties are names. The result is that in any interface topics can be represented by their names, something that requires schema knowledge in RDF. For generic applications this is very useful.

• Simple properties are very similar in topic maps and RDF. A string is attached to the thing, and another thing tells you what the relationship of the string to the thing is.

• Resources relevant to a thing are indistinguishable from relationships in RDF, both being represented by statements. In topic maps, the fact that the relationship is an occurrence relationship makes it clear that the resource contains more information about the thing. The occurrence type makes it clear what kind of information is found there.

Again topic maps are found to be higher-level than RDF and to contain more explicit semantics. This means both that it is easier to develop generic software for topic maps, and that conceptualization of topic map applications is easier, because some of the work has been done in the standard itself.

KINDS OF THINGS

One of the most important pieces of information one generally wishes to record about things is of what kinds they are. For example, I am a person, while Ontopia is a company. This is very important information, and so both topic maps and RDF provide standardized means of representing it. In RDF, there is a standardized property called rdf:type which is used to represent the instance-of relationship between the class and the instance.

In topic maps this information is part of the model: each topic has a set of classes of which it is considered an instance. The information can therefore be represented directly. Topic maps also have a standardized association type for the class-instance relationship, which means that it is possible to represent this relationship with an association. Since this relationship is so fundamental most topic map implementations represent it as a property of topics, however.

This is in fact the area where topic maps and RDF have most in common, and rdf:type is as good as identical to the topic map notion of class-instance.

CONTEXT

It is often useful to be able to attach information about the context of the relationships and attributes of things. This context information may state that "this characteristic is only valid in a certain context", or provide useful metadata about the characteristic. RDF has no notion of such contexts, although by introducing anonymous resources in property assignments context information can be attached to the assignments. This is somewhat awkward, however.

In topic maps, such contexts are known as scopes. Scopes consist of sets of topics which define the context of validity, and can be attached to names, occurrences, and associations. This feature can be quite useful. For example, what if one wishes to make a multilingual information system where information may be available in many languages? In RDF this can be handled by defining separate properties for, say, names and definitions of concepts in each language. This is awkward and obscures the commonalities of names and definitions. It also obscures the fact that the differences between the various properties are one of context, and makes it harder to extend the schema. In topic maps this can be handled by using language as one axis of scope. Names are already first-class constructs to which scope can be attached, and definitions can be made occurrence types.
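As a rough illustration of language as an axis of scope, the following Python sketch selects the name of a topic that best fits a user context. The `ScopedName` class, the `select_name` function and the example names are assumptions made for this sketch; real topic map engines apply more elaborate name selection rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedName:
    value: str
    scope: frozenset = frozenset()  # empty set means the unconstrained scope


def select_name(names, context):
    """Prefer the name sharing most themes with the context,
    falling back to a name in the unconstrained scope."""
    context = frozenset(context)
    matching = [n for n in names if n.scope & context]
    if matching:
        return max(matching, key=lambda n: len(n.scope & context))
    unconstrained = [n for n in names if not n.scope]
    return unconstrained[0] if unconstrained else None


# Hypothetical names for one topic, scoped by language themes.
names = [
    ScopedName("Composer"),
    ScopedName("Compositore", frozenset({"Italian"})),
    ScopedName("Komponist", frozenset({"Norwegian"})),
]

italian = select_name(names, {"Italian"})
fallback = select_name(names, {"German"})
```

All three names remain names of the same topic; only the scope differs, so adding a further language requires no change to the schema.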

The main differences between topic maps and RDF in this area are that context is much easier to work with in topic maps, and that generic software can know how contexts are represented in each application. In fact, the Ontopia Omnigator [Omnigator] is a generic topic map browser that can analyze the scopes used in a topic map and lets the user set a context for filtering the topic map as it is displayed.

[Pepper01] provides an in-depth discussion of scope in topic maps, and has much useful information on the applications of scope.

REIFICATION

RDF provides a process, called "reification", whereby an arc can be alternatively represented as a node when it is discovered that someone wants to say something about it. ("Reification" literally means "thing-ification" or "noun-ification", that is, transformation into a thing. The term is derived from the Latin noun "res" (pronounced like "race"), which means "thing".)

In RDF, reification involves changing the graph that results from processing interchangeable RDF statements. In Topic Maps, however, everything is already reified. No existing arcs need be changed when new information comes along. New arcs and nodes are added, and these additions are the only changes that are required. This comparative changelessness can be extremely important. If you find something in a graph, and you make a record of the arcs you traversed in order to find it, you may want to be able to use that same set of arcs to find the same thing at some future date. If some of those arcs disappear, you may not be able to retrace your steps. If, on the other hand, the process of reification does *not* cause the arcs whose functions have been duplicated to disappear, then we have a situation in which a considerable amount of redundant information is contributing to our infoglut problem. Either way, a policy of "late reification" (or maybe we should call it "lazy reification") causes problems for the usefulness of continuously-amalgamated knowledge.

ASSOCIATION OCCURRENCE

In [USINGTM], an example is given where it could be useful to associate an occurrence with an association. In the context of the Semantic Web, applications on the inference layer are supposed to process data from the ontology layer, i.e. data suitable to be stored in XTM. If such an inference application had deduced a relationship between two topics, represented by an association, then it would be convenient to associate the resources used in the inference as occurrences of this new association. This is impossible in XTM. Given an appropriate property to model the "occurs" relationship in RDF, the statement representing the newly found association could be reified and assigned any resource as an "occurrence". There are, however, some problems with RDF reification, as mentioned before.

PORTABILITY

In addition to being more appropriate for describing pure ontologies independently of any instance data, Topic Maps are sometimes claimed to be much easier to reuse for different pools of instance data. The idea is that a structured index, e.g. one defining various kinds of media, literary genres, characters and cultural eras in terms of topics and associations, can be used to describe any set of particular films, books, etc. Ontology corporations could use Topic Maps to develop sophisticated domain-specific ontologies which they would sell to people who need to describe and organize a given collection of data in this domain.

Examining the structure of a Topic Map, however, it is not quite clear why Topic Maps should be easier to reuse than, for example, RDF. Reuse is facilitated when instance data are well separated from more abstract data. In the case of Topic Maps, this means occurrences would have to be separated from topics and associations. Associations reference their member topics but do not include them as child elements, so it is easy to keep the two apart. <occurrence> elements, on the contrary, are embedded in <topic> elements, so to reuse a set of topics for some other occurrence data one probably needs to keep the topics together without any occurrences, and define a new set of topics to hold the occurrences each time a new collection of instance data is provided. The new topics are then merged with the reusable topic pool.

RDF does not separate instance data from abstract data, or occurrences from topics. The concept of occurrence would be modelled best as a property, i.e. a kind of relationship which holds between the topic and the occurrence. Properties are similar to associations and thus separate well from their members. Therefore reuse of ontological data for different collections of instance data should not be too difficult with RDF.

INTEROPERABILITY

Topic maps were designed from the start for ease of merging. The duality of "subject" and "topic", the concept of subject identity and the ability to establish a topic's identity through a subject address and/or multiple subject indicators are key to this capability. In particular, the notion of published subject indicators (PSIs) promotes interoperability across applications. RDF has none of this machinery. However, since PSIs are based on URIs, they are general enough to solve the interoperability problem for both topic maps and RDF, and make it easier to exploit the synergies between the two.
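The role of subject indicators in merging can be illustrated with a minimal Python sketch. It assumes a simplified rule, namely that two topics are merged whenever they share at least one subject indicator; the `Topic` class, the `merge_topics` function and the `psi.example.org` URIs are hypothetical and do not reproduce the full merging rules of the standard.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    names: set = field(default_factory=set)
    subject_indicators: set = field(default_factory=set)


def merge_topics(topics):
    """Merge topics that share at least one subject indicator."""
    merged = []
    for t in topics:
        # Every already-merged topic that shares an indicator with t.
        hits = [m for m in merged if m.subject_indicators & t.subject_indicators]
        combined = Topic(set(t.names), set(t.subject_indicators))
        for m in hits:
            combined.names |= m.names
            combined.subject_indicators |= m.subject_indicators
        # Drop the absorbed topics (compared by identity) and add the result.
        merged = [m for m in merged if all(m is not h for h in hits)]
        merged.append(combined)
    return merged


# Two topic maps describe the same subject under different names.
a = Topic({"Puccini"}, {"http://psi.example.org/puccini"})
b = Topic({"Giacomo Puccini"}, {"http://psi.example.org/puccini"})
c = Topic({"Verdi"}, {"http://psi.example.org/verdi"})

result = merge_topics([a, b, c])
```

Because the indicators are plain URIs, the same comparison could be applied to RDF resources, which is the point made above about PSIs serving both models.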

ABBREVIATED SYNTAX

The XML serialization syntax for RDF defines three kinds of syntax abbreviations. Some of them change the syntactic structure of a statement considerably, and it has been shown that the language defined by the abbreviated syntax is not the same as the one defined by the non-abbreviated syntax, i.e. the two syntaxes are not fully equivalent.

XTM tries to avoid a complicated syntax at the risk of being too verbose, which makes the implementation of processors easier. There is however one abbreviation syntax defined in [XTM10]: the <instanceOf> element is a shortcut for creating an <association> of the type class-instance. Compared to RDF, the impact of this abbreviated syntax is rather small.

NO USEFUL GENERAL MAPPINGS

The models of topic maps and RDF are sufficiently similar that it is possible to define generic mappings between the two in either direction. However, doing so does not yield useful results in terms of the target paradigm. An RDF triple can in theory be mapped to at least six different topic map constructs, but without knowledge of the semantics of the predicate, an optimal choice cannot be made. Likewise, topic characteristics can be mapped generically to RDF triples, but without an RDF schema for topic maps the higher-level semantics are lost; and even with such a schema, the results are totally inadequate from the point of view of RDF processing.

USEFUL SCHEMA LEVEL MAPPINGS

At the level of the schema, on the other hand, it is possible to describe two-way mappings that are extremely useful. Once the semantics of a particular RDF predicate are known, the choice of what kind of topic map construct to map it to becomes easy. Similarly, semantics that might otherwise be lost when mapping from topic maps to RDF can be expressed in an RDF schema. This suggests that the chances of unifying the two models in the short term are very slight. The immediate goal should rather be interoperability.
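A schema-level mapping of this kind can be pictured as a lookup table from predicates to topic map constructs. The Python sketch below is only an illustration under that assumption: the table `PREDICATE_MAPPING`, the predicate names and the function `map_statement` are hypothetical and do not correspond to any published mapping.

```python
# Hypothetical schema-level mapping: RDF predicate -> topic map construct.
PREDICATE_MAPPING = {
    "name": "name",
    "homepage": "occurrence",
    "birthdate": "occurrence",
    "employed-by": "association",
    "rdf:type": "instance-of",
}


def map_statement(subject, predicate, obj):
    """Map one RDF statement to a topic map construct, using schema knowledge."""
    construct = PREDICATE_MAPPING.get(predicate)
    if construct is None:
        # Without knowledge of the predicate no sensible choice can be made.
        raise KeyError(f"no schema-level mapping for predicate {predicate!r}")
    return {"topic": subject, "construct": construct, "type": predicate, "value": obj}


mapped = map_statement("lmg", "homepage", "http://www.garshol.priv.no")
```

A predicate that is missing from the table raises an error rather than being mapped generically, which mirrors the argument that generic mappings are not useful.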


PRODUCTS

The products of Ontopia are grouped in a complete set of tools for building, maintaining, and developing topic map-based applications: The Ontopia Knowledge Suite (OKS). An overview of OKS is shown in figure 10:

Figure 10 - OKS Overview (components shown: Navigator Framework, Omnigator, Web Editor Framework; Query Engine, Full-text Search, Schema Tools; Engine with Core Interfaces, Topic Map Utilities and Readers/Writers; Topic Map Storage: In-Memory, RDBMS, OODB, others)

The Ontopia Knowledge Suite (OKS) consists of two main tools:

1. The Ontopia Topic Map Engine
2. The Ontopia Navigator Framework

These tools are presented in sections 2.1 and 2.2.

Ontopia Topic Map Engine

The Ontopia Topic Map Engine is a topic map development SDK written in Java.

This engine:

• provides all the functionality needed for building topic map applications
• manages all the difficult aspects of topic map deployment on behalf of applications
• lets applications do everything they may want to do with a topic map, such as loading topic maps from XML documents, storing topic maps in databases, and modifying and accessing topic maps
• has a core topic map API which all applications use to access topic map data, independently of where those data are stored (i.e. memory, database, virtual view)

Features

The Ontopia Topic Map Engine provides a number of effective features, such as:

1. Full conformance with XTM 1.0
2. Ability to read and write topic map files using the XTM, HyTM and LTM syntaxes (LTM, the Linear Topic Map Notation, is a compact clear-text syntax for topic maps, defined by Ontopia and useful for rapid prototyping)
3. Enterprise-level robustness and scalability
4. Persistent and scalable storage of topic maps in relational databases using the RDBMS backend
5. Well-designed, consistent, intuitive and scalable API
6. Rich set of utilities for topic maps, like merging, association walking, name selection and scope filtering
7. Index API for quick and easy lookup of topic map information
8. Full internationalization, with support for handling of all character encodings
9. Automated test suite for quick and easy testing of the engine in new environments

Areas of application

The Ontopia Topic Map Engine can simplify development and improve results in many areas. The most significant of them are:

1. Web portals: Web portals, as big information repositories, are perfect use cases for the Ontopia Topic Map Engine. The topic map model itself can guide the creation of a well-organized portal. Also, with the use of the Ontopia Topic Map Navigator SDK, the development of the actual portal is greatly simplified.

2. Knowledge base intranets: Topic maps are well suited to capturing and managing the knowledge that constitutes corporate memory. The Ontopia Topic Map Engine can enrich an intranet with topic maps and turn it into a collaborative knowledge base.

3. Content management systems: An important feature of the Ontopia Topic Map Engine is that it allows topic mapping capabilities to be integrated into content management systems. This integration can take place by making the entire system topic map-based, or by creating a virtual topic map of the system.

4. Enterprise Application Integration (EAI): An essential advantage of topic maps is that they automate the merging of information from several sources. Using the Ontopia Topic Map Engine's support for merging, information from a large number of different sources can be integrated into a coherent whole. This capability can be used to provide a single, unified access layer to information stored in dissimilar and incompatible repositories and formats.

Architecture

The architecture of the Ontopia Topic Map Engine, when it is used in conjunction with the Ontopia RDBMS Backend Connector, is shown in Figure 11:

Figure 11 - Ontopia Topic Map Engine architecture

In this architecture:

• Topic map applications access the topic map data through the Core Interfaces, of which each storage backend provides a different implementation. This means that topic map applications (and therefore also topic map utilities) do not need to know how the data is stored.
• Prototypes can be developed quickly, using topic maps generated by scripts and stored in XML files.
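The idea of a storage-independent core interface can be sketched as follows. The example is written in Python for brevity, whereas the engine itself is a Java SDK; the names `TopicMapStore`, `InMemoryStore`, `SqliteStore` and `describe` are assumptions made for the sketch and are not Ontopia classes. SQLite stands in for an arbitrary relational backend.

```python
import sqlite3
from abc import ABC, abstractmethod


class TopicMapStore(ABC):
    """Hypothetical core interface: applications only ever see this."""

    @abstractmethod
    def add_topic(self, topic_id, name): ...

    @abstractmethod
    def get_name(self, topic_id): ...


class InMemoryStore(TopicMapStore):
    def __init__(self):
        self._names = {}

    def add_topic(self, topic_id, name):
        self._names[topic_id] = name

    def get_name(self, topic_id):
        return self._names.get(topic_id)


class SqliteStore(TopicMapStore):
    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS topic (id TEXT PRIMARY KEY, name TEXT)"
        )

    def add_topic(self, topic_id, name):
        self._db.execute("INSERT OR REPLACE INTO topic VALUES (?, ?)", (topic_id, name))
        self._db.commit()

    def get_name(self, topic_id):
        row = self._db.execute(
            "SELECT name FROM topic WHERE id = ?", (topic_id,)
        ).fetchone()
        return row[0] if row else None


def describe(store: TopicMapStore, topic_id):
    """Application code: unaware of how or where the data is stored."""
    return f"{topic_id}: {store.get_name(topic_id)}"
```

Switching a prototype from the in-memory store to the relational one changes only the line that constructs the store; `describe` stays the same.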

Ontopia Topic Map Engine complement tools

So far, OKS provides four complementary tools for the Ontopia Topic Map Engine:

1. The Ontopia RDBMS Backend Connector
2. The Ontopia Full-text Search Integration
3. The Ontopia Topic Map Query Engine
4. The Ontopia Schema Tools

These tools are presented in sections 2.1.4.1, 2.1.4.2, 2.1.4.3 and 2.1.4.4.

The Ontopia RDBMS Backend Connector

The Ontopia RDBMS Backend Connector is an add-on to the Ontopia Topic Map Engine that:

• Enables the storage of topic maps in relational databases
• Provides access to, and modification of, topic maps stored in such databases
• Supports most RDBMS servers (Oracle, PostgreSQL etc.)

With the use of the Ontopia RDBMS Backend Connector:

• Topic map applications can scale to manipulate huge topic maps
• Topic map applications gain significant benefits, like transactions and management of collaboration between different processes
• Prototypes can be brought up to production quality by switching to the RDBMS backend and upgrading the generation scripts. The topic map applications do not need to change, because they are independent of where the topic maps are stored.

The Ontopia Full-text Search Integration

The Ontopia Full-text Search Integration is a framework for indexing and searching for text in topic maps. It provides a full-text search engine, integrated with the Ontopia Topic Map Engine. The existing integration can easily be customized or extended by integrating other engines.

This framework:

• Is a quick and intuitive entry point to the topic map, which allows jumping directly to the topic we are looking for
• Can be used to integrate full-text search engines with topic maps
• Is highly flexible:
o Allows any search engine to be integrated
o Allows applications to customize the integration in any way they want
o Provides simple defaults for those who have simpler needs

Full-text search can be very helpful for users new to topic maps who want to find something specific in a topic map, because topics can be searched for by their names and by the contents of their occurrences. For example, a user can search for "Philadelphia", and immediately be told that this matches the names of a cheese, a city, a movie, and several songs and musical artists. Thus, the user immediately knows what he/she has found, and why, which is not possible when doing full-text search of documents.
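The "Philadelphia" example can be mimicked with a small Python sketch. It uses a plain substring match rather than a real search engine such as Lucene, and the topic data and the `search` function are invented for illustration only.

```python
# Hypothetical topics; "type" is the topic type, names and occurrences are text.
topics = [
    {"id": "t1", "type": "city", "names": ["Philadelphia"], "occurrences": []},
    {"id": "t2", "type": "cheese", "names": ["Philadelphia"], "occurrences": []},
    {"id": "t3", "type": "movie", "names": ["Philadelphia"], "occurrences": []},
    {"id": "t4", "type": "song", "names": ["Streets"],
     "occurrences": ["Lyrics that mention Philadelphia"]},
]


def search(topics, query):
    """Return (topic id, topic type, where the match was found) for each hit."""
    q = query.lower()
    hits = []
    for t in topics:
        if any(q in n.lower() for n in t["names"]):
            hits.append((t["id"], t["type"], "name"))
        elif any(q in o.lower() for o in t["occurrences"]):
            hits.append((t["id"], t["type"], "occurrence"))
    return hits


results = search(topics, "philadelphia")
```

Each hit carries the topic type and the place where the match occurred, which is what lets the user see what was found and why.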

The main features of Ontopia Full-text Search Integration framework are:

1. Bundled search engine: Lucene, a Java-based open source search engine, is used here and comes bundled with the integration. The distribution also includes a plug-in for the Omnigator, which provides a user-friendly interface for searching and result management.

2. Flexibility: The Ontopia Full-text Search Integration is designed around a set of general interfaces, so it is convenient to integrate this framework into any application. It is also easy to integrate new search engines, and to control the way the full-text search works in a particular application.

3. Automated indexing: The Ontopia Full-text Search Integration includes utilities for automatic indexing of topic maps, using any search engine. These utilities can also index resources external to the topic map, either local or remote.

4. Fully internationalized: The Ontopia Full-text Search Integration is fully internationalized. Searching in topic maps is not affected by the character encoding they use or the language they are written in (for example English, Norwegian, Russian, Japanese or Arabic).

5. Test suite: A test suite is provided for checking that the product works correctly in a given environment.

The Ontopia Topic Map Query Engine

The Ontopia Topic Map Query Engine provides a query language for topic maps ('tolog'), which is developed by Ontopia. This query engine:

• Makes it easy to develop search interfaces in topic map applications
• Simplifies application development, since application developers can use the query language to simplify the task of retrieving information from topic maps
• Builds on the Ontopia Topic Map Engine and is integrated with the Ontopia Navigator Framework, so that web applications can use the query language to find the information they wish to display

The ‘tolog’ query language used now will be replaced by the standard Topic Map Query Language, TMQL, currently in development by ISO.

The Ontopia Schema Tools

The Ontopia Schema Tools provide a schema language (Ontopia Schema Language, OSL) for topic maps. This schema language:

• Fulfills the same function for topic maps as XML Schemas and DTDs do for XML documents
• Allows the user to describe and constrain the structure of the topic maps used by a particular application. For example, a possible constraint could be that "every person must have a name".

Having a schema that defines the permitted structure of the topic maps used by an application gives two important advantages:

1. Applications can verify that the topic maps actually follow the structural rules.

2. Applications can give more guidance to users than they otherwise could.

The Ontopia Schema Tools module also includes a toolkit for importing and exporting schemas, accessing and modifying schemas, and validating topic maps, topics, and associations against a schema. The last of these makes it convenient to add schema capabilities to any application.
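The constraint "every person must have a name" can be illustrated with a minimal validation sketch in Python. It does not use OSL syntax or the Ontopia toolkit; the dictionary-based schema, the topic data and the `validate` function are assumptions made purely for illustration.

```python
def validate(topics, min_names_by_type):
    """Return one error message per topic that violates a name constraint."""
    errors = []
    for t in topics:
        required = min_names_by_type.get(t["type"], 0)
        if len(t["names"]) < required:
            errors.append(
                f"{t['id']}: type {t['type']!r} requires at least {required} name(s)"
            )
    return errors


# Hypothetical schema: every topic of type "person" needs at least one name.
schema = {"person": 1}

topics = [
    {"id": "p1", "type": "person", "names": ["Puccini"]},
    {"id": "p2", "type": "person", "names": []},
    {"id": "o1", "type": "opera", "names": []},
]

errors = validate(topics, schema)
```

An application could show these messages to the user while editing, which corresponds to the second advantage listed above.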

The OSL schema language used now will be replaced by the standard Topic Map Constraint Language, TMCL, currently in development by ISO.

Ontopia Navigator Framework

The Ontopia Navigator Framework is a framework for the development of Java 2 Platform, Enterprise Edition (J2EE) compliant web applications using topic maps. This framework:

• Is based on J2EE, using the Java Servlets and JavaServer Pages (JSP) technologies
• Is built on the Ontopia Topic Map Engine
• Consists of:
o JSP tag libraries
o A Java API

The Ontopia Navigator Framework comprises an XML-based scripting language which is optimised for topic map application development. This language:

• Can be used by developers to:
o Collect information from the topic map
o Perform complicated functions, like name selection and scope filtering
o Output the results in the format they want (HTML, XML etc.)
• Can be combined with Java code, if desired

Any application built with the Ontopia Navigator Framework can be deployed in any J2EE container. The development platform used here is the Apache Foundation's Tomcat server.

Features

The Ontopia Navigator Framework provides a number of important features:

• Well-designed tag libraries and utilities for convenient web application development
• Simplification of complicated features like reification and scope filtering, through the tag libraries
• Optimizations, like name caching and object pooling, that give web sites high performance and scalability
• Model-view-skin architecture, providing flexible applications and more convenient cooperation between designers and developers
• Plug-in concept for easy development, installation and configuration of extension applications
• Full internationalization: topic maps are supported equally well in whatever language they are written (for example English, Norwegian, Russian, Japanese or Arabic)
• Bundled plug-ins:
o Statistics printer: gives a statistical overview of any topic map
o Merge plug-in: merges any number of topic maps
o Export plug-in: exports topic maps to any syntax, even when these maps are virtually merged or stored in an RDBMS
o Validate plug-in: validates topic maps against a schema
o Query plug-in: carries out queries on the topic map using the 'tolog' topic map query language

Ontopia Navigator Framework complement tools

So far, OKS provides two complementary tools for the Ontopia Navigator Framework:

1. The Ontopia Topic Map Web Editor Framework
2. The Omnigator

These tools are presented in sections 2.2.2.1 and 2.3.

The Ontopia Topic Map Web Editor Framework

The Ontopia Topic Map Web Editor Framework has not yet been released. This tool simplifies the development of web applications for collaborative topic map editing. It extends the Ontopia Navigator Framework with support for actions that change the content of the topic map, making it convenient to connect user interface elements to topic map modification actions.

The Omnigator

The most significant client-server application of Ontopia, which comes with the Ontopia Navigator Framework, is the Omnivorous Topic Map Navigator, or Omnigator. This application:

• Is built on top of the Ontopia Navigator Framework
• Can display any topic map loaded into it, without any programming or configuration at all
• Is designed for easy understanding of topic map concepts
• Is useful for topic map debugging and building demo applications

The Omnigator is available for download and for on-line testing.

How does it work?

The Omnigator uses a simple client-server architecture based on the standard HTTP protocol:

• Server side: A J2EE web application, built using the Ontopia Topic Map Engine and the Ontopia Navigator Framework, which runs in the Tomcat web server. This application reads and writes topic maps and generates HTML pages.
• Client side: A standard web browser, which receives the generated HTML pages and displays a view of some part of the topic map. The view is built from the data structures that represent the topic map and is rich in links. Every time a link is selected, a request is sent to the server side, resulting in a new set of information being extracted from the topic map.
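The server-side step, generating a link-rich HTML view from the data structures of the topic map, can be sketched in Python as follows. The Omnigator itself is a J2EE application; the in-memory `TOPIC_MAP` dictionary, the `/topic?id=` URL scheme and the `render_topic_page` function are hypothetical and serve only to illustrate the request and response cycle.

```python
from html import escape
from urllib.parse import quote

# Hypothetical in-memory topic map: topic id -> name and related topic ids.
TOPIC_MAP = {
    "composer": {"name": "Composer", "related": ["musician", "puccini"]},
    "musician": {"name": "Musician", "related": ["composer"]},
    "puccini": {"name": "Puccini", "related": ["composer"]},
}


def render_topic_page(topic_id):
    """Build the HTML for one topic; every related topic becomes a link."""
    topic = TOPIC_MAP[topic_id]
    links = "".join(
        f'<li><a href="/topic?id={quote(rid)}">{escape(TOPIC_MAP[rid]["name"])}</a></li>'
        for rid in topic["related"]
    )
    return f"<h1>{escape(topic['name'])}</h1><ul>{links}</ul>"


page = render_topic_page("composer")
```

Following one of the generated links would send a new request with another topic id, and the server would answer with the page for that topic.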

Browsing the Omnigator

We now walk through a simple browsing session in the Omnigator.

First Page: The Welcome Page

The page presented to the user when he/she starts the Omnigator is the Welcome page, which is shown in Figure 12:

Figure 12 - Omnigator’s Welcome Page

This page lists the available topic maps in XTM, HyTM and LTM format. There are also a number of links providing shortcuts to topic map information and examples.

Selection of a specific topic map: The Topic Map Page

Now, let’s browse a specific topic map from the list, for example the opera.xtm topic map. Selecting opera.xtm in the topic map list leads to a new page, called the Topic Map Page. This page is shown in Figure 13:

Figure 13 - The Topic Map Page of opera.xtm (Ontology view)

The Topic Map Page provides various overviews of the topic map as a whole. These overviews can be selected from the Topic Map Overview list, which consists of:

1. The Ontology view: This view shows the existing “classes” in the topic map. It provides lists of topic types (labelled "Subject Indexes"), association types (labelled “Relationship Indexes”), association role types (labelled "Role Indexes") and occurrence types (labelled "Resource Indexes"). For topic maps that use facets, which are available only in the HyTM syntax, there is also a list of facet types (labelled "Metadata Indexes"). All the items in these lists are themselves topics, and selecting one of them makes it the current topic.

2. The Master Index view: This view counts and lists all the topics in the topic map, in alphabetical order, under the heading "Complete subject index".

3. The Themes view: This view lists all topics that are used as themes (i.e. the topics that are used to define scope) in the topic map. These themes are grouped according to the kinds of topic characteristics (i.e. names, variant names, occurrences, associations) they are used to scope.

The Topic Page of a topic type

Now, let’s select the topic composer in the Ontology view of the opera.xtm topic map, shown in Figure 13. A new page then appears, called the Topic Page, with composer as the current topic. This page is shown in Figure 14:

Figure 14 - The Topic Page of ‘composer’ topic

The Topic Page shows the information held in the topic map about the current topic. The kind of this information depends on the nature of this topic. The main title is always the most appropriate name of the topic, based on the type or types of the topic and the current context. All the base and variant names of the topic are shown, together with the themes that defined their scope. For example, in the topic composer shown in Figure 14, there are three names: one in the unconstrained scope, one in the scope 'Italian', and one in the scope 'Norwegian'.

The topics with which the topic composer is associated are placed under "Related subjects". In Figure 14, the topic composer has only a 'subclass of' association to its superclass, the topic musician.

Generally, if the topic is a topic type, the Topic Page contains a list of the topics of this type, with the heading “Topics of this type”. If the topic is also an association role type (i.e. a topic that defines a class of roles played in associations), then there is also a list of the topics that play this association role, with the heading “Players of this role”.

Topic composer is such a topic. It defines the topic type ‘composer’ and one of the role types in 'composed by' associations. But, in this topic, these two lists are identical, so Omnigator displays only the first list.

In addition, the Topic Page contains a list with the heading “Subject indicators”, which shows any subject indicators given to the topic.

The Topic Page of an individual topic

Now, let’s select the topic Puccini from the topics of type composer in Figure 14. Topic Puccini, which is an individual, is a different kind of topic than composer, so this selection leads us to a new page, shown in Figure 15:

Figure 15 - The Topic Page of ‘Puccini’ topic

In the Topic Page of an individual topic the kind of information presented is different. The main title, the list of names and the "Related subjects" section listing the topic's associations are still present. There are also three new sections:

1. The Type(s) section: This section lists the classes (i.e. the topic types) of which the individual topic is an instance. In the topic Puccini case, this section contains the topic type composer.

2. The Metadata section: This section contains occurrences of the topic that belong to the class metadata occurrence (defined by Ontopia as a published subject). In the topic Puccini case, the metadata are Puccini's dates of birth and death.

3. The External Resources section: This section contains all the occurrences of the topic that are external to the topic map. These occurrences are organized by type (i.e. the name of the type being displayed in bold as a heading) and displayed by their address.

The Topic Page of an occurrence type

Now, let's select a topic which is an occurrence type, e.g. by selecting the topic La Boheme in Figure 15 and then clicking on the AZ Opera synopsis link in the “External Resources” section. The AZ Opera synopsis occurrence type then becomes the current topic and a new page is displayed, shown in Figure 16:

Figure 16 - The Topic Page of the ‘AZ opera synopsis’ occurrence type

In this Topic Page, there is a list of topics that have occurrences of this type and a section named “Occurrence instances” with the addresses of all those occurrences.

The Topic Page of an association type

Now, let's select a topic which is an association type, e.g. by selecting the topic Tosca in Figure 16 and then clicking on the death of character link in the “Related Subjects” section. The death of character association type then becomes the current topic and a new page is displayed, shown in Figure 17:

Figure 17 - The Topic Page of the ‘death of character’ association type

In this Topic Page, besides the names and the subject indicators of the death of character topic, there are all the topics that play roles in associations of this type, grouped according to the role they play.

Advanced Omnigator Topics

The Omnigator provides a number of advanced features for quick and convenient administration of topic maps. These features are presented in this section.

The Manage Page

The Manage Page, shown in Figure 18, is used to control various aspects of the Omnigator's configuration and to load, reload and drop topic maps. This page requires the user to be authorized.

Figure 18 - The Manage Page

On this page, under the heading "Registry Items", there is a complete list of all the topic maps known to the system, based on the paths, extensions and other information provided in the file tm-sources.xml. Some of these topic maps are loaded automatically. Others can be loaded manually by clicking on the Load button. Once a topic map is loaded, the name of its document is shown in a larger font, and the Load button changes to Drop and Reload buttons. Selecting the name of a loaded topic map takes the user straight to its Topic Map Page.

If the topic map document is not well-formed or contains errors, it will not be loaded. In such cases, the user must fix the errors and then try again.

The Refresh Sources button is used to refresh the list of topic maps available to the system without restarting the Omnigator.

The Plug-ins Page

The Plug-ins section, which is part of the Manage Page, consists of modules that can be added to Navigator applications to provide additional functionality. Some of those shipped with the Omnigator are described in the section on plug-ins below. Selecting the Plug-ins link at the top of the Manage Page brings up the Plug-ins Page. This page lists the names and descriptions of all the currently installed plug-ins and allows the user to switch them on or off according to his/her needs. The user can also control which plug-ins are to appear on which pages in the Omnigator.

The Full-text Indexing Plug-in

In the Manage Page, next to the Plug-ins section link, there is a link to the Full-text Indexing page. This page, shown in Figure 19, allows the creation of full-text indexes for the user’s topic maps, enabling the search operation.

Figure 19 - The Full-text Indexing Plug-in

To create an index for a topic map that does not yet have one (e.g. the country.xtm topic map in Figure 19), the user only has to press the Create index button, and the Omnigator will create a full-text index for that topic map. The Omnigator does not index the external occurrences referenced from the topic map being indexed.

If an already indexed topic map changes, its index will be out of date. The Reindex button solves this problem: with it, the user can update the index of any topic map whenever that map changes.
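The indexing behaviour described above can be illustrated with a minimal Python sketch. It is not the Omnigator's implementation: the dictionary-based topic map, the `external` flag and the function names `build_index` and `search` are assumptions made only for this example. The sketch indexes names and internal occurrences, skips external occurrences, and shows why an index must be rebuilt after the topic map changes.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case a string and split it into alphanumeric word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(topic_map):
    """Build an inverted index: token -> set of topic ids.

    Names and internal occurrence values are indexed; external
    occurrences (hypothetical 'external' flag) are skipped.
    """
    index = defaultdict(set)
    for topic_id, topic in topic_map.items():
        texts = list(topic["names"])
        texts += [o["value"] for o in topic["occurrences"] if not o["external"]]
        for text in texts:
            for token in tokenize(text):
                index[token].add(topic_id)
    return index

def search(index, query):
    """Return the ids of topics whose indexed text contains every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical sample data, loosely modelled on an opera topic map.
topic_map = {
    "tosca": {"names": ["Tosca"], "occurrences": [
        {"value": "Opera in three acts", "external": False},
        {"value": "http://example.org/libretto", "external": True},
    ]},
    "puccini": {"names": ["Giacomo Puccini"], "occurrences": [
        {"value": "Composer of Tosca", "external": False},
    ]},
}

index = build_index(topic_map)
print(sorted(search(index, "tosca")))

# A change to the map is invisible to the old index until it is rebuilt,
# which corresponds to pressing the Reindex button.
topic_map["scarpia"] = {"names": ["Scarpia"], "occurrences": []}
stale = search(index, "scarpia")
index = build_index(topic_map)
fresh = search(index, "scarpia")
```

In this sketch `stale` is empty while `fresh` contains the new topic, which mirrors the out-of-date index problem that reindexing addresses.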

The Customise Page

Clicking on the Customise button from anywhere in the Omnigator brings up a new page, named the Customise Page. This page is shown in Figure 20:

Figure 20 - The Customise Page

This page allows the user to select the options he/she wishes for each of the Omnigator’s three built-in levels:

1. The Model level: Models control the set of information that is extracted from the topic map and placed in each page. The Omnigator is shipped with three models, which are the “Compact” model (i.e. the default model), the “Basic” model and the “Reification” model.

2. The View level: Views control the visual structure or layout of the HTML pages appearing in the client browser. Only the “No frames” view is shipped with the Omnigator.

3. The Skin level: Skins are Cascading Style Sheets (CSS) that control the styling of a page (i.e. improving its detailed appearance in browsers that support CSS).

The Plug-ins

This section describes some of the plug-ins that are shipped with the Omnigator.

The Statistics plug-in

The Statistics plug-in, shown in Figure 21, is a report generator for topic maps. It provides an overview of the map's "vital statistics" and a detailed decomposition of some of its structures. This information can often be used to uncover inconsistencies or other problems with the topic map.

Figure 21 - The Statistics plug-in
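As a rough illustration of the kind of report such a plug-in can produce, the following Python sketch counts the basic constructs of a topic map and flags two possible inconsistencies. The data model (lists of dictionaries), the function name `topic_map_statistics` and the chosen checks are assumptions for this example only; the actual report of the Statistics plug-in is the one shown in Figure 21.

```python
from collections import Counter

def topic_map_statistics(topics, associations):
    """Return simple counts plus two consistency checks for a topic map."""
    known_ids = {t["id"] for t in topics}
    players = {p for a in associations for p in a["roles"].values()}
    return {
        "topics": len(topics),
        "associations": len(associations),
        "occurrences": sum(len(t["occurrences"]) for t in topics),
        "topics_per_type": Counter(t["type"] for t in topics),
        "associations_per_type": Counter(a["type"] for a in associations),
        # Possible problems: topics without any name, and role players
        # that are not defined as topics in the map.
        "unnamed_topics": sorted(t["id"] for t in topics if not t["names"]),
        "undefined_players": sorted(players - known_ids),
    }

# Hypothetical sample data.
topics = [
    {"id": "tosca", "type": "opera", "names": ["Tosca"],
     "occurrences": ["http://example.org/tosca"]},
    {"id": "puccini", "type": "composer", "names": ["Giacomo Puccini"],
     "occurrences": []},
    {"id": "t42", "type": "opera", "names": [], "occurrences": []},
]
associations = [
    {"type": "composed-by", "roles": {"work": "tosca", "composer": "puccini"}},
    {"type": "composed-by", "roles": {"work": "t42", "composer": "verdi"}},
]

stats = topic_map_statistics(topics, associations)
for key, value in stats.items():
    print(key, value)
```

Here the unnamed topic `t42` and the undefined role player `verdi` are the kind of irregularities that a statistics report can bring to the surface.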

The Filter plug-in

The Filter plug-in allows the user to customise his/her view of the topic map by establishing a context within which the scope of topic characteristics is evaluated. This is done through the Set Context Page, shown in Figure 22, in which the user specifies the themes he/she is interested in. This page is displayed when the Filter button is selected.

Figure 22 - The Set Context Page
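The idea of evaluating scope against a user-defined context can be sketched in Python as follows. The rule used here is an assumption chosen for simplicity, not necessarily the one the Filter plug-in applies: a characteristic is kept if it is in the unconstrained scope, if no context has been set, or if its scope shares at least one theme with the context. The names `in_context` and `filter_characteristics` are hypothetical.

```python
def in_context(scope, context):
    """Return True if a characteristic with this scope is visible in the context.

    Assumed rule: unconstrained scope or empty context always matches;
    otherwise at least one theme must be shared.
    """
    if not scope or not context:
        return True
    return bool(set(scope) & set(context))

def filter_characteristics(characteristics, context):
    """Keep only the characteristics that are visible in the given context."""
    return [c for c in characteristics if in_context(c["scope"], context)]

# Hypothetical names of one topic, each valid in a different scope.
names = [
    {"value": "Italy", "scope": {"english"}},
    {"value": "Italia", "scope": {"italian"}},
    {"value": "IT", "scope": set()},
]

visible = [n["value"] for n in filter_characteristics(names, {"italian"})]
print(visible)  # ['Italia', 'IT']
```

With an empty context every name is kept, which corresponds to browsing the topic map without any filter.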

The Merge plug-in

The Merge plug-in is available only in the full product and not in the on-line demo. This plug-in allows the user to merge a second topic map with the current one. Selecting the Merge button leads to a page with a scroll box from which the user can choose any of the topic maps registered with the system, except the current topic map. After the merging operation, the user can browse the resulting topic map.

The Use name based merging checkbox controls how topics with the same base name in the same scope are handled. If it is checked, these topics are merged in accordance with the topic naming constraint. If it is unchecked, no name-based merging takes place, and violations of the topic naming constraint are left unresolved.

Regardless of whether the Use name based merging checkbox is checked, topics are merged if they have the same subject indicator or subject address. Additionally, duplicate names, associations and occurrences are removed.
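The merging rules described above can be sketched in Python as follows. This is an illustrative model, not Ontopia's implementation: the dictionary layout of a topic, the function name `merge_topics` and the `name_based` parameter are assumptions made for this example, and associations are left out for brevity. Topics that share a subject indicator or subject address are always merged; topics that share a base name in the same scope are merged only when name-based merging is requested. Duplicates disappear because names and occurrences are collected in sets.

```python
def merge_topics(topics, name_based=False):
    """Merge topics that share an identity key; return the merged topics."""
    parent = list(range(len(topics)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    seen = {}  # identity key -> index of the first topic carrying it
    for i, topic in enumerate(topics):
        keys = [("indicator", s) for s in topic["subject_indicators"]]
        if topic.get("subject_address"):
            keys.append(("address", topic["subject_address"]))
        if name_based:
            keys += [("name", n, frozenset(s)) for n, s in topic["names"]]
        for key in keys:
            if key in seen:
                union(i, seen[key])
            else:
                seen[key] = i

    groups = {}
    for i, topic in enumerate(topics):
        merged = groups.setdefault(find(i), {
            "subject_indicators": set(), "subject_address": None,
            "names": set(), "occurrences": set()})
        merged["subject_indicators"] |= set(topic["subject_indicators"])
        merged["subject_address"] = (merged["subject_address"]
                                     or topic.get("subject_address"))
        merged["names"] |= {(n, frozenset(s)) for n, s in topic["names"]}
        merged["occurrences"] |= set(topic["occurrences"])
    return list(groups.values())

# Hypothetical data: each name is a (base name, scope) pair.
current = [
    {"subject_indicators": ["http://psi.example.org/puccini"],
     "subject_address": None, "names": [("Puccini", ())],
     "occurrences": ["http://example.org/bio"]},
    {"subject_indicators": [], "subject_address": None,
     "names": [("Tosca", ())], "occurrences": []},
]
other = [
    {"subject_indicators": ["http://psi.example.org/puccini"],
     "subject_address": None,
     "names": [("Giacomo Puccini", ()), ("Puccini", ())],
     "occurrences": ["http://example.org/bio"]},
    {"subject_indicators": [], "subject_address": None,
     "names": [("Tosca", ())], "occurrences": ["http://example.org/tosca"]},
]

by_identity = merge_topics(current + other)
by_name_too = merge_topics(current + other, name_based=True)
print(len(by_identity), len(by_name_too))  # 3 2
```

Without name-based merging the two "Tosca" topics remain separate, since they share only a base name; with it, they are merged into one topic.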

The Export plug-in

The Export plug-in creates a serialisation of the current topic map in XTM or HyTM syntax. It allows the user either to save this topic map directly to a disk file or to load it as XML into his/her browser. Some of the situations in which this plug-in is useful are:

- Writing a topic map in a different format (e.g. XTM) from the one in which it was loaded (e.g. HyTM or LTM)

- Merging two or more topic maps and wanting to persist the result

- Modifying a topic map in some way (e.g. via a plug-in) and wanting to save the result

The Reload plug-in

The Reload plug-in provides a useful shortcut (i.e. a Reload button on the Topic Map and Topic Pages) for situations in which the topic map must be reloaded often (e.g. while it is being modified continuously). This plug-in can be loaded manually by the user.

The Query plug-in

The Query plug-in allows the user to perform queries on his/her topic map, using the tolog query language. Selecting the Query button brings up a new page with a text entry box in which the user can write his/her query using tolog syntax. This page is shown in Figure 23. The query in its text entry box finds all the theatres ($B) in which operas ($A) were premiered and the cities ($C) in which those theatres are located, and then returns a list of cities with the number of premieres per city, sorted in descending order (from highest to lowest).

Figure 23 - The results of the above tolog query
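The logic of the query described above (a join of two association types followed by counting and sorting) can be mirrored in plain Python. The sketch below is an analogue only and does not show tolog syntax; the function name `premieres_per_city` and the small sample data set are assumptions made for illustration, not the contents of opera.xtm.

```python
from collections import Counter

# Sample data: (opera, theatre) pairs and a theatre -> city mapping.
premieres = [
    ("Tosca", "Teatro Costanzi"),
    ("La Bohème", "Teatro Regio"),
    ("Turandot", "Teatro alla Scala"),
    ("Madama Butterfly", "Teatro alla Scala"),
]
located_in = {
    "Teatro Costanzi": "Rome",
    "Teatro Regio": "Turin",
    "Teatro alla Scala": "Milan",
}

def premieres_per_city(premieres, located_in):
    """Join premieres with theatre locations and count premieres per city.

    The result is sorted from the highest to the lowest count; ties are
    broken alphabetically by city name.
    """
    counts = Counter(
        located_in[theatre]
        for _opera, theatre in premieres
        if theatre in located_in
    )
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

result = premieres_per_city(premieres, located_in)
print(result)  # [('Milan', 2), ('Rome', 1), ('Turin', 1)]
```

The variables $A, $B and $C of the query correspond to the opera, the theatre and the city in this sketch; only the city and the count appear in the result.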

The Validate Plug-in

The Validate plug-in allows the user to check the validity of his/her topic map against an Ontopia Schema Language (OSL) schema. Validation is started by selecting the Validate button. Any constraint violations are displayed on a new page.

The Full-text Search Plug-in

The Full-text Search Plug-in allows the user to do simple full-text searches in his/her topic maps. This plug-in searches the names and occurrences of the topics, by using a pre-built full-text index. This index can be created using the Full-text Indexing Plug-in. After the creation of a topic map full-text index, a search box shows up in the plug-ins line.

After the user enters a text in the search box and presses Enter, the results are displayed on a new page as a list of links to the topics the search found. This page is shown in Figure 24, for the search text “tosca” in the topic map opera.xtm.

Figure 24 - Full-text search results of text ‘tosca’

REFERENCES

1. Topic Maps and RDF, http://www.pms.informatik.uni-muenchen.de/lehre/seminar/ontology/01ws02/XTM/chunk/ch04s01.html.

2. Steve Pepper: “Ten Theses on Topic Maps and RDF”. Ontopia, 2002.

3. Steve Pepper: “The TAO of Topic Maps”. http://www.ontopia.net/topicmaps/materials/tao.html.

4. XML Topic Maps, http://www.topicmaps.org/xtm/1.0/.

5. Lars Marius Garshol: “Topic Maps, RDF, DAML, OIL”. http://www.idealliance.org/papers/xml2001/papers/html/05-04-04.html.

6. Ontopia Home Page, http://www.ontopia.net.
