International Journal of Metadata, Semantics and Ontologies...emerging technology that offer a...


International Journal of Metadata, Semantics and Ontologies


Members of the Editorial Board

Executive Editor

Dr. Miguel-Angel Sicilia, University of Alcalá, Spain

Associate Editors

Dr. Miltiadis Lytras, Athens University of Economics and Business, Greece

Editorial Board

Dave Beckett, University of Bristol, UK

Gerhard Budin, University of Vienna, Austria

Darina Dicheva, Winston-Salem State University, USA

Prof. Tharam Dillon, University of Technology, Australia

Prof. Asuman Dogac, Middle East Technical University, Turkey

Jérome Gensel, Université Pierre Mendès-France, France

Prof. Jane Greenberg, University of North Carolina at Chapel Hill, USA

Masahiro Hori, Kansai University, Japan

Prof. Eero Hyvönen, Helsinki University of Technology, Finland

Dr. Vipul Kashyap, Clinical Informatics R&D, USA

Atanas Kiryakov, Sirma AI EAD Ltd, Bulgaria

William E. Moen, University of North Texas, USA

T.B. Rajashekar, Indian Institute of Science, India

John F. Sowa, VivoMind, USA

Prof. Stuart A. Sutton, University of Washington, USA

Dr. David Taniar, Monash University, Australia

Olga De Troyer, Vrije Universiteit Brussel, Belgium

Stuart Weibel, OCLC Online Computer Library Center, Inc., USA


66 Int. J. Metadata, Semantics and Ontologies, Vol. 1, No. 1, 2006

Copyright © 2006 Inderscience Enterprises Ltd.

Semantic mining and web service discovery techniques for media resources management

Evangelos Sakkopoulos* Department of Computer Engineering and Informatics, University of Patras, 26500 Patras, Greece E-mail: [email protected] *Corresponding author

Dimitris Kanellopoulos Department of Electrical Engineering and Computer Technology, University of Patras, 26500 Patras, Greece E-mail: [email protected]

Athanasios Tsakalidis Department of Computer Engineering and Informatics, University of Patras, 26500 Patras, Greece E-mail: [email protected]

Abstract: The proposed techniques facilitate semantic discovery and interoperability of web services that manage and deliver web media content. As a test-bed, a web management system is discussed that provides vocational monographs for occupational guidance. The key idea of the proposed web service based architecture is that the WS of participating members share their vocational monographs and related resources, which increases the overall system capacity. Mechanisms and techniques are presented concerning:

• an ‘information desk’, which provides an adaptive search user interface

• discovery functions based on semantic representation of the WS’ monographs

• mining into monographs based on their semantic representation

• a metric-based evaluation model for the web monographs

• dynamic ranking of retrieved content list

• consumption mechanisms for the career counsellors’ local WS

• e-payment

• universal usability access.

This analysis can be exploited in the practice of a new Occupational Guidance Policy.

Keywords: web semantics; web service; web mining techniques; vocational monographs.

Reference to this paper should be made as follows: Sakkopoulos, E., Kanellopoulos, D. and Tsakalidis, A. (2006) ‘Semantic mining and web service discovery techniques for media resources management’, Int. J. Metadata, Semantics and Ontologies, Vol. 1, No. 1, pp.66–75.

Biographical notes: Evangelos Sakkopoulos is a PhD candidate at the Computer Engineering and Informatics Department, University of Patras, Greece and a member of Research Unit five of the Computer Technology Institute. He received an MSc degree with honours and a diploma in computer engineering and informatics from the same institution. His research interests include web engineering, web based education, web services, semantic web, web searching, and intranets. He has more than 20 publications to his credit in international journals and conferences in these areas.

Athanasios Tsakalidis has been a Professor in the Department of Computer Engineering and Informatics of the University of Patras since 1993. He completed his diploma in mathematics in 1973 (University of Thessaloniki), his studies, and PhD in informatics in 1980 and 1983, respectively (University of Saarland, Germany). From 1983 to 1989, he worked as a Researcher at the University of Saarland. His main interests are data structures, graph algorithms, computational geometry, multimedia, information retrieval, and bioinformatics. Professor Tsakalidis has authored books, chapters, and numerous publications in international journals and conferences having contributed to the solution of elementary problems in the area of data structures.


Dimitris Kanellopoulos received his diploma in electrical engineering and PhD in electrical and computer engineering from the University of Patras, Greece. Since 1990, he has been a Research Assistant in the Department of Electrical Engineering at the University of Patras and involved in several EU R&D projects (e.g., RACE I and II, ESPRIT). His research interests are in the field of multimedia communication, ATM networking, broadband communications, and web engineering. He has contributed more than 20 publications for international journals and conferences in these areas.

1 Introduction

The web offers mainly unstructured and semi-structured natural language data. Ontologies (Gruber, 1993) are an emerging technology that offers a promising infrastructure to cope with heterogeneous representations of web resources. The domain model of an ontology can be taken as a unifying structure for giving information a common representation and semantics.

Additionally, web services (abbr. WS) are the most promising set of recommendations and standards (W3C, OASIS). They have shaped current web engineering methodologies and are ubiquitously supported by IT vendors and users. In short, they are interoperable software components that can be used in application integration and component based application development. As the demand for WS consumption rises, a series of questions arises concerning the methods and procedures for integrating the most suitable of them with ontological knowledge. Web Service Modelling Ontology (WSMO, 2005) and OWL-S (2005) have been proposed to provide the infrastructure for ontological description of web services. In particular, WSMO is an attempt to standardise the description of semantic web services to solve the integration problem. OWL-S is an ontology that facilitates the design of semantic web services. It can be considered as “a language for describing services, reflecting the fact that it provides a standard vocabulary that can be used together with the other aspects of the OWL description language to create service descriptions”. In this project, we have been working with existing web service technologies, which have been semantically annotated to facilitate discovery and matching. A semi-automatic methodology that allows marking up web service descriptions with ontologies is given in Patil et al. (2004).

On the other hand, the World Wide Web (WWW) as an internet technology has significant implications for providing career information and advice. Career guidance is a test-bed paradigm that lies at the intersection of business and education, two of the sectors most strongly driven by web technology. In particular, Robinson et al. (2000) have already discussed the potential of the web in guidance provision: career-related websites range from mere listings of job openings to comprehensive systems offering career assessment and counselling. However, evaluation of these web-based career-related sites is sorely lacking. Oliver and Zack’s study (1999) revealed that the overall quality of most career-related websites was mediocre.

There are concrete methods to help young people develop self-understanding and vocational understanding and to provide them with vocational guidance. One method is computer-assisted career guidance systems (CACGs). A CACG system allows a person to go through the basic steps of the vocational decision-making process using a computer. In Europe and the USA, well-known systems such as DISCOVER, SIGIPLUS, CHOICES, and PROSPECT have been developed and used in career guidance and vocational counselling since the 1960s (Watts et al., 1991). Most CACGs contribute to the user’s vocational identity, level of career development, and career decision-making self-efficacy (Taber and Luzzo, 1999).

Career guidance services are based on vocational monographs, which include important information units for jobs. Paper-based (not digitised) vocational monographs are not well structured and their contents are difficult to modify. They consist mainly of text (written in a single language), and their creation and management are slow procedures. In contrast, web vocational monographs overcome these disadvantages and can constitute a powerful tool for open and distance learning in ‘Occupational Guidance’. Moreover, web vocational monographs contain multimedia data (Kotsiantis et al., 2004).

Current web vocational monographs are not understandable and usable by computers. This results in basic drawbacks:

• human processing of web vocational monographs

• difficult searching of web vocational monographs info

• vocational monographs applications depending on their hardware and software platforms.

Introducing semantics to web vocational monographs overcomes the above disadvantages and makes it possible to propose intelligent mechanisms and techniques that can be exploited in the practice of a new occupational guidance policy.

The rest of this paper is organised as follows: Section 2 presents our motivation and reviews related work in the semantic web community. Section 3 presents an overview of the proposed mechanisms, while Section 4 defines semantics for web vocational monographs. Section 5 introduces an evaluation model using heuristic metrics to facilitate the ranking mechanism. Section 6 presents the design and architecture of the entire management environment. Section 7 presents the evaluation procedure. Finally, Section 8 concludes and presents future research directions.


2 Motivation and related work

Professional guidance associations have adopted standards (ACSCI, 2001) for the development of information and communication technologies (ICT). However, most websites that deliver assessments and career information provide no data on the quality of the information they deliver. The career information web services are provided by the occupational guidance centres (OGC) and are based on web vocational monographs, which are not semantically enriched. Hereafter, OGC denotes the information management system of an occupational guidance centre. Moreover, OGC are legacy systems and suffer from limited speed and search capabilities. In addition, they are mainly for human use. A request to the system usually involves more than one interaction with the person on the terminal. Furthermore, being legacy systems, they are difficult to interoperate with other systems and data sources, both inside the OGC and with external resources.

In career guidance, the web services technology (Muschamp, 2004) is an ideal fit, as it offers distributed services capability over a network. The platform and language independent interfaces of web services allow the easy integration of heterogeneous systems. The web services offer mechanisms for describing web vocational monographs, methods for accessing them, and discovery methods that enable the identification of relevant web service providers. The web services technology is a set of standards (XML, SOAP, WSDL, UDDI) that allow web management applications of vocational monographs to talk to each other over the internet. To make web vocational monographs understandable and usable by computers of other OGCs, we must introduce semantics to web vocational monographs. The semantic web will enable web vocational monographs to be accessed by semantic content (Berners-Lee et al., 2001). It is envisioned as an extension of the current web where, in addition to being human-readable using WWW browsers, documents are annotated with meta-information. This meta-information defines what the information (document) is about in a machine-processable way. The automated processing of web vocational monographs requires that explicit machine-processable semantics are associated with web vocational monographs as metadata, so that they can be interpreted and combined. In this way, we can exploit the web vocational monographs services to their full potential. Introducing semantics to web vocational monographs services brings the following advantages:

• Semantically enriched web vocational monographs services handle the interoperability at the technical level; that is, they make vocational monographs applications talk to each other independent of the hardware and software platform. But even for applications interoperating at the technical level, there is still a need for semantic interoperability. This kind of interoperability can be addressed through ontology mapping.

• Semantics can be used for the discovery and composition of web vocational monographs services.

• The main mechanism for service discovery is service registries and semantics can be used for the discovery of web vocational monographs service registries.

Fensel and Bussler (2002) proposed a web services modelling framework (WSMF) that defines a fully-fledged conceptual model for providing semantic web services as a new infrastructure for e-work and e-commerce. Semantic web services offer sophisticated capabilities including automated discovery, composition, invocation, and monitoring (McIlraith and Martin, 2003). The EU-IST project ‘Semantic Web-enabled Web Services’ (SWWS) (see http://swws.semanticweb.org/) demonstrated the possibilities of using semantics in defining, searching, combining, and invoking web services.

Sycara et al. (2003) introduced a vision for semantic web services, which combines the growing web services architecture and the semantic web. They proposed DAML-S as a prototypical example of an ontology for describing semantic web services. Wang et al. (2004) gave an overview of the problems that current web services are confronted with and predicted distinct advances in semantic grid services, which have been considered in Roure et al. (2003). The EU-IST project SWAP (see http://swap.semanticweb.org) demonstrated that the power of P2P computing and the semantic web could actually be combined to share and find ‘knowledge’ easily with low administration efforts. Recently, semantic information retrieval systems have also been proposed, such as QuizRDF (Davies et al., 2003) and Spectacle (Fluit et al., 2003).

3 Overview of the environment

The proposed system is a generic, dynamic, and interoperable system for the management of web vocational monographs. The system facilitates semantic discovery and interoperability of web services that manage and deliver parts of web vocational monographs for occupational guidance. In this environment, the participating service providing members constitute a consortium (named Web-based Occupational Guidance Consortium, WOGCon), in which they offer their vocational monographs. The members can maintain individual vocational monographs, while sharing vocational monographs in such a way that administration efforts are low. In addition, vocational monographs’ discovery is simple and effective. The WOGCon principal is responsible for the initial registration and authorisation of any member. A careers counsellor is a member of the Consortium who is authorised for the web vocational monographs’ creation, maintenance, and management. Users submit queries for vocational monographs through a web browser to the information desk of the WOGCon main system. The ‘information desk’ is an adaptive web UI sub-system that presents responses to queries according to the personal preferences and needs of the individual user. The information desk is also responsible for communicating with the corresponding web services to initiate the query and, finally, for receiving and presenting personalised results. A group of web services comprises the core architecture of the system. These services in turn consume the local web services exposed by every authorised member providing monographs. An overview of the proposed WOGCon schema is depicted in Figure 1.

Figure 1 Web-based occupational guidance consortium (WOGCon) architecture

The key idea of the proposed web service oriented architecture is that the WSs of participating members share their monographs and related resources, which increases the overall system capacity. Therefore, a larger number of users can be served at a low system cost. The main mechanism for discovering web vocational monographs is service registries.

The proposed system is depicted in Figure 2 and includes:

• the information desk, which provides an adaptive search user interface

• monographs mining module based on their semantic representation

• discovery mechanism based on semantic representation of the WSs published by members

• application of evaluation model on the web monographs

• dynamic ranking of retrieved content list

• consumption mechanisms for the career counsellors’ local WSs.

Additionally, the system includes some value added web services to enable

• e-payment

• mobile access.

Hereafter, an overview of the web services interconnection architecture is described. This overview serves as a roadmap and the reader may return to it at any time.

Figure 2 Web services interconnection architecture

Users interact with the system through the information desk sub-system, which provides an adaptive search interface for vocational monographs. To provide better personalisation features, user profiles are utilised. To achieve this, users are required to register. Searching for vocational monographs is possible without registering. However, this limits the personalisation capabilities.

As soon as a user describes the vocational monographs that he/she is looking for, the information desk initiates the web service discovery mechanism. This mechanism searches for WS based on their semantic representations. A list of matched web services, exposed by the geographically distributed career counsellors, is passed to the semantic mining retrieval system for further refinement. First, the membership status of the WOGCon member is validated to verify that the main system is authorised to access the remote web service. Matchmaking is then performed on the ontological representation of the monographs provided by the selected web service member providers. The semantic mining WS retrieves the matched vocational monograph media resources and forms a respective content list, which is fed into the ranking service.

The ranking recommendation service presents the results list to the user, taking into consideration the vocational monograph evaluation model. The evaluation model is updated dynamically from user feedback after every user session in which the monograph media resource has been used.
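The ranking and feedback steps can be sketched as follows. This is only an illustration under stated assumptions: each retrieved resource is assumed to carry a single numeric evaluation score, and the per-session update is assumed to be a running mean of a 0/1 "consumed" indicator. The function names, field names, and sample data are hypothetical and do not come from the system itself.

```python
def update_trust(k, users, consumed):
    """Fold one more session's 0/1 indicator into a running mean k.

    k is the mean over `users` earlier sessions (assumed representation).
    """
    return (k * users + (1 if consumed else 0)) / (users + 1)


def rank(content_list):
    """Order retrieved resources by evaluation score, best first."""
    return sorted(content_list, key=lambda r: r["score"], reverse=True)


# Hypothetical content list returned by the semantic mining step.
results = [
    {"id": "a", "score": 0.4},
    {"id": "b", "score": 1.125},
    {"id": "c", "score": 0.9},
]
ranked_ids = [r["id"] for r in rank(results)]

# A fifth user consumes a resource that 2 of 4 earlier users consumed.
new_k = update_trust(0.5, 4, True)
```

A running mean keeps only the current value and the session count, so the score can be refreshed after each session without storing the full consumption history.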

In addition to the main WS in the system, a payment add-on service is supported to facilitate possible user subscriptions and WOGCon membership management. Finally, special care has been taken to support extensive usability features for users – students or their family – who visit the system.


4 Defining semantics of web vocational monographs

The web services technology uses XML as a standard for exchanging structured data. XML (Goldfarb and Prescod, 2000) represents structured data, but unfortunately, it lacks a standardised way to link them with an abstract, formal description of their semantics. The semantic web builds on XML to overcome some of its fundamental limitations.

Ontology languages such as RDFS (RDF Schema) or the web ontology language OWL express much of the important semantic information through universally accepted constructs. WSDL, on the other hand, does not provide semantic information for vocational monographs. In particular, WSDL provides only the technical specification of the vocational monographs service operations, that is, the types of input and output messages and the URL where the service can be invoked. Information on what the service does, the properties the service might have, and the meaning carried in its messages is not available through the WSDL descriptions of the services.

We must therefore define the semantics of messages through ontologies. For this purpose, another language, RDF (resource description framework), has been proposed. RDF is a standard and, along with the closely related language RDFS (Brickley and Guha, 2000), is a language for expressing semantic metadata.

The objects about which RDF reasons are also referred to as resources. Resources are often WWW pages or part of pages, but in principle can be anything. RDF allows resources to be values.

RDF also allows a form of reification, in which any RDF statement can itself be the object or value of a triple, which means RDF graphs can be nested as well as chained. On the web this allows us to express doubt about (or support for) statements created by others (Davies et al., 2004). RDF’s graph structures are strongly reminiscent of semantic net and conceptual graph formalisms from the field of knowledge representations (Mineau et al., 1993). RDF descriptions of information resources can be stored quite separately from the information itself. Developers using RDFS can define a particular vocabulary for RDF data appropriate to the career guidance domain about which they wish to reason.

The user should be able to locate a vocational monograph through its meaning, independent of what the vocational monograph is called, which language is used, and who the service provider is. To exploit the web vocational monograph services to their full potential, we need ontologies to describe their semantics. An ontology can represent a common, shareable view or understanding of the education domain, which gathers and describes concepts and defines the relationships among them. An ontology has been defined as a ‘specification of a conceptualisation’ (Gruber, 1993); that is, ontologies are effectively ‘concept maps’ describing the objects we wish to reason about in a given domain and the relationships that hold between those objects. Any web vocational monograph can be based on our proposed ontology ‘vocational monograph’, which is defined as a set of semantic metadata (shown in Figure 3):

• title of the job

• category of the job

• description of the job

• social conditions of the job

• required education for this job

• required competencies for this job

• job’s evolution

• insurance

• salary concerning the job.

VOCATIONAL_MONOGRAPH_id = {Owner_id, Title, Category, Description, Social, Education, Req_Competencies, Evolution, Insurance, Salary}

where

Owner_id: the owner of the web vocational monograph, Category: the category of the job.
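As a rough illustration of how such a metadata record could be turned into RDF-style statements, the sketch below models the fields listed above as a Python dataclass and flattens one record into (subject, predicate, object) triples. The namespace URI, the field spellings, and the sample values are assumptions made for this example; the source does not specify a serialisation.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass

# Hypothetical namespace; no URI is given for the ontology in the text.
VM_NS = "http://example.org/vocational-monograph#"


@dataclass
class VocationalMonograph:
    monograph_id: str
    owner_id: str
    title: str
    category: str
    description: str
    social: str
    education: str
    req_competencies: str
    evolution: str
    insurance: str
    salary: float


def to_triples(vm: VocationalMonograph) -> list[tuple[str, str, str]]:
    """Flatten one record into (subject, predicate, object) triples."""
    subject = VM_NS + vm.monograph_id
    return [
        (subject, VM_NS + name, str(value))
        for name, value in asdict(vm).items()
        if name != "monograph_id"  # the id becomes the subject itself
    ]


# Illustrative values only.
vm = VocationalMonograph(
    "vm42", "ogc-patras", "English teacher", "Education",
    "Teaches English as a foreign language", "Works in schools",
    "University degree", "Communication skills", "Stable demand",
    "Public insurance", 1200.0,
)
triples = to_triples(vm)
```

Each triple uses the monograph as subject and one metadata field as predicate, which mirrors the resource and property view of RDF described earlier in this section.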

Users want to find vocational monographs in which the job profiles are expressed in a certain media type (e.g., video, images, audio, text). For example, students are interested in job profiles that are represented by images and graphics, while blind users take a great interest in job profiles that are represented by audio. These needs can be covered by using the semantic metadata named ‘multimedia content’.

Figure 3 The ontology of the vocational monographs

In another scenario, users want to find vocational monographs related to jobs that are practised in a given geographical region (e.g., Athens). This can be achieved by using the semantic metadata named ‘region of acting’.


Finally, in another scenario, users with a specific kind of education at a certain level (e.g., low, medium, high) want to find jobs that require these qualifications and whose salaries are greater than a certain amount of money. This question can be answered by using the following semantic metadata: kind of education, level of education, and salary. For example, a question may be:

Kind of education = ‘English as a foreign language’, Level of education = ‘High’, and Salary > 1,100 euros.
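The example question can be read as a conjunctive filter over the three metadata fields. The following sketch assumes the monographs are available as plain dictionaries; the key names and the sample records are invented for illustration and are not taken from the system.

```python
# Hypothetical monograph records; key names are assumptions.
monographs = [
    {"title": "English teacher",
     "kind_of_education": "English as a foreign language",
     "level_of_education": "High", "salary": 1300},
    {"title": "Tour guide",
     "kind_of_education": "English as a foreign language",
     "level_of_education": "Medium", "salary": 1200},
    {"title": "Translator",
     "kind_of_education": "English as a foreign language",
     "level_of_education": "High", "salary": 1000},
]


def find_jobs(records, kind, level, min_salary):
    """Return titles matching kind and level with salary above min_salary."""
    return [
        r["title"]
        for r in records
        if r["kind_of_education"] == kind
        and r["level_of_education"] == level
        and r["salary"] > min_salary
    ]


matches = find_jobs(monographs, "English as a foreign language", "High", 1100)
```

With these sample records only the first entry satisfies all three conditions: the second fails on level and the third on salary.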

5 Evaluation model

Different users may have different opinions on the same WS media part of a vocational monograph. To capture this, an evaluation model is introduced. The model is then used on the adaptive search interface to rank the web media content proposed by the semantic mining. In particular, the performance of a vocational monograph $VM_{id}$ is defined as a number $G(VM_{id})$ given by the following equation:

$$G(VM_{id}) = \sum_{i=1}^{m} c_i \times \mathrm{item}_i \qquad (1)$$

where $m$ is the limited number of the vocational monograph’s items and $c_i$ is the coefficient weight for $\mathrm{item}_i$ of the vocational monograph, with

$$\sum_{i=1}^{m} c_i = 1$$

For example, the vocational monograph’s item called ‘required competencies’ may have a greater coefficient weight than others. If all coefficients $c_i$ are equal, then $c_i = 1/m$.

Assuming that $j \in [1,4]$, $j \in \mathbb{N}$, we consider four media types as eligible (text, images, sounds, and video):

$$\mathrm{item}_i = \sum_{j=1}^{4} k_{ij} \times g_{ij} \qquad (2)$$

where

$$g_{ij} = \begin{cases} 0, & \text{if media item } j \text{ was not consumed} \\ 1, & \text{if media item } j \text{ was consumed} \end{cases}$$

$U$: the number of users who have consumed the multimedia content of $\mathrm{item}_i$.

Assuming that up to now $U$ users have consumed $\mathrm{item}_i$, we have:

$$k_{ij} = \frac{1}{U} \sum_{\lambda=1}^{U} k_{ij\lambda}$$

where

$$k_{ij\lambda} = \begin{cases} 0, & \text{if media item } j \text{ was not consumed by user } \lambda \\ 1, & \text{if media item } j \text{ was consumed by user } \lambda \end{cases}$$

$k_{ij}$ is a measure of trustworthiness for media item $j$ of $\mathrm{item}_i$. It depends mainly on end users’ experience of using this item.
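A small numerical sketch may help to read equations (1) and (2) together. It assumes the index convention that i runs over the m items of a monograph, j over the four media types, and λ over the U users; the function names and all numbers are invented for illustration.

```python
def trust(indicators):
    """k_ij: mean of the 0/1 consumption indicators of the U users."""
    return sum(indicators) / len(indicators) if indicators else 0.0


def item_score(k, g):
    """Equation (2): sum over the media types of k_ij * g_ij."""
    return sum(k_j * g_j for k_j, g_j in zip(k, g))


def monograph_score(weights, item_scores):
    """Equation (1): weighted sum of item scores; weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("coefficient weights must sum to 1")
    return sum(c * s for c, s in zip(weights, item_scores))


# Item 1: four users; media order is (text, images, sounds, video).
k1 = [trust([1, 1, 1, 0]), trust([1, 0, 0, 0]),
      trust([0, 0, 0, 0]), trust([1, 1, 0, 0])]
g1 = [1, 0, 0, 1]           # text and video were consumed

# Item 2: trust values given directly.
k2 = [0.5, 0.5, 0.0, 0.0]
g2 = [1, 1, 0, 0]           # text and images were consumed

items = [item_score(k1, g1), item_score(k2, g2)]
G = monograph_score([0.5, 0.5], items)  # equal weights, c_i = 1/m with m = 2
print(G)  # 1.125
```

Here the first item scores 0.75 + 0.5 = 1.25 and the second 0.5 + 0.5 = 1.0, so with equal weights the monograph score is 1.125.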

6 Design and architecture

The proposed architecture enables users to receive web vocational monographs at any time at their own convenience. It supports a multilingual interface to inform individuals who use different languages. The proposed system has been outlined in Section 3 and includes:

• the information desk, which provides an adaptive search user interface

• discovery mechanism based on semantic representation of the WS monographs

• mining into monographs based on their semantic representation (discussed in Section 4)

• evaluation model for the web monographs (discussed in Section 5)

• dynamic ranking of retrieved content list

• consumption mechanisms for the career counsellors’ local WSs

• e-payment

• care for universal usability access.

In the following sections, the above items are presented in detail.

6.1 Information desk: an adaptive search interface

A profile records information concerning the activities and the knowledge state of a user (Beck et al., 1996). This information is vital for the system’s operation according to the user’s needs and preferences. Implicit collection of user actions is used to form the profile, as it requires minimum user involvement. However, the nature of the web imposes certain constraints on the system’s perception of the user. In the environment at hand, it has been chosen to keep track of the users’ traversal path for every visit, their queries, and the chosen media resources of the available monographs. The environment is personalised in two axes according to the multi-level profile introduced in Garofalakis et al. (2002). First, adaptation is performed separately per user activity (query submission, media type choices, semantic area of chosen monographs, etc.) and subsequently, common knowledge of the different activities is shared to provide collaborative personalisation.

To support a range of different users that access the system, certain usability issues have been taken into consideration. Usability (Nielsen, 2000) is sometimes tied to meeting the needs of users who are disabled or work in disabling conditions. The adaptability needed for users with diverse physical, visual, auditory, or cognitive disabilities is likely to benefit users with differing preferences, tasks, hardware, etc. Plasticity of the interface and independence of presentation from content both contribute to universal usability. First of all, the system facilitates access under low bandwidth connection conditions by supporting a text-only interface. Moreover, different quality, size, and dimension criteria for the media resources can be passed to the discovery mechanisms. In this way, the version of a media resource that suits the special needs and preferences of the user is delivered, where the counselling member WS makes it available.

6.2 Efficient web service discovery mechanism based on semantics

Web service discovery is a complex issue that has received a number of different approaches, though it is still in its infancy; a compiled review on WS discovery can be found in Garofalakis et al. (2004). In this case, web services providing media resources for vocational monographs are discovered in a service registry based on the UDDI standard. The adopted approach has been initially introduced in Makris et al. (2004). This mechanism performs efficient selection of the appropriate web service instance in terms of quality and performance factors, such as the execution and response time at the moment of the web service consuming attempt. It adaptively selects the most efficient WS among possible different alternatives with real-time, optimised, countable factors-parameters.
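As a loose illustration of selecting among alternative service instances by measured timing factors, the sketch below picks the candidate with the lowest combined response and execution time. It is not the algorithm of Makris et al. (2004), whose details are not given here; the scoring rule, the field names, and the endpoints are all assumptions.

```python
def select_instance(candidates):
    """Return the endpoint with the lowest combined time, or None.

    Candidates whose response time is None are treated as unreachable.
    """
    reachable = [c for c in candidates if c["response_ms"] is not None]
    if not reachable:
        return None
    best = min(reachable, key=lambda c: c["response_ms"] + c["execution_ms"])
    return best["endpoint"]


# Hypothetical measurements taken at the moment of the consuming attempt.
candidates = [
    {"endpoint": "http://ogc-a.example.org/ws", "response_ms": 120, "execution_ms": 300},
    {"endpoint": "http://ogc-b.example.org/ws", "response_ms": 80, "execution_ms": 250},
    {"endpoint": "http://ogc-c.example.org/ws", "response_ms": None, "execution_ms": 0},
]
chosen = select_instance(candidates)
```

Measuring at call time rather than relying on advertised figures reflects the "at the moment of the web service consuming attempt" aspect mentioned above; a real selector would likely weight more factors.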

To further enhance the search results, we incorporated into this algorithm new metrics, which take into consideration the semantics of the web vocational monographs stored in the local member UDDI registries. The semantics used are a limited subset of the ontology proposed in Section 4. These enhancements are integrated into the technical model (tModel) of UDDI. In the UDDI standard, one of the intriguing entities modelled is the technical model, or tModel, which can be created for any piece of relevant technical information. Among the meaningful elements of a tModel are its key and its overviewURL field. The tModel key is usually auto-generated by the UDDI implementation and is a universally unique identifier. Once a tModel key is generated, it can be considered well known. The overviewURL field should be set to point at the actual specification that a tModel represents. tModel keys provide the ability to describe compliance with taxonomies, ontologies, or controlled vocabularies. Therefore, when tModel keys are assigned to the nodes of an ontology and the services have the corresponding tModel keys in their category bags, it is possible to locate services conforming to the semantics given in a particular node of this ontology. This issue has been elaborated in Dogac et al. (2002).

Overall, we introduce and utilise the combination of an efficient WS discovery approach (Makris et al., 2004) with semantic representation information stored in the WS’s tModel. To obtain the semantic information from UDDI registries, the tModel keys and the category bags can be used to associate semantics with web vocational monographs. To associate the semantics of a vocational monograph, we first obtain from a UDDI registry a tModel key for each node of the vocational monograph ontology. Then, web vocational monographs are advertised by putting the tModel keys of the related ontology nodes in their category bags.
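
The association between ontology nodes and advertised services can be sketched with plain dictionaries. This is only a toy model of the idea, not a UDDI client: the node names other than ‘MonographDeliveryService’, the key values, and the service names are all hypothetical, and a category bag is reduced to a set of tModel keys.

```python
# Hypothetical tModel keys assigned to ontology nodes (real keys are
# universally unique identifiers generated by the UDDI implementation).
NODE_TMODEL_KEYS = {
    "MonographDeliveryService": "uuid:node-0001",
    "VideoMonographService": "uuid:node-0002",
    "TextMonographService": "uuid:node-0003",
}

# Each advertised service carries a category bag, modelled here as a set of tModel keys.
SERVICE_CATEGORY_BAGS = {
    "counsellor-A-video": {"uuid:node-0002"},
    "counsellor-B-text": {"uuid:node-0003"},
    "counsellor-C-mixed": {"uuid:node-0002", "uuid:node-0003"},
}


def find_services_for_node(node_name):
    """Return the services whose category bag holds the node's tModel key."""
    key = NODE_TMODEL_KEYS[node_name]
    return sorted(name for name, bag in SERVICE_CATEGORY_BAGS.items() if key in bag)


video_services = find_services_for_node("VideoMonographService")
```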

The generic service class name is the input for discovering services according to their functionality. Next, it is necessary to query the ontology to obtain all the subclasses of the given generic class as well as the ‘implementation’ instances. The discovered implementation instances therefore satisfy the required functionality. To differentiate the ‘generic’ type services in the ontology from the service instances, we use the DAML-S ‘serviceType’. ‘MonographDeliveryService’ inherits both from the ‘Service’ specification of DAML-S and from the ‘Virtual Product’ specification. The ‘MonographDeliveryService’ class restricts the DAML-S ‘serviceType’ and defines the type of the service to be a ‘generic’ type or an ‘implementation’ instance, as shown in Figure 4. An ‘implementation’ instance gives the description of a particular service implementation.

Figure 4 An example of ontology description
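
The query step described above (collect every subclass of the generic class, then gather their ‘implementation’ instances) can be sketched as a simple traversal. The code below is not a reproduction of Figure 4 and does not use DAML-S tooling; the toy hierarchy and every name except ‘MonographDeliveryService’ are hypothetical.

```python
# Toy ontology: class -> direct subclasses, and class -> 'implementation' instances.
# All names except 'MonographDeliveryService' are hypothetical.
SUBCLASSES = {
    "MonographDeliveryService": ["VideoMonographService", "TextMonographService"],
    "VideoMonographService": ["StreamingVideoMonographService"],
}
IMPLEMENTATIONS = {
    "TextMonographService": ["counsellor-B-text"],
    "StreamingVideoMonographService": ["counsellor-A-stream"],
}


def discover_implementations(generic_class):
    """Collect the implementation instances of a generic class and all its subclasses."""
    found, stack, seen = [], [generic_class], set()
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue  # guard against cycles in a malformed hierarchy
        seen.add(cls)
        found.extend(IMPLEMENTATIONS.get(cls, []))
        stack.extend(SUBCLASSES.get(cls, []))
    return sorted(found)


instances = discover_implementations("MonographDeliveryService")
```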

6.3 Ranking final recommendation list

In Section 5, the evaluation model for monograph media resources was introduced. In this section, the final ranking mechanism for the recommended items is presented. To produce and present a results list to the user, the ‘performance’ (equation 1) of the vocational monographs is taken into account. In addition, web usage mining is utilised.

Web usage mining is the application of data mining techniques to web clickstream data to extract usage patterns (Cooley, 2003). The analysis of computer log files can lead to several different re-organisation algorithms of enormous value for educators, such as the ones presented in Christopoulou et al. (2003). The analysis of users’ patterns from navigation history by web usage mining can shed light on user navigation behaviour and on the efficiency of the models used in the online occupational guidance process. The proposed system supports the interpretation of users’ patterns regarding their navigation habits within web vocational monographs. Every single request that a web server receives is recorded in an access log, which mainly registers the origin of the request, a time stamp, and the resource requested, whether or not the request is for a web page containing a web vocational monograph. The web-mining module performs:

• data gathering and pre-processing: for filtering and formatting the log entries

• pattern discovery: to discover relevant and potentially useful users’ patterns

• pattern analysis: to interpret the patterns discovered.
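
The three steps listed above can be sketched on a handful of log lines. This is a minimal illustration, not the system's web-mining module: the Apache-style log layout, the `/monographs/` URL prefix, and the sample entries are assumptions, and pattern discovery is reduced to counting requests per page.

```python
import re
from collections import Counter

# Assumed log layout: origin, timestamp in brackets, then the quoted request line.
LOG_PATTERN = re.compile(
    r'^(?P<origin>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?:GET|POST) (?P<resource>\S+)'
)

# Hypothetical access-log entries.
SAMPLE_LOG = [
    '10.0.0.1 - - [06/Jun/2005:15:45:00 +0000] "GET /monographs/nurse HTTP/1.1" 200 512',
    '10.0.0.1 - - [06/Jun/2005:15:46:10 +0000] "GET /monographs/pilot HTTP/1.1" 200 734',
    '10.0.0.2 - - [06/Jun/2005:15:47:30 +0000] "GET /style.css HTTP/1.1" 200 90',
    '10.0.0.2 - - [06/Jun/2005:15:48:05 +0000] "GET /monographs/nurse HTTP/1.1" 200 512',
    'malformed entry',
]


def preprocess(lines, prefix="/monographs/"):
    """Data gathering and pre-processing: parse entries and keep monograph requests."""
    entries = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match["resource"].startswith(prefix):
            entries.append((match["origin"], match["time"], match["resource"]))
    return entries


def discover_patterns(entries):
    """Pattern discovery: count how often each monograph page is requested."""
    return Counter(resource for _, _, resource in entries)


def analyse(patterns, top=1):
    """Pattern analysis: report the most requested monograph pages."""
    return patterns.most_common(top)


entries = preprocess(SAMPLE_LOG)
popular = analyse(discover_patterns(entries))
```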

The proposed retrieval system uses a ranking function Φ(ρ(γ(p)), σ(p)), which combines the similarity function σ(p), computed against the media resources previously recorded in the user’s profile, with the monograph media resource performance ρ(γ(p)) of the monograph to which the media resource p belongs. For the purposes of our experiment we set:

$$\Phi(\rho(\gamma(p)), \sigma(p)) = \frac{\rho(\gamma(p)) + \sigma(p)}{2}. \tag{3}$$

After each query, the results are ranked according to equation (3); the results list presented to the user is thus recomputed after every query submitted.
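
Equation (3) is simply the arithmetic mean of the two scores, so the ranking step can be sketched in a few lines. The media identifiers and score values below are hypothetical, and both scores are assumed here to be normalised to the same range.

```python
def rank_score(performance, similarity):
    """Equation (3): the mean of monograph performance and profile similarity."""
    return (performance + similarity) / 2


def rank_results(results):
    """Sort (media_id, performance, similarity) tuples by descending score."""
    scored = [(media_id, rank_score(perf, sim)) for media_id, perf, sim in results]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical values, assumed to be normalised to [0, 1].
results = [
    ("video-12", 0.50, 0.25),
    ("image-07", 0.75, 0.75),
    ("audio-03", 1.00, 0.25),
]
ranking = rank_results(results)
```

With equal weights, a resource from a well-performing monograph can outrank one that is closer to the user's profile, as `audio-03` does relative to `video-12` in this example.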

6.4 Consumption mechanisms for the career counsellors’ local WSs

Web services (as defined by WebServices.org) are encapsulated, loosely coupled, contracted functions offered via standard protocols, where: ‘encapsulated’ means the implementation of the function is never seen from the outside; ‘loosely coupled’ means changing the implementation of one function does not require changing the invoking function; and ‘contracted’ means there are publicly available descriptions of the function’s behaviour, of how to bind to the function, and of its input and output parameters.

Here, SOAP web services are used. A SOAP web service introduces the following constraints:

• except for binary data attachment, messages must be carried by SOAP

• the description of a service must be in WSDL.

A SOAP web service is the most common and most widely marketed form of web service in the industry. Some people simply collapse ‘web service’ into SOAP and WSDL services. SOAP provides “a message construct that can be exchanged over a variety of underlying protocols” according to the SOAP 1.2 Primer (W3C SOAP, 2003). In other words, SOAP acts like an envelope that carries its contents. One advantage of SOAP is that it allows rich message exchange patterns, ranging from traditional request-and-response to broadcasting and sophisticated message correlations. There are two flavours of SOAP web services: SOAP RPC and document-centric SOAP web services. SOAP RPC web services have been adopted because of their broader acceptance in the developer community.
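
To make the envelope metaphor concrete, the sketch below wraps an RPC-style call in a SOAP 1.2 envelope using Python's standard XML library. The operation name `getMediaResource`, its parameters, and the service namespace are hypothetical; only the envelope namespace comes from the SOAP 1.2 specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
SERVICE_NS = "urn:example:monograph-delivery"         # hypothetical service namespace


def build_rpc_request(operation, params):
    """Wrap an RPC-style call (operation name plus parameters) in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical call asking for a media resource no wider than 320 pixels.
request = build_rpc_request("getMediaResource", {"monographId": "m-17", "maxWidth": 320})
```

In the RPC flavour, the single child of the body names the remote operation and its children carry the arguments, which is what distinguishes it from a document-centric message.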

6.5 e-Payment and intellectual property

A value-added service is supported by the system to handle subscriptions of users and membership payments of WOGCon counsellors. Any authorised user can connect to the system and be informed about a provided vocational monograph. The user pays only if a counselling process for a vocational monograph is carried out via the synchronous or asynchronous counselling subsystems. A special payment service that relies on an authorisation service (Kerberos) and supports the credit-debit model is provided to the users. According to this model, users maintain small accounts on a payment server and authorise charges against those accounts. The system’s providers are concerned that, once they have made web vocational monographs available, copies will propagate and their contents will be degraded. Some researchers (Scott and Sharp, 2002) have proposed elaborate technical mechanisms to prevent this. Much of the technology needed to protect the network infrastructure of the proposed system already exists. Cryptographic techniques can be applied in support of authentication, authorisation, integrity, confidentiality, assurance, and payment (Wales, 2003). Another semantic web architecture for context-awareness and privacy was presented in Gandon and Sadeh (2004). A key element of this architecture is its e-Wallet, which supports the automated discovery and access of a user’s personal resources subject to the user’s specified privacy preferences.

7 Evaluation procedure

To determine the quality of the proposed environment, we follow two main phases: verification and validation of the outcomes. Verification refers to building the environment right, i.e., substantiating that a system correctly implements its specifications, while validation refers to building the right system, i.e., substantiating that a system performs with an acceptable level of accuracy. During the verification phase, two expert software architects test the proposed environment. For this purpose, test cases are performed. As the goal of these tests is to achieve a good cross-sectional test of the system, test scenarios are designed to cover a broad range of potential user inputs (common as well as rare cases).

Next, among the validation methodologies available, we opted for qualitative validation, mainly by potential end-users. As in the evaluations of TomEx-UFV (Pozza et al., 1998), a follow-up validation is conducted in a class of students. During the validation process, particular attention is paid to the system’s performance in carrying out content distribution on time. Students are provided with selected cases, which were previously identified in the laboratory. The primary objectives are to assess both the user interface and the accuracy of the environment. The students are asked to score, in a table-like questionnaire, the following criteria of the expert system: understanding and user friendliness. The students assess each of these criteria as acceptable, poor, or unacceptable. The system rating is based on a 0–10 scale, which corresponds to the following verbal continuum of responses (0–4 = unacceptable; 4.1–6 = poor; 6.1–10 = acceptable). In the current development step, the environment overall is rated as acceptable, with 7.1 for understanding and 7.9 for user friendliness. The students’ feedback has validated that the design approach of the information desk (in which queries for monographs are submitted) facilitates the searching process effectively and is also user friendly. A feedback module has also been attached to the system to record the opinions of the career counsellors, which can be exploited when the system is deployed in a number of occupational guidance centres.
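
The mapping from numeric scores to the verbal scale quoted above can be written as a small function. The band edges follow the text; how a score falling between two quoted bands (for example 4.05) is treated is an assumption of this sketch, which assigns it to the higher band.

```python
def verbal_rating(score):
    """Map a 0-10 score to the verbal scale quoted in the text."""
    if not 0 <= score <= 10:
        raise ValueError("score must lie between 0 and 10")
    if score <= 4:
        return "unacceptable"
    if score <= 6:
        return "poor"
    return "acceptable"


# Scores reported in the text for the two criteria.
reported = {"understanding": 7.1, "user friendliness": 7.9}
ratings = {criterion: verbal_rating(score) for criterion, score in reported.items()}
```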

8 Conclusion and future work

The proposed semantic mining and web service discovery techniques for media resources management facilitate the development and utilisation of integrated semantic web services, which serve the aims of occupational guidance. It is a novel proposal that allows vocational monographs’ web services to share their media resources with the end-user (i.e., the e-counselee) using semantic representation. The proposed environment is a paradigm of ontological engineering and web services architecture, which brings important benefits to occupational guidance:

• web vocational monographs services will provide interoperability both with internal and external applications

• vocational monographs’ development time will be reduced by wrapping already existing applications as web services

• the provided web vocational monographs will be evaluated and enhanced using data-mining techniques.

These techniques entail better integration of the evaluation of web vocational monographs, enhanced evaluation reliability and validity, and savings in the e-counsellor’s time.

Future work includes refinements of the ontology proposed for the representation of the vocational monographs. The aim is to ease the pragmatic heterogeneity of the web services that provide media resources for them, which hides behind the different development of domain-specific processes and different conceptions regarding the support of domain-specific tasks. Furthermore, an interesting direction is to incorporate into the techniques the online QoS parameters of the web services chosen to deliver the media resources. Finally, the integration of advanced portal services can provide extra information pools for the end-user, such as counselling via conferencing facilities and QoS streaming delivery of media resources.

Acknowledgement

The authors would like to thank the anonymous referees for their constructive remarks and suggestions.

References

Association of Computer-based Systems for Career Information (2001) Handbook of Standards for the Operation of Computer-based Career Information Systems, Retrieved: 3 July 2004, Available at: http://www.acsci.org/standards2.html.

Beck, J., Stern, M. and Haugsjaa, E. (1996) ‘Applications of AI in education’, ACM Crossroads, Vol. 3, No. 1, pp.11–16.

Berners-Lee, T., Hendler, J. and Lassila, O. (2001) ‘The semantic web’, Scientific American, Vol. 285, No. 5, pp.34–43.

Brickey, D. and Guha, R.V. (2000) RDF-S, W3C, Available at: http://www.w3.org/TR/2000/CR-rdf-schema-20000327.

Christopoulou, E., Garofalakis, J., Makris, C., Panagis, J., Psaras-Chatzigeorgiou, A., Sakkopoulos, E. and Tsakalidis, A. (2003) ‘Techniques and metrics for website reorganization’, Journal of Web Engineering, Rinton Press, Vol. 2, Nos. 1–2, pp.090–114.

Cooley, R. (2003) ‘The use of web structure and content to identify subjectively interesting web usage patterns’, ACM Transactions on Internet Technology, Vol. 3, No. 2, pp.93–116.

Davies, J., Weeks, R. and Krohn, U. (2003) ‘QuizRDF: search technology for the semantic web’, Towards the Semantic Web, Wiley & Sons Ltd., pp.133–144, ISBN 0-470-84867-7.

Davies, N.J., Fensel, D. and Richardson, M. (2004) ‘The future of web services’, BT Technology Journal, Vol. 22, No. 1, pp.118–130.

Dogac, A., Laleci, G., Kabak, Y. and Cingil, I. (2002) ‘Exploiting web services semantics: taxonomies vs. ontologies’, IEEE Data Engineering Bulletin, Vol. 25, No. 4, pp.10–16.

Fensel, D. and Bussler, C. (2002) ‘The web service modelling framework’, E-Commerce Research and Applications, Vol. 1, No. 2, pp.113–137.

Fluit, C., Horst, H., Meer, J., Sabou, M. and Mika, P. (2003) ‘Spectacle’, Towards the Semantic Web, Wiley & Sons Ltd, pp.145–159.

Gandon, F. and Sadeh, N. (2004) ‘Semantic web technologies to reconcile privacy and context awareness’, Journal of Web Semantics, Vol. 1, pp.241–260.

Garofalakis, J., Panagis, J., Sakkopoulos, E. and Tsakalidis, A. (2004) ‘Web service discovery mechanisms: looking for a needle in a haystack?’, International Workshop on Web Engineering, Hypermedia Development & Web Engineering Principles and Techniques: Put them in use, in conjunction with ACM Hypertext 2004, Santa Cruz, 10th August.

Garofalakis, J., Sakkopoulos, E., Sirmakessis, S. and Tsakalidis, A. (2002) ‘Integrating adaptive techniques into virtual university learning environment’, Proceedings of the IEEE International Conference on Advanced Learning Technologies, pp.28–33.

Goldfarb, C.F. and Prescod, P. (2000) The XML Handbook, 2nd ed., Prentice Hall.

Gruber, T.R. (1993) ‘A translation approach to portable ontologies’, Knowledge Acquisition, Vol. 5, No. 2, pp.199–220.

Kotsiantis, S., Kanellopoulos, D. and Pintelas, P. (2004) ‘Multimedia mining’, WSEAS Transactions on Systems, Vol. 3, No. 10, pp.3263–3268.

Makris, C., Panagis, J., Sakkopoulos, E. and Tsakalidis, A. (2004) Efficient Search Algorithms for Large Scale Web Service Data, Technical Report RA Computer Technology Institute No 2004/09/01, Patras, Greece.

McIlraith, S. and Martin, D. (2003) ‘Bringing semantics to web services’, IEEE Intelligent Systems, Vol. 18, No. 1, pp.90–93.

Mineau, G., Moulin, B. and Sowa, J. (1993) ‘Conceptual graphs for knowledge representation’, LNCS in AI, Vol. 699, pp.294–301.

Muschamp, P. (2004) ‘An introduction to web services’, BT Technology Journal, Vol. 22, No. 1, pp.9–18.

Nielsen, J. (2000) Designing Web Usability: The Practice of Simplicity, New Riders Publishing, Indianapolis.

Oliver, L.W. and Zack, J.S. (1999) ‘Career assessment on the Internet: an exploratory study’, Journal of Career Assessment, Vol. 7, pp.323–356.

OWL-S (2005) Available at: http://www.w3.org/Submission/OWL-S.

Patil, A., Oundhakar, S., Sheth, A. and Verma, K. (2004) ‘METEOR-S web service annotation framework’, Proceedings of WWW 2004 Conference, pp.553–562.

Pozza, E.A., Maffia, L.A., Braga, J.L., da Silva, C.A.B. and Cerqueira, F.G. (1998) ‘Development and evaluation of an expert system for diagnosis of tomato diseases’, Proceedings of the 7th International Conference on Computers in Agriculture, Orlando-FL, October, pp.431–437.

Robinson, N.K., Meyer, D., Prince, J.P., McLean, C. and Low, R. (2000) ‘Mining the internet for career information: a model approach for college student’, Journal of Career Assessment, Vol. 8, pp.37–54.

Roure, D., Jennings, N. and Shadbolt, N. (2003) The Semantic Grid: A Future e-Science Infrastructure, Available at: http://www.semanticgrid.org.

Scott, D. and Sharp, R. (2002) ‘Developing secure web applications’, IEEE Internet Computing, Vol. 6, No. 6, pp.38–45.

Sycara, K., Paolucci, M., Ankolekar, A. and Srinivasan, N. (2003) ‘Automated discovery, interaction and composition of semantic web services’, Journal of Web Semantics, Vol. 1, pp.27–46.

Taber, B.J. and Luzzo, D.A. (1999) A Comprehensive Review of Research Evaluating the Effectiveness of Discover in Promoting Career Development, ACT Research Report Series 99-3, ACT, Inc., Iowa City, IA.

W3C SOAP 1.2 Primer (2003) Available at: http://www.w3.org/TR/2003/REC-soap12-part0-20030624/.

Wales, E. (2003) ‘Web services security’, Computer Fraud and Security, Vol. 18, No. 1, pp.15–17.

Wang, H., Huang, J., Qu, Y. and Xie, J. (2004) ‘Web services: problems and future directions’, Journal of Web Semantics, Vol. 1, pp.309–320.

Watts, A.G., Kidd, J.M. and Knasel, E. (1991) ‘PROSPECT(HE): an evaluation of user responses’, British Journal of Guidance and Counselling, Vol. 19, No. 1, pp.66–80.

Web Service Modelling Ontology (2005) Available at: http://www.wsmo.org.
