Most of today’s information systems are highly heterogeneous and complex. Considerable effort and cost go into interlinking systems so that they can communicate with each other and thus overcome heterogeneity. The semantic web plays a significant role in the way it captures and links knowledge, making the web’s content understandable for machine-to-machine interactions. Here, ontologies serve as a technology to capture, infer and verify knowledge and to make it available so that participating agents can reach a common understanding.
This paper describes how ontologies are used in practice to help overcome heterogeneity in information systems. After a review of basic semantic technologies and standards like OWL and SPARQL, we discuss a variety of methods and tools of the semantic web. In more detail, we investigate ontology editors, especially the Protégé tool as a well-established open-source application to create, edit and share ontologies. Finally, we explore a variety of practical applications where ontologies are of great use.
Content
1. Introduction
2. Revision of Semantic Concepts
2.1 Knowledge and Semantic Web
2.1.1 DIKW Pyramid
2.1.2 Semantic Web Technology Stack
2.1.3 Description Logics
2.2 Ontologies, XML and RDF(S)
2.2.1 Ontologies
2.2.2 XML
2.2.3 RDF(S)
2.3 OWL and SPARQL
2.3.1 OWL
2.3.2 SPARQL
3. Ontology-based Information Integration
3.1 Methods
3.1.1 Mappings
3.1.2 Ontology Integration Architectures
3.1.3 Reasoning, Inferring, Expert Systems
3.2 Tools
3.2.1 Semantic Web Tools
3.2.2 Ontology Editors
4. Ontology Editor - Protégé
4.1 General Background
4.2 Historical Background
4.3 Features
4.4 Internals
4.5 Building Ontologies
4.6 Critical Appraisal
5. Practical Applications
5.1 Industry Solutions
5.2 Established Ontologies
5.3 Biomedical Science and Protégé
6. Conclusion
7. References
1. Introduction
Enormous costs and efforts arise when the information and data of distributed, heterogeneous information systems are integrated with the intention to create value. It is often a challenge to establish a basis for a commonly agreed understanding of the content of knowledge bases. For example, in the field of medicine, a tremendous number of expert terms make up the knowledge physicians have to work with. Those terms are related and have logical interrelations with other terms.
The web has a high potential as a platform to store and share knowledge. The problem is that web content was initially made to be understandable for human users, not machines. Thus, the meaning of terms and phrases is not necessarily clear to non-human agents. As a result, systems that process this content automatically can turn out defective.
The semantic web is a web concept in which the semantic meaning of web content is made explicit and can be examined and analyzed. A particularly valuable ability is inferring new knowledge. Ontologies help here because they are formal and explicit representations of concepts with intrinsic logical relationships. The W3C and the open-source community have developed a variety of standards to represent ontologies in an expressive, formal and explicit way, including semantic relationships. Furthermore, many methods and tools are available to support programmers, scientists and decision makers in the creation of ontologies. In particular, the ontology editor Protégé is a well-established tool to create, edit and visualize ontologies and to reason over them.
The remainder of this paper is organized as follows: Section 2 reviews some of the fundamental semantic concepts necessary to create ontologies in OWL, the Web Ontology Language. Section 3 is devoted to a description of methods and tools for information integration based on ontologies. In Section 4 we present the Protégé tool and discuss its characteristics in detail. Section 5 gives insight into some major practical applications where ontologies are used. Section 6 summarizes the main findings.
2. Revision of Semantic Concepts
Semantic concepts are technologies and standards that describe the content of the web in a machine-readable format enriched by semantic meaning and logical interrelations between terms. In this section we give insight into the definition of knowledge and the interdependencies between semantic web technologies. Furthermore, we define the basics of the logics that are represented in ontologies to infer new knowledge from existing domain concepts. Various technologies are combined to build up OWL (Web Ontology Language) and SPARQL (SPARQL Protocol And RDF Query Language), which represent and query knowledge in ontologies.
2.1 Knowledge and Semantic Web
2.1.1 DIKW Pyramid
To understand the semantic web, it is necessary to understand what knowledge is. According to the knowledge pyramid, information is raw data extended by meaning, whereas knowledge is information extended by a context. Thus, knowledge can be derived from an information base and a context. Later we will call this approach reasoning (see Chapter 3.1). Figure 1 illustrates an adapted version of the original knowledge pyramid extended by examples [9],[36].
illustration not visible in this excerpt
Figure 1 - DIKW (Data Information Knowledge Wisdom) Pyramid [9]
2.1.2 Semantic Web Technology Stack
The semantic web technology stack (Figure 2) comprises the standards and technologies of the semantic web. The foundation layer of the stack encompasses the standards for symbols and resources as a web platform. URIs (Uniform Resource Identifiers) serve to identify web content and are used together with a web protocol like HTTP (Hypertext Transfer Protocol). Sharing of structured information is supported by solutions like XML (Extensible Markup Language). Graph-based data models are created, for example, with the RDF model (Resource Description Framework), which incorporates URIs. Supported by a strong vocabulary and logical interdependencies, OWL (Web Ontology Language) can then serve as the global language of ontologies, with even more expressiveness than RDFS (RDF Schema). At the top, the logical interdependencies built into the proof layer and OWL provide the ability to infer knowledge and to discover new relations [36].
illustration not visible in this excerpt
Figure 2 - Semantic Web Technology Stack [28]
2.1.3 Description Logics
Unlike propositional logic, which deals only with entire propositions, and first-order logic, for which automated reasoning is inefficient and in general undecidable, description logic is expressive enough to represent information together with its semantics while offering the logical capability to infer new knowledge [4]. Furthermore, its properties are formally well defined, and its reasoning algorithms are well known. Basic constructors in description logics include:
Atomic negations
Concept intersection
Universal restrictions
Limited existential quantification
Nominals
Inverse properties
Cardinality restrictions
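Several of the constructors listed above can be made concrete with a toy set-based interpretation. The following Python sketch is only an illustration under assumed data: the individuals, the concept names and the HAS_TOPPING role are hypothetical, and a real reasoner works on axioms rather than on one fixed interpretation.

```python
# Toy interpretation: concepts are sets of individuals, roles are sets of pairs.
# All individuals and names below are hypothetical illustration data.
DOMAIN = {"margherita", "salami_pizza", "mozzarella", "salami"}
PIZZA = {"margherita", "salami_pizza"}
CHEESE_TOPPING = {"mozzarella"}
HAS_TOPPING = {("margherita", "mozzarella"), ("salami_pizza", "salami")}


def negation(concept):
    """Atomic negation: everything in the domain that is not in the concept."""
    return DOMAIN - concept


def intersection(left, right):
    """Concept intersection: individuals that belong to both concepts."""
    return left & right


def exists(role, concept):
    """Existential restriction: individuals with some role filler in concept.

    Passing DOMAIN as the concept gives the limited form (a filler of any kind).
    """
    return {x for (x, y) in role if y in concept}


def for_all(role, concept):
    """Universal restriction: individuals whose role fillers all lie in concept."""
    return {x for x in DOMAIN if all(y in concept for (s, y) in role if s == x)}


def inverse(role):
    """Inverse property: swap subject and object of every pair."""
    return {(y, x) for (x, y) in role}


def at_least(n, role):
    """Cardinality restriction: individuals with at least n distinct fillers."""
    return {x for x in DOMAIN if len({y for (s, y) in role if s == x}) >= n}


# A "cheesy pizza" is a pizza that has some cheese topping.
cheesy_pizza = intersection(PIZZA, exists(HAS_TOPPING, CHEESE_TOPPING))
print(cheesy_pizza)
```

Note that the universal restriction also holds for individuals without any filler, which is why for_all returns more than just pizzas in this toy domain.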
2.2 Ontologies, XML and RDF(S)
To understand OWL, which is introduced later, it is essential to examine the properties of ontologies carefully. OWL builds on several basic technologies like XML and RDF(S).
2.2.1 Ontologies
As Gruber formulated, an ontology is “an explicit, formal specification of a shared conceptualization” [17]. Thus, it represents concepts and their relationships within a specific domain in a formal and explicit way. As we have already seen, ontologies can be modelled and analyzed with technologies of the semantic web technology stack.
Ontologies serve several purposes. They represent shared common knowledge of a specific domain and facilitate the reuse and analysis of this knowledge. Furthermore, they declare semantics explicitly and enable knowledge sharing among various agents such as software or people. Ontologies also help to make clear, expressive statements.
An ontology consists of classes and their properties, as well as individuals (instances) and semantic relationships [31],[44]. Some of the most used relationships are:
Meronymy (“part of”)
Holonymy (“the whole of”)
Synonymy (“equal”)
Antonymy (“opposite”)
Hyponymy (specialization)
Hypernymy (generalization)
Figure 3 shows an example of an ontology about pizzas. As we can see, the ontology helps to clarify what a “cheesy pizza” is. We can identify specializations like “CheesyPizza is a Pizza” as well as special relations like “CheesyPizza hasTopping CheeseTopping”.
illustration not visible in this excerpt
Figure 3 - Exemplary pizza ontology (based on [20])
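The two statements read off Figure 3 can be written down as simple subject-relation-object tuples. The Python sketch below is a hypothetical, minimal encoding: the relation name isA and the helper functions are assumptions made for illustration and are not part of the cited ontology [20].

```python
# Statements taken from the text; "isA" is an assumed name for specialization.
statements = [
    ("CheesyPizza", "isA", "Pizza"),
    ("CheesyPizza", "hasTopping", "CheeseTopping"),
]


def related(subject, relation):
    """Return every object linked to the subject by the given relation."""
    return [o for (s, r, o) in statements if s == subject and r == relation]


def describe(concept):
    """Build a short readable definition from the stored statements."""
    parents = ", ".join(related(concept, "isA"))
    toppings = ", ".join(related(concept, "hasTopping"))
    return f"{concept}: a kind of {parents} with {toppings}"


print(describe("CheesyPizza"))
```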
2.2.2 XML
XML (Extensible Markup Language) is a tag-based meta-language. All elements, attributes and content are delimited by named markup tags. The content is always plain text or further tagged elements, arranged in a tree structure.
From this point of view, XML is only of limited use for building knowledge: it provides a shareable format, is a widespread web standard and can be used to develop markup languages for specific domains. Ultimately, however, XML can only describe syntax, not semantics or relations. XML tags are largely meaningless for software agents [31].
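The limitation can be illustrated with two small documents that state the same fact using different tag names. The sketch below uses Python's standard xml.etree.ElementTree module; both documents and all tag names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents describing the same dish with different vocabularies.
doc_a = "<pizza name='Margherita'><topping>cheese</topping></pizza>"
doc_b = "<dish title='Margherita'><ingredient>cheese</ingredient></dish>"


def tag_paths(xml_text):
    """Return the tag path of every element in document order."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        paths.append(path)
        for child in element:
            walk(child, path)

    walk(root, "")
    return paths


paths_a = tag_paths(doc_a)
paths_b = tag_paths(doc_b)
print(paths_a)
print(paths_b)
# The parser recovers the tree structure, but the two vocabularies share no path:
# nothing in the syntax says that "topping" and "ingredient" mean the same thing.
print(set(paths_a) & set(paths_b))
```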
2.2.3 RDF(S)
RDF (Resource Description Framework) is a data model for providing metadata about web resources. Its core consists of triple statements that describe resources and attributes in a simple subject-predicate-object relation, where the subject is the resource, the predicate is the relation and the object is a resource or a literal.
On the one hand, RDF supports formalizing knowledge because it describes semantics in a machine-readable, formal, explicit and standardized way. On the other hand, it neither offers a format to convey content nor can it describe knowledge on an instance level. RDF alone is still not expressive enough to describe ontologies. Combining RDF with XML (RDF/XML) makes it possible to represent the syntax as well. However, it would still not be possible to represent class descriptions.
RDFS (RDF Schema) compensates for some of the disadvantages RDF has when representing ontologies. It is a domain-neutral, formal schema language which provides a basic structure for classes and their properties. At this point, some features needed to represent ontologies are still missing, e.g. advanced logics to infer new knowledge from existing knowledge [31].
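The subject-predicate-object structure can be sketched with a few plain Python data classes. This is an illustration of the data model only, not an RDF library; the namespace http://example.org/pizza# and all resource names are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    """A node or relation identified by a URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain data value such as a string."""
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Resource
    predicate: Resource
    obj: Union[Resource, Literal]  # the object may be a resource or a literal


EX = "http://example.org/pizza#"  # hypothetical namespace

triples = [
    Triple(Resource(EX + "Margherita"), Resource(EX + "hasTopping"),
           Resource(EX + "Mozzarella")),
    Triple(Resource(EX + "Margherita"), Resource(EX + "label"),
           Literal("Pizza Margherita")),
]


def object_kind(triple):
    """Tell whether the object of a triple is a resource or a literal."""
    return "literal" if isinstance(triple.obj, Literal) else "resource"


for t in triples:
    print(t.subject.uri, t.predicate.uri, object_kind(t))
```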
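What the class and property structure of RDFS adds can be sketched with four of its standard entailment patterns: subclass transitivity, type inheritance along subclasses, and typing through rdfs:domain and rdfs:range. The following pure-Python sketch applies them until nothing new is derived. The ex: names are hypothetical, and the prefixed strings merely stand in for full URIs.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

# Hypothetical schema and one instance statement.
graph = {
    ("ex:CheesyPizza", SUBCLASS, "ex:Pizza"),
    ("ex:Pizza", SUBCLASS, "ex:Food"),
    ("ex:hasTopping", DOMAIN, "ex:Pizza"),
    ("ex:hasTopping", RANGE, "ex:Topping"),
    ("ex:myLunch", "ex:hasTopping", "ex:mozzarella"),
}


def rdfs_closure(triples):
    """Apply the four patterns repeatedly until a fixed point is reached."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            for (s2, p2, o2) in inferred:
                # Subclass relations are transitive.
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # An instance of a class is an instance of its superclasses.
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # The subject of a property is typed by the property's domain.
                if p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))
                # The object of a property is typed by the property's range.
                if p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_closure(graph)
for triple in sorted(closure - graph):
    print(triple)
```

The sketch derives that ex:myLunch is a pizza and therefore food, but it cannot conclude that it is a CheesyPizza: that would need a class definition of the kind OWL provides.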
2.3 OWL and SPARQL
The two most widespread standards in the area of ontology creation are OWL and SPARQL. Ontology editors make heavy use of them. Other ontology standards are SHOE and OIL (DAML+OIL).
2.3.1 OWL
OWL stands for “Web Ontology Language” and is a W3C standard. It extends the former semantic standards by expressive definitions of classes and properties, as well as semantics based on description logic. There are two OWL versions (OWL 1.0, 2004 and OWL 2.0, 2009) and three sub-languages (OWL Lite, OWL DL and OWL Full) available. The sub-languages differ in their expressiveness and decidability. OWL is known as the de-facto language of global ontologies. Based on a strong vocabulary and the use of expressive description logic, the consistency and satisfiability of ontologies can be checked. Furthermore, reasoning and inference of new knowledge are supported [21],[37].
OWL elements include namespaces, an ontology header (with metadata about the ontology), class and subclass definitions, properties and their characteristics, restrictions, mappings and individuals. The new vocabulary includes many options to logically combine classes (disjunction, equivalence, complement), to restrict relations by cardinalities and to define characteristics of properties such as transitivity, symmetry, functionality and inverse relations.
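The effect of property characteristics on inference can be illustrated with a small fixed-point computation. The Python sketch below covers transitive, symmetric and inverse properties only. It is not an OWL reasoner, and all property and individual names are hypothetical.

```python
def apply_property_axioms(facts, transitive=(), symmetric=(), inverse_of=None):
    """Derive all facts implied by the declared property characteristics."""
    inverses = dict(inverse_of or {})
    # An inverse declaration holds in both directions.
    inverses.update({q: p for p, q in list(inverses.items())})
    closure = set(facts)
    while True:
        new = set()
        for (s, p, o) in closure:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverses:
                new.add((o, inverses[p], s))
            if p in transitive:
                for (s2, p2, o2) in closure:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closure:
            return closure
        closure |= new


# Hypothetical facts and property declarations.
facts = {
    ("Naples", "isLocatedIn", "Campania"),
    ("Campania", "isLocatedIn", "Italy"),
    ("Margherita", "hasTopping", "Mozzarella"),
    ("Margherita", "isSimilarTo", "Marinara"),
}

entailed = apply_property_axioms(
    facts,
    transitive={"isLocatedIn"},
    symmetric={"isSimilarTo"},
    inverse_of={"hasTopping": "isToppingOf"},
)
for fact in sorted(entailed - facts):
    print(fact)
```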
2.3.2 SPARQL
SPARQL is a graph-based query language for RDF and OWL documents. Triple patterns in the query are matched against the triples in the data, and the result of a query is the set of combinations of matches. The syntax is similar to SQL (SELECT, FROM, WHERE clauses). A SPARQL query consists of prefix declarations (namespace URIs), a query result clause (result forms, dataset sources and query pattern) and optional query modifiers [41].
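How triple patterns are matched against data triples can be sketched in a few lines of pure Python. The code below is a simplified illustration of basic graph pattern matching, not a SPARQL engine: it ignores prefixes, result forms and modifiers, and the ex: data is hypothetical. The SPARQL query it mimics is shown in the comment.

```python
# Mimicked query (hypothetical vocabulary):
#   SELECT ?pizza WHERE {
#     ?pizza rdf:type ex:Pizza .
#     ?pizza ex:hasTopping ?topping .
#     ?topping rdf:type ex:CheeseTopping .
#   }
data = [
    ("ex:Margherita", "rdf:type", "ex:Pizza"),
    ("ex:Margherita", "ex:hasTopping", "ex:Mozzarella"),
    ("ex:Mozzarella", "rdf:type", "ex:CheeseTopping"),
    ("ex:Diavola", "rdf:type", "ex:Pizza"),
    ("ex:Diavola", "ex:hasTopping", "ex:SalamiSlices"),
]

patterns = [
    ("?pizza", "rdf:type", "ex:Pizza"),
    ("?pizza", "ex:hasTopping", "?topping"),
    ("?topping", "rdf:type", "ex:CheeseTopping"),
]


def match_pattern(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def run_query(patterns, triples):
    """Return every binding that satisfies all patterns at once."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


results = run_query(patterns, data)
print([b["?pizza"] for b in results])
```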
3. Ontology-based Information Integration
A variety of methods and tools exist to overcome heterogeneity in information systems with the help of ontologies and the semantic web. Ontologies support this process by enabling automatic, semantics-oriented interoperability between machines. Supporting methods and concepts are mainly mappings, architectural styles and reasoners. Many tools like semantic frameworks or ontology editors are available to programmers and engineers to put their concepts into practice [36].
3.1 Methods
Semantic heterogeneity arises when the meaning that can be concluded from a schema is interpreted in diverging ways. With the help of ontologies and their logical interdependencies, concepts can be inferred from the implicit semantics of the schema. With the specification and shared knowledge of a domain, followed by logical inference, semantic heterogeneity can be overcome.
Ontologies help to link semantically heterogeneous systems because they deliver a vocabulary to describe the concepts and relations of formal models. Furthermore, they act as global schemas of the mediation layer, a layer of indirection between users/applications and the data source layer. They offer explicit definitions of terms and relationships so that data from multiple different sources can be interpreted accurately. In addition, ontologies provide a global query schema and verification techniques to ensure correctness across multiple sources [1],[36].
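The role of a global schema in the mediation layer can be illustrated with a small sketch. It assumes two hypothetical data sources whose local field names are mapped to shared terms, so that a single query against the global vocabulary reaches both sources. All source names, field names and records are invented for illustration.

```python
# Hypothetical mappings from local field names to terms of a shared vocabulary.
MAPPINGS = {
    "clinic_a": {"pat_name": "patientName", "dx": "diagnosis"},
    "clinic_b": {"patient": "patientName", "finding": "diagnosis"},
}

# Hypothetical source data, each source using its own schema.
SOURCES = {
    "clinic_a": [{"pat_name": "Smith", "dx": "influenza"}],
    "clinic_b": [
        {"patient": "Jones", "finding": "influenza"},
        {"patient": "Lee", "finding": "fracture"},
    ],
}


def to_global(source, record):
    """Rewrite a local record into the terms of the global schema."""
    mapping = MAPPINGS[source]
    return {mapping[field]: value for field, value in record.items()}


def query_global(term, value):
    """Answer a query posed in global terms across all sources."""
    results = []
    for source, records in SOURCES.items():
        for record in records:
            unified = to_global(source, record)
            if unified.get(term) == value:
                results.append(unified)
    return results


patients = [r["patientName"] for r in query_global("diagnosis", "influenza")]
print(patients)
```

The applications only see the global terms patientName and diagnosis; the mappings hide the fact that each source names these fields differently.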
[...]
- Quote paper
- Kevin Rudolph (Author), 2015, The Use of Ontologies in Practice, Munich, GRIN Verlag, https://www.grin.com/document/300018