Mobile Information Systems

Advances in Mobile Networking for IoT Leading the 4th Industrial Revolution

Research Article | Open Access | Volume 2018 | Article ID 1359174 | https://doi.org/10.1155/2018/1359174

Methodology for Automatic Ontology Generation Using Database Schema Information

JungHyen An and Young B. Park

Academic Editor: Jeongyeup Paek

Received 14 Dec 2017 | Accepted 04 Mar 2018 | Published 02 May 2018

Abstract

An ontology is a modeling language that supports integrating conceptually distributed domain knowledge and inferring relationships among concepts. Because ontologies are developed from knowledge of the target domain, methodologies that automatically generate an ontology from metadata characterizing that knowledge are becoming important. However, existing methodologies require the domain metadata to be prepared in a predetermined template, and when the OWL (Web Ontology Language) individuals of the domain grow continuously, the growing data are difficult to manage within the ontology itself. A database schema reflects the domain knowledge and provides structures for processing knowledge-based data efficiently. In this paper, we propose a methodology that automatically generates ontologies and manages OWL individuals through interaction between the database and the ontology. We describe the automatic ontology generation process with an example schema and demonstrate the effectiveness of the generated ontology by comparing it with existing ontologies using an ontology quality score.

1. Introduction

An ontology is a modeling language for building models that support the conceptual integration of distributed domain data and the inference of relationships among concepts. Such models result from activities such as concept analysis and domain modeling carried out with a standard methodology [1]. The importance of ontologies is recognized in particular in areas such as knowledge engineering, context awareness, knowledge integration, and knowledge management and modeling.

When an existing ontology cannot be reused, a new one must be developed. Developing an ontology involves creating attributes and constraints, creating a model, and applying it to domain data [2]. The process resembles specifying the requirements of a software architecture: as in software development, ontology developers need to discuss domain concepts, relationships, and constraints with domain experts [3–6].

Because this process consumes a lot of manpower, current research studies methods that define an ontology model automatically by describing the domain as metadata that characterize it and applying rules to those metadata. Yahia et al. automatically generate ontologies from XML data sources [7]. Dey et al. and related studies conceptually classify fuzzy data and describe how to generate an ontology and the rules for doing so [8–10]. The Clonto Framework automatically generates an ontology by applying a suffix tree clustering algorithm to a document that describes the domain information [11].

Methodologies that generate an ontology automatically from metadata must first preprocess the domain metadata into a template to which the ontology-generating rules can be applied [12]. They also do not address how to manage the generated ontology model when it accumulates many individuals. Individuals entered into the generated ontology model can be stored in a database table in triple form, consisting of a subject, a predicate, and an object. This approach makes it possible to provide efficient management and query functions for the individuals of the corresponding schema [13, 14]. The individual is the basic component of an ontology. The role of individuals in an ontology is to classify objects according to their class, which is a concept of the domain [15]. Individuals in OWL correspond to constants in first-order logic and to instances in the Resource Description Framework.
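As a minimal sketch of the triple-table idea mentioned above, the following Python example stores individuals in a single relational table using the standard RDF triple form of subject, predicate, and object. The table name `triple`, the helper names, and the use of an in-memory SQLite database are assumptions made for illustration; they are not taken from the cited studies.

```python
import sqlite3


def make_triple_store():
    """Create an in-memory database with one (hypothetical) triple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triple ("
        " subject TEXT NOT NULL,"
        " predicate TEXT NOT NULL,"
        " object TEXT NOT NULL,"
        " PRIMARY KEY (subject, predicate, object))"
    )
    return conn


def add_individual(conn, name, cls, properties):
    """Store one individual: a class-membership triple plus one triple per property."""
    rows = [(name, "rdf:type", cls)]
    rows += [(name, prop, str(value)) for prop, value in properties.items()]
    conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", rows)


def individuals_of(conn, cls):
    """Return the names of all individuals asserted to belong to the given class."""
    cur = conn.execute(
        "SELECT subject FROM triple"
        " WHERE predicate = 'rdf:type' AND object = ?"
        " ORDER BY subject",
        (cls,),
    )
    return [row[0] for row in cur]
```

With this layout, a call such as `individuals_of(conn, "Employee")` is an ordinary indexed SQL query, which is the kind of management and query support the paragraph refers to.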

In this paper, we propose a methodology that automatically generates an ontology model from database metadata and converts individuals into database tuples when many individuals accumulate in the generated ontology. The methodology builds an OWL-DL-level ontology from the schema information, which is the metadata of the relational database, and converts the individuals of the ontology into relational database records.
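The conversion of individuals into database tuples can be pictured with a short Python sketch. It assumes, purely for illustration, that each individual is available as a dictionary from property name to value and that the property names match the column names of the target table; the function names and the use of SQLite are hypothetical and do not describe the authors' implementation.

```python
import sqlite3


def individual_to_tuple(columns, individual):
    """Order an individual's property values by the table's columns.

    Properties the individual does not define become None (SQL NULL).
    """
    return tuple(individual.get(col) for col in columns)


def store_individuals(conn, table, columns, individuals):
    """Insert each individual as one row of the given table.

    The table and column names are assumed to come from the trusted schema,
    so they are interpolated directly; values are passed as parameters.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"
    conn.executemany(sql, [individual_to_tuple(columns, ind) for ind in individuals])
```

Because the ontology class is derived from a table in the first place, the mapping back from an individual to a row needs no extra template: the column list of the source table fixes the order of the tuple.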

A relational database is one of the most common ways to store domain data in a structured form [16]. As a result, the schema of a database that stores domain data carries the characteristics of that domain. In addition, a database table is a conceptual model that groups similar data in the domain. Generating an ontology from an existing database therefore has the advantage that the generated ontology can express the characteristics of each part of the domain well, across a wide range of domains.

We applied an ontology quality metric to determine whether the ontology generated automatically from the database schema is adequate for actual domain applications. Whether an ontology is good cannot be evaluated in absolute terms, because an ontology has different characteristics depending on the domain to which it is applied; it is possible, however, to determine how well the ontology fits the domain [17]. In this paper, we define a metric of how well an ontology reflects the domain knowledge, compare the ontology generated by our process with other ontologies, and show the effectiveness of automatic ontology generation from the database schema.

The remainder of the paper is organized as follows: Section 2 introduces existing work on data construction for automatic ontology generation and on individual management. Section 3 describes the automatic ontology generation process. Section 4 describes the process of managing ontology individuals using a database management system. Section 5 shows the process and results of automatic ontology generation using a sample database schema. In Section 6, the ontology quality score is used to verify how well the generated ontology expresses the domain by comparing it with other ontologies. The final section concludes with a discussion of future research.

2. Related Works

2.1. Automatic Ontology Generation Using Metadata

Frameworks such as TANGO [18] and TARTAR [19] automatically generate an ontology from metadata that contain the structure and characteristics of the domain data. In these frameworks, the components commonly found in the data are organized in tabular form, and the tables are analyzed to generate the components of the ontology model. TANGO analyzes each table, generates a semiontology for each one, and connects the semiontologies into a kernel ontology from which the final ontology is generated. For applications, TANGO supports functions such as multiple-source query processing, semantic web creation, and superimposed information generation. TARTAR automatically transforms tabular data in formats such as HTML, PDF, and Excel into a formal (structural and semantic) template and provides it to users through an internal engine. In the process, a table attribute ontology in OWL format, linked to the data of each table, is generated automatically.

Long [20] implemented an agent-based approach to table recognition and interpretation, in which an agent recognizes tabular data and the attributes of each table and generates an ontology in RDF format. The study explains how to extract tables from text files, how to evaluate table analysis tasks, and the RDF-based Table Analysis Framework. Within it, the RDF-Based Blackboard Framework generates an RDF file by annotating different printed tables and analyzes the tables through the generated RDF.

Studies on automatic ontology generation from relational databases define a notation for each component of the database and the ontology and generate the ontology from the database through rules that use the relations among these components [21, 22]. These researchers used the Jena Framework in a Java program to read and analyze the metadata of a database and applied the rules to create an ontology model. They also used Jena to implement the ontology model and to generate documentation and RDF graphs.
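The general pattern described above, reading database metadata and applying mapping rules, can be sketched in a few lines of Python. The three rules used here (a table becomes a class, a foreign-key column becomes an object property, any other column becomes a datatype property) are simplified assumptions chosen for illustration and are not the rule sets of the cited studies, which used Java and Jena. The sketch reads the metadata through SQLite's `sqlite_master` table and its `table_info` and `foreign_key_list` pragmas; the function name and the tuple layout of the output are hypothetical.

```python
import sqlite3


def schema_to_axioms(conn):
    """Map SQLite schema metadata to a flat list of ontology-like axioms.

    Illustrative rules (assumptions, not the cited rule sets):
      table              -> ("Class", table)
      foreign-key column -> ("ObjectProperty", name, domain table, referenced table)
      other column       -> ("DatatypeProperty", name, domain table, SQL type)
    """
    axioms = []
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master"
            " WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            " ORDER BY name"
        )
    ]
    for table in tables:
        axioms.append(("Class", table))
        # foreign_key_list rows: (id, seq, referenced_table, from_col, to_col, ...)
        fks = {
            row[3]: row[2]
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        }
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, col, sql_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            prop = f"{table}.{col}"
            if col in fks:
                axioms.append(("ObjectProperty", prop, table, fks[col]))
            else:
                axioms.append(("DatatypeProperty", prop, table, sql_type))
    return axioms
```

A real generator would serialize these axioms as OWL and would need further rules for primary keys, constraints, and class hierarchies; the sketch only shows where the schema metadata enters the process.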

Alalwan et al. [23] explained the overall process and rules for automatically generating an OWL ontology from a database schema, with the goal of merging the data of each database through ontologies in a distributed database environment. The rules applied in this paper to generate ontologies automatically are based on the rules of that study, which we integrate and whose conditions we generalize. In that study, the rule for class fragmentation, which concerns the hierarchy of classes generated from a database table, is defined by the following formula: