Breaking through the glass ceiling of NLU with Upper Ontology

Original article was published by Constantin Kogan on Artificial Intelligence on Medium

Natural language is full of ambiguity and domain-specific jargon. Current methods of natural language processing (NLP) and natural language understanding (NLU) are still far from perfect at extracting the meaning of written or spoken language. Although OpenAI has achieved impressive results with its GPT-3 model, it’s still far away from a real understanding of what is said.

Traditional pipelines that extract meaning from text documents or other source material step by step still fail in many cases, even when they use the latest deep learning technologies. Too many aspects of language remain ambiguous and challenging. Humans, by contrast, understand the world, and they have a more or less clear picture of what the world and our existence are about. We know the past, present, and future. We have a clear perception of space and objects, and we know about abstraction and models.

This makes humans very good at communicating with and understanding other humans. Further, they form special groups and teams to dive deeper into topics or to set themselves apart from others. Language is about identity. A group of engineers talks and writes differently from a group of medical doctors. Documents used inside an insurance company are very different from Reddit discussions about the latest fashion.

What is an Upper Ontology?

An established approach to transferring knowledge into a computer-readable format is the use of ontologies. Ontologies are graph-based structures that represent knowledge in a similar way to taxonomies: you define concepts and the relationships between them. Often an “is-a” relationship is used to show the connection between different terms. This creates a wide network of relationships, and these relationships give meaning to known things. An ontology that many people encounter daily is the Google Knowledge Graph.
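To make the idea of an “is-a” network concrete, here is a minimal sketch of a toy ontology stored as a graph. The concepts and the helper names (`IS_A`, `ancestors`, `is_kind_of`) are hypothetical and chosen only for illustration; real ontologies use richer relation types and dedicated formats.

```python
from collections import deque

# Hypothetical toy ontology: each concept maps to its direct "is-a" parents.
IS_A = {
    "poodle": ["dog"],
    "dog": ["mammal"],
    "cat": ["mammal"],
    "mammal": ["animal"],
    "animal": [],
}


def ancestors(concept, graph=IS_A):
    """Return every concept reachable from `concept` by following is-a edges."""
    seen = set()
    queue = deque(graph.get(concept, []))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(graph.get(parent, []))
    return seen


def is_kind_of(concept, candidate, graph=IS_A):
    """True if `concept` is `candidate` itself or inherits from it via is-a."""
    return concept == candidate or candidate in ancestors(concept, graph)
```

With this structure, `is_kind_of("poodle", "animal")` holds even though no direct edge connects the two concepts. The meaning comes from the chain of relationships, not from any single entry.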

It covers knowledge about 570 million entities and 18 billion facts.

The concept of an ontology is used widely across different domains. Each ontology covers a specific field of interest, and you can find many of them on the internet. They cover science and research as well as day-to-day communication like news, blogs, and wikis. So a wide variety of knowledge is available in the form of ontologies.

Image, Semantic Interoperability with Upper Ontology:
The Foundation Ontology as a Basis for Semantic Interoperability Patrick Cassidy MICRA, Inc., Plainfield, NJ

But what do you do when you want to connect all these ontologies to create a canonical structure with a broader scope? This is where the idea of an Upper Ontology comes into play. When you want to understand natural language with all its flexibility and switching of topics, you need a universal and basic structuring method. Upper Ontologies offer this flexibility, as they define the fundamentals of understanding and connect other domain-specific ontologies.
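The connecting role described above can be sketched in a few lines. This is a simplified illustration, not a real alignment method: the upper categories, the two domain vocabularies, and the function `align` are all assumptions made up for this example.

```python
# Hypothetical upper-level categories shared by all domain ontologies.
UPPER = {"Object", "Process", "TimeInterval"}

# Two toy domain ontologies; each term is mapped to one upper category.
MEDICAL = {
    "patient": "Object",
    "surgery": "Process",
    "hospital stay": "TimeInterval",
}
INSURANCE = {
    "policyholder": "Object",
    "claim handling": "Process",
    "coverage period": "TimeInterval",
}


def align(domain_a, domain_b, upper=UPPER):
    """For each term in domain_a, list the terms in domain_b that share its upper category."""
    for domain in (domain_a, domain_b):
        unknown = set(domain.values()) - upper
        if unknown:
            raise ValueError(f"categories missing from the upper ontology: {unknown}")
    return {
        term_a: sorted(t for t, cat in domain_b.items() if cat == cat_a)
        for term_a, cat_a in domain_a.items()
    }
```

Note that `align(MEDICAL, INSURANCE)` only returns candidates of the same fundamental kind. It does not claim that "surgery" and "claim handling" mean the same thing, only that both are processes and can be reasoned about with the same basic machinery.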

The basics of Upper Ontology

An Upper Ontology provides the semantic interoperability needed to infer meaning across multiple domains. Interoperability is achieved by offering general concepts that capture the most fundamental understanding of our world. This is not an easy task, as schools of thinking and perceptions of order differ among scholars and practitioners. In general, however, there is a common understanding that being itself and the world consist of categories and individuals, time and space, and objects and processes. At this foundational level, the discussion quickly turns philosophical, which leads to disagreement on details or fights between schools of thinking. Asking which useful aspects an Upper Ontology describes often helps to identify the school of thinking behind it.

Image, Outline of categories of the DOLCE Upper Ontology
Ontology Engineering, Lecture 6: Top-down Ontology Development I, by Maria Keet

The charm of an upper ontology is its top-down approach, which lays the foundations for future extensions. In this way, it is possible to adapt and reuse existing domain-specific ontologies. The basic categories and relations create a higher order that covers all fields. Once this order is established, natural language understanding with an upper ontology can cover all the different needs in one system.

Individuals and their categories

When talking about an upper ontology, also called a top-level (or meta) ontology or an upper model, one main point is the definition of categories. As a precise definition of what a category is seems to be a complex endeavor, upper ontologies focus on the distinction between categories and individuals. Individuals, also called particulars, are defined as concrete, spatiotemporal entities. They can be persons, material objects, or events that have a definite place in space and time. In addition, things that can be identified through individuals are also individuals themselves. These are abstract things like thoughts, or physical expressions of emotions like smiles.

Categories are different, and they also have their place in Upper Ontologies. Their main characteristic is that they can be instantiated. That means that other categories or individuals can be of the kind the given category describes. All this happens on a very abstract level, which helps with the interoperability of domain-specific ontologies. The challenge is that categories are hard to define in only one way. The name and the aspects of a category depend on the school of thinking of its creator. Categories can have a history and an intention, and they can follow different axioms. Categories can follow a continuous spectrum of dimensions, or they can be defined inconsistently.
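The distinction between categories and individuals can be sketched as two small data types. This is a minimal illustration under the assumption that each category has at most one parent; the names `Category`, `Individual`, and `instantiates`, as well as the example entries, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Category:
    """Something that can be instantiated by individuals or refined by other categories."""
    name: str
    parent: Optional["Category"] = None


@dataclass(frozen=True)
class Individual:
    """A concrete particular that belongs to a category."""
    name: str
    category: Category


def instantiates(individual, category):
    """True if the individual's category is `category` or one of its descendants."""
    current = individual.category
    while current is not None:
        if current == category:
            return True
        current = current.parent
    return False


# Toy example: two categories below a common root, and one individual.
entity = Category("Entity")
person = Category("Person", parent=entity)
event = Category("Event", parent=entity)
ada = Individual("Ada Lovelace", person)
```

Here `ada` instantiates both `person` and the more abstract `entity`, but not `event`. Different schools of thinking would draw the category tree differently, which is exactly where the disagreement described above begins.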

All things are in time and space

Our perception of the world is defined by the continuity of time. We experience the past, present, and future, and this is reflected in our language. To understand time, it’s important to define it as an interoperable concept, so that different expressions of time in domain-specific ontologies can be aligned. In this way, artifacts like schedules expressed in natural language can lead to an unambiguous understanding of time. Two different convictions prevail here. One treats time as continuous, in the sense of intervals; the other focuses on points in time. On this foundation, an Upper Ontology creates a unified concept of time.
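One simple way to reconcile the interval view and the point view is to treat a point in time as an interval of zero length. The sketch below follows that assumption; it is only one possible modeling choice, and the class `Interval`, the helper `point`, and the example dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interval:
    """A closed time interval; start and end may coincide."""
    start: datetime
    end: datetime

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end must not precede start")

    def overlaps(self, other):
        """True if the two closed intervals share at least one moment."""
        return self.start <= other.end and other.start <= self.end


def point(moment):
    """Model a point in time as a zero-length interval."""
    return Interval(moment, moment)


# Toy schedule: one interval-based entry and two point-based entries.
meeting = Interval(datetime(2020, 10, 5, 9, 0), datetime(2020, 10, 5, 10, 0))
call = point(datetime(2020, 10, 5, 9, 30))
deadline = point(datetime(2020, 10, 5, 17, 0))
```

With a shared representation, a domain ontology that speaks of durations and one that speaks of timestamps can be compared directly, for example with `meeting.overlaps(call)`.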

The concept of space is similar to the perception of time. We think in three dimensions, and we expect all physical things to behave accordingly. We expect things to have dimensions and a precise scale. This basic understanding is covered by the Upper Ontology. However, we can express space in an absolute manner or talk about it in relative coordinates, and so do different domain ontologies. Some focus on precise definitions of space; others neglect all details. The Upper Ontology helps to connect these different ontologies and arrive at a unified, context-aware meaning.
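As a small illustration of absolute versus relative space, the following sketch resolves a relative description ("three units above the reference object") into absolute coordinates. It assumes a shared three-dimensional Cartesian frame, which real domain ontologies may not provide; `Position` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """An absolute position in a shared three-dimensional frame."""
    x: float
    y: float
    z: float


def resolve(reference, offset):
    """Turn an offset (dx, dy, dz) relative to `reference` into an absolute position."""
    dx, dy, dz = offset
    return Position(reference.x + dx, reference.y + dy, reference.z + dz)


# Toy example: a shelf located three units above a desk.
desk = Position(1.0, 2.0, 0.0)
shelf = resolve(desk, (0.0, 0.0, 3.0))
```

Once both descriptions are expressed in the same frame, statements from an ontology that uses exact coordinates and one that uses relative placement can be checked against each other.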

The existence of Objects and Processes

In natural language understanding, the understanding of objects and processes is quite important. Things change over time, and they move from one place to another. We need to keep track of that in order to understand. Things fall down, and they obey the laws of physics. People change their names when they get married, and they still remain the same person. Titles are awarded, and products and companies get new names for marketing reasons.

These different representations need to be reconciled with what exists in real life. A logically consistent model of the identity of objects and processes is needed to reflect the world accurately. So, when different domain-specific ontologies focus on different aspects of the same object or phenomenon, the Upper Ontology needs to be able to connect these different views of the world.

An Upper Ontology needs to assign identity conditions to objects and processes that establish their existence and continuity as an entity. In this way, different contexts reveal different aspects of an entity, but the entity stays the same as long as the identity conditions are fulfilled. Change over time can be tracked, and entities can emerge and disappear. This understanding of the world helps to get things right in longer passages of text, where we need to identify the named things and processes correctly to get their meaning right.
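A minimal sketch of this idea is an entity whose identity is carried by a stable identifier while its names change over time. Using an identifier as the identity condition is an assumption made for this example; upper ontologies define identity conditions in far more nuanced ways. The class `Entity`, the function `same_entity`, and the example person are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An entity whose identity is its id; names are attributes that change over time."""
    entity_id: str
    names: list = field(default_factory=list)  # (year, name) pairs

    def rename(self, year, name):
        """Record that the entity carries `name` from `year` onward."""
        self.names.append((year, name))

    def name_at(self, year):
        """Return the name valid in `year`, or None if the entity had no name yet."""
        current = None
        for since, name in sorted(self.names):
            if since <= year:
                current = name
        return current


def same_entity(a, b):
    """Toy identity condition: two references denote the same entity if the ids match."""
    return a.entity_id == b.entity_id


# Toy example: a person who changes her name after marriage.
jane = Entity("person-001")
jane.rename(1990, "Jane Miller")
jane.rename(2015, "Jane Smith")
```

A text that mentions "Jane Miller" in 2000 and "Jane Smith" in 2020 then resolves to the same entity, because `jane.name_at(2000)` and `jane.name_at(2020)` are two names of one identifier.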

It’s not easy to decide which Upper Ontology is the right one

Upper ontologies help NLU by offering universal, general categories for semantic interoperability. They define concepts that are essential for understanding meaning. The challenge we face is the wide range of Upper Ontologies on offer. There is the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE), the Basic Formal Ontology (BFO), the General Formal Ontology (GFO), the Suggested Upper Merged Ontology (SUMO), and the list goes on.

When we compare the different Upper Ontologies, we see that the school of thinking of their creators shines through in the design. For example, DOLCE is fundamentally conceptualist in its methodology, while BFO is fundamentally realist. In DOLCE, qualities are abstract entities without foundation in space or time, and they do not have parts. In BFO, what DOLCE calls qualities are called tropes; they are located in space and exist in time, like the entities in which they inhere.

Image, Upper Ontology architecture for NLU:
The Foundation Ontology as a Basis for Semantic Interoperability Patrick Cassidy MICRA, Inc., Plainfield, NJ

As each Upper Ontology sets different emphases, implementing a specific NLU system requires an analysis of the existing ontologies to find the one that matches your needs. Practical decision criteria help to create a useful tool. The Upper Ontology knowledge then needs to be fused with the traditional interpretation process, which relies on linguistically rich grammar formalisms.

However, connecting the world’s existing domain-specific knowledge into one model is a very promising way to tackle the prevalence of vague and ambiguous expressions in natural language. Real understanding can be achieved by drawing on the massive number of existing domain ontologies and by resolving ambiguity in its linguistic context with background knowledge. Another route is to build a completely new model of how we interpret meaning.