[Image: Kandinsky, Thirty (Trente), 1937]

A Scientific Language

IEML is an acronym for Information Economy MetaLanguage. It is the result of thirty years of fundamental research under the direction of Pierre Lévy, fourteen years of which were funded by the Canadian federal government through the Canada Research Chair in Collective Intelligence at the University of Ottawa (2002-2016).

For whom is it intended?

IEML is a multidisciplinary project at the confluence of AI, data science, linguistics, digital humanities, and philosophy. Because the metalanguage has computable semantics, it will be of interest to people working in artificial intelligence, business intelligence, and data science. It also proposes new uses and a new theory of metadata that are relevant to researchers in heritage conservation (libraries, museums), digital humanities, and data journalism. Finally, since IEML increases collective intelligence, it will be of interest to practitioners in knowledge management, collaborative learning, and digital communications.

Today, semantic interoperability among databases, languages, disciplines, etc. is a problem for many professionals and researchers in the above-mentioned fields. In addition, after several years of deep learning frenzy, there is a renewed interest in symbolic AI (or at least in a synthesis between statistical and symbolic AI), and IEML is a powerful symbolic tool.

Main properties

As of 2020, IEML is the only language that has the following three properties:

– it has the expressive power of a natural language;

– it has the syntax of a regular language;

– its semantics is unambiguous and computable, because it is aligned with its syntax.

In other words, it is a “well-formed symbolic system”, which comprises a bijection between a set of relations between signifieds, or meanings (a language) and a set of relations between signifiers (an algebra) and which can be manipulated by a set of symmetrical and automatic operations. 

On the basis of these properties, IEML can be used as a concept coding system that solves the problem of semantic interoperability in an original way, lays the foundations for a new generation of artificial intelligence and allows collective intelligence to be reflexive. IEML complies with Web standards and can be exported in RDF. IEML expressions are called USLs (Uniform Semantic Locators). They can be read and translated into any natural language. Semantic ontologies – sets of IEML expressions linked by a network of relations – are interoperable by design. IEML provides the coordinate system of a common knowledge base that feeds both automatic reasoning and statistical calculations. In sum, IEML fulfills the promise of the Semantic Web through its computable meaning and interoperable ontologies.

IEML’s grammar consists of three layers: morphemes, syntagms and texts. Examples of morphemes and syntagms can be found at https://dev.intlekt.io/.

Morphemes

Morphemes are the basic building blocks, or elementary concepts, from which all language expressions are composed. A dictionary of about 5000 morphemes translated into natural languages is given with IEML and shared among all its users. Semantic interoperability comes from the fact that everyone shares the same set of morphemes whose meanings are fixed. The dictionary is organized into tables and sub-tables related to the same theme and the morphemes are defined reciprocally through a network of explicit semantic relations. IEML allows the design of an unlimited variety of concepts from a limited number of morphemes. 

The user does not have to worry about the rules from which the morphemes are constructed. However, they are regularly generated from six primitive symbols forming the “layer 0” of the language, and since the generative operation is recursive, the morphemes are stratified on six layers above layer 0.
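The stratification can be pictured with a toy generator. The sketch below is only an illustration and not the actual IEML generative grammar: the six letters are placeholder labels for the layer-0 primitives, and `compose` is a hypothetical ternary operation that builds one expression of layer n + 1 from three expressions of layer n. It enumerates combinations exhaustively and does not model how the real dictionary selects its morphemes.

```python
from itertools import product

# Placeholder labels for the six layer-0 primitives (an assumption for illustration).
PRIMITIVES = ("E", "U", "A", "S", "B", "T")
MAX_LAYER = 6  # the text states six layers above layer 0


def compose(a, b, c):
    """Hypothetical generative operation: three layer-n expressions
    become one layer-(n + 1) expression."""
    return (a, b, c)


def layer_of(expr):
    """Return the layer of an expression by following its nesting depth."""
    if isinstance(expr, str):
        return 0
    return 1 + layer_of(expr[0])


def generate(layer, seeds):
    """Yield every expression of the given layer that can be built from `seeds`."""
    if layer == 0:
        yield from seeds
        return
    lower = list(generate(layer - 1, seeds))
    for a, b, c in product(lower, repeat=3):
        yield compose(a, b, c)


# Only two seeds are used here to keep the combinatorics small.
layer1 = list(generate(1, PRIMITIVES[:2]))
layer2 = list(generate(2, PRIMITIVES[:2]))
print(len(layer1), len(layer2), layer_of(layer2[0]))
```

Because the same operation is applied at every step, the layer of an expression can be read directly from its nesting depth, which is the sense in which the morphemes are "stratified".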

Syntagms 

Using the morpheme dictionary and grammar rules, users can freely model a field of knowledge or practice within IEML. These models can be original or translate existing classifications, ontologies or semantic metadata.

Each syntagm (word or sentence) corresponds to a distinct concept that can be translated, according to its author’s indications and its grammatical role, as a verb (encourage), a noun (courage), an adjective (courageous) or an adverb (bravely). 

Lexemes 

The basic unit of syntagms is the lexeme. A lexeme is a pair composed of two small sets of morphemes: content and inflection. The choice of content morphemes is free, but inflection morphemes are selected from a closed list of morpheme tables corresponding to adverbs, prepositions, postpositions, articles, conjugations, declensions, modes, etc. (see “auxiliary morphemes” in https://dev.intlekt.io/)
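As a data structure, a lexeme can be sketched as a pair of two sets. The snippet below is a minimal illustration under stated assumptions: the class name `Lexeme`, the English placeholder labels, and the three-item `INFLECTION_MORPHEMES` list are all hypothetical and stand in for the real auxiliary morpheme tables.

```python
from dataclasses import dataclass

# Hypothetical closed list of inflection morphemes; the real list lives in the
# auxiliary morpheme tables of the IEML dictionary and is not reproduced here.
INFLECTION_MORPHEMES = frozenset({"PLURAL", "PAST", "DEFINITE"})


@dataclass(frozen=True)
class Lexeme:
    content: frozenset                   # freely chosen morphemes
    inflection: frozenset = frozenset()  # restricted to the closed list

    def __post_init__(self):
        # Reject any inflection morpheme that is not in the closed list.
        unknown = self.inflection - INFLECTION_MORPHEMES
        if unknown:
            raise ValueError(f"not inflection morphemes: {sorted(unknown)}")


# A lexeme with one content morpheme and one inflection morpheme.
gifts = Lexeme(content=frozenset({"gift"}), inflection=frozenset({"PLURAL"}))
print(gifts)
```

The asymmetry between the two fields mirrors the text: content is open, while inflection is validated against a fixed list.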

Syntagmatic roles 

The lexemes are distributed on a syntagmatic tree composed of a root (verbal or nominal) and eight leaves corresponding to the roles of classical grammar: subject, object, complement of time, place, etc. 

[Figure: The nine syntagmatic roles]

The root of the syntagm can be a process (a verb), a substance, an essence, an affirmation of existence… 

The initiator is the subject of a process, answering the question “who?”. It can also define the initial conditions, the prime mover, the first cause of the concept evoked by the syntagm.

The interactant corresponds to the object of classical grammar. It answers the question “what”. It also plays the role of medium in the relationship between the initiator and the recipient. 

The recipient is the beneficiary (or the victim) of a process. It answers the questions “for whom, to whom, towards whom?”. 

The Time answers the question “when?”. It indicates the moment in the past, the present or the future and gives references as to anteriority, posteriority, duration, date and frequency. 

The Place answers the question “where?”. It indicates the location, spatial distribution, pace of movement, paths, spatial relationships and metaphors.

The Intention answers the question of finality, purpose, motivation: “for what?”, “to what end?”. It concerns mental orientation, direction of action, pragmatic context, emotion or feeling.

The Manner answers the questions “how?” and “how much?”. It situates the syntagm on a range of qualities or on a scale of values. It specifies quantities, gradients, measurements and sizes. It also indicates properties, genres and styles.

The Causality answers the question “why?”. It specifies logical, material and formal determinations. It describes causes that have not been specified by the initiator, the interactant or the recipient: media, instruments, effects, consequences. It also describes the units of measurement and methods. It may also specify rules, laws, reasons, points of view, conditions and contracts.

For example: Robert (initiator) offers (root-process) a gift (interactant) to Mary (recipient) today (time) in the garden (place), to please her (intention), with a smile (manner), for her birthday (causality).
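The example sentence can be written down as a mapping from roles to lexemes. This is a minimal sketch, assuming a plain dictionary as the representation; the `Role` enumeration and the `gloss` helper are hypothetical names, and English phrases stand in for actual IEML lexemes.

```python
from enum import Enum


class Role(Enum):
    ROOT = "root"
    INITIATOR = "initiator"
    INTERACTANT = "interactant"
    RECIPIENT = "recipient"
    TIME = "time"
    PLACE = "place"
    INTENTION = "intention"
    MANNER = "manner"
    CAUSALITY = "causality"


# The eight leaves are every role except the root.
LEAF_ROLES = [r for r in Role if r is not Role.ROOT]

# English glosses stand in for IEML lexemes.
syntagm = {
    Role.ROOT: "offers",
    Role.INITIATOR: "Robert",
    Role.INTERACTANT: "a gift",
    Role.RECIPIENT: "to Mary",
    Role.TIME: "today",
    Role.PLACE: "in the garden",
    Role.INTENTION: "to please her",
    Role.MANNER: "with a smile",
    Role.CAUSALITY: "for her birthday",
}


def gloss(s):
    """Render an English gloss: initiator, then root, then the other leaves in role order."""
    order = [Role.INITIATOR, Role.ROOT] + LEAF_ROLES[1:]
    return " ".join(s[r] for r in order if r in s)


print(gloss(syntagm))
```

Any leaf can be omitted from the mapping; `gloss` simply skips roles that are not filled.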

Junctions 

IEML allows the junction of several lexemes in the same syntagmatic role. This can be a logical connection (and, inclusive or, exclusive or), a comparison (same as, different from), an ordering (larger than, smaller than…), an antinomy (but, in spite of…), and so on.

Layers of complexity

[Figure: Two layers of syntagmatic complexity: 73 roles]

A lexeme that plays one of the eight leaf roles at complexity layer 1 can play the role of secondary root at complexity layer 2, and so on recursively up to layer 4. By convention, syntagms are named according to their complexity layer as follows:

– 0 lexeme

– 1 word

– 2 super-word

– 3 sentence

– 4 super-sentence
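The figure of 73 roles for two layers follows from simple counting: one root, eight leaves, and eight further leaves under each of those leaves (1 + 8 + 64). The sketch below reproduces that arithmetic; the counts for layers 3 and 4 are an extrapolation that assumes every leaf is fully expanded, so they should be read as upper bounds rather than documented figures.

```python
LEAVES_PER_ROOT = 8  # leaf roles under each root or secondary root

LAYER_NAMES = {
    0: "lexeme",
    1: "word",
    2: "super-word",
    3: "sentence",
    4: "super-sentence",
}


def role_count(layer):
    """Number of role slots when every leaf is expanded down to `layer`."""
    return sum(LEAVES_PER_ROOT ** depth for depth in range(layer + 1))


for layer, name in LAYER_NAMES.items():
    print(layer, name, role_count(layer))
```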

Literals

Strictly speaking, IEML can express only general categories or concepts. It is nevertheless possible to insert numbers, units of measurement, dates, geographical positions, proper names, etc. into a syntagm, provided they are categorized in IEML. For example, t.u.-t.u.-‘. [23] means ‘number: 23’. Individual names, numbers, etc. are called literals in IEML.

Texts 

Relations 

A semantic relationship is a syntagm in a special format that is used to link a source syntagm to a target syntagm. IEML includes a query language enabling easy programming of semantic relationships on a set of syntagms. 

By design, a semantic relationship makes the following four points explicit.

1. The function that connects the source syntagm and the target syntagm.

2. The mathematical form of the relation: equivalence relationship, order relationship, intransitive symmetrical relationship or intransitive asymmetrical relationship.

3. The kind of context or social rule that validates the relationship: syntax, law, entertainment, science, learning, etc.

4. The content of the relationship: logical, taxonomic, mereological (whole-part relationship), temporal, spatial, quantitative, causal, or other. The relation can also concern the reading order or the anaphora.
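The four points above can be read as the fields of a record. The following is a sketch under stated assumptions, not the IEML query language: the names `Form` and `Relation` are hypothetical, and plain strings stand in for the function, context and content that IEML would express as syntagms.

```python
from dataclasses import dataclass
from enum import Enum


class Form(Enum):
    EQUIVALENCE = "equivalence"
    ORDER = "order"
    INTRANSITIVE_SYMMETRICAL = "intransitive symmetrical"
    INTRANSITIVE_ASYMMETRICAL = "intransitive asymmetrical"


@dataclass(frozen=True)
class Relation:
    function: str  # 1. what connects the source to the target
    form: Form     # 2. mathematical form of the relation
    context: str   # 3. context or social rule that validates it
    content: str   # 4. kind of content (logical, taxonomic, mereological...)

    def is_symmetric(self):
        return self.form in (Form.EQUIVALENCE, Form.INTRANSITIVE_SYMMETRICAL)

    def is_transitive(self):
        return self.form in (Form.EQUIVALENCE, Form.ORDER)


part_of = Relation("is part of", Form.ORDER, "science", "mereological")
print(part_of.is_symmetric(), part_of.is_transitive())
```

Making the mathematical form explicit is what lets a program decide, for instance, whether a relation may be followed transitively across a network.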

The (hyper) textual network

An IEML text is a network of semantic relationships between syntagms. This network can describe linear successions, trees, matrices, cliques, cycles and complex subnetworks of all types.

An IEML text can be considered as a theory, an ontology, or a narrative that accounts for the dataset it is used to index.

We can define a USL as an ordered (normalized) set of triples of the form: (a source syntagm, a target syntagm, a relationship syntagm). A set of such triples describes a semantic network or IEML text.
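The point of normalizing is that two descriptions of the same network yield the same USL. The sketch below illustrates this with short strings standing in for syntagms; sorting the triples lexicographically is an assumption made for the example and is not claimed to be the ordering IEML actually uses.

```python
def normalize(triples):
    """Return a canonical form: duplicate triples removed, remaining triples sorted."""
    return tuple(sorted(set(triples)))


def syntagms(usl):
    """Return the nodes of the network: every syntagm used as a source or a target."""
    return sorted({node for source, target, _relation in usl for node in (source, target)})


# Each triple is (source, target, relationship), in the order given in the text.
text_a = [("tree", "forest", "part-of"), ("oak", "tree", "kind-of")]
text_b = [
    ("oak", "tree", "kind-of"),
    ("tree", "forest", "part-of"),
    ("oak", "tree", "kind-of"),  # duplicate, dropped by normalization
]

usl_a = normalize(text_a)
usl_b = normalize(text_b)
print(usl_a == usl_b, syntagms(usl_a))
```

Because the normalized form is a hashable tuple, it can serve directly as a dictionary key or be compared for equality, which is what an addressing scheme needs.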

The following special cases can be noted:

– The network may contain only one syntagm.

– The syntagm may contain only one root to the exclusion of other syntactic roles.

– The root may contain only one lexeme (no junction).

– The lexeme may contain only one morpheme.

******* 

In short, IEML is a language with computable semantics that can be considered from three complementary points of view: linguistics, mathematics and computer science. Linguistically, it is a philological language, i.e. it can translate any natural language. Mathematically, it is a topos, that is, an algebraic structure (a category) in isomorphic relation with a topological space (a network of semantic relations). Finally, on the computer side, it functions as the indexing system of a virtual database and as a programming language for semantic networks.

Competition

As of today, IEML has no equivalent. However, let’s discuss two somewhat similar projects – one technical (the Semantic Web) and one linguistic (Lojban).

IEML and the Semantic Web

In terms of long-term vision, IEML’s main competitor is the World Wide Web Consortium’s “Semantic Web,” including the RDF and OWL standards. The Semantic Web project was formulated at the end of the 20th century and is based on the availability of inference engines and ontologies (rule-based systems representing domain knowledge) of the kind developed as “expert systems” in the 1970s and 1980s. However, at that time not all computers were interconnected and the problem of semantic interoperability was not as acute as it is today. Yes, the Semantic Web makes it possible to compute semantic relationships and make logical inferences, but only within an ontology. There are thousands of different existing ontologies and translations between ontologies must be done “by hand.”

The Semantic Web and IEML have some features in common. In particular, both aim to “represent knowledge” for automatic reasoning. But they are two different endeavors: the Semantic Web is a set of standards for data import/export, namely XML (Extensible Markup Language), and for the description of logical relations between data, namely RDF (Resource Description Framework) and OWL (Web Ontology Language). None of these standards has, as IEML does, the properties of a philological language. The Semantic Web standards have been designed for computing the truth of propositions in whatever natural language they have been expressed in, while IEML computes the semantic relations between any IEML morphemes, words, sentences and texts (and IEML may be automatically translated into natural languages). IEML and OWL operate on very different levels but are not mutually exclusive: you can write IEML texts in XML, RDF, or OWL files.

During its two decades in existence, the Semantic Web has produced many useful applications. But it has also shown that problems in semantic computing and interoperability can’t be solved by logic and classifications alone. IEML, being both a formal and a philological language, automatically computes, generates, and recognizes an infinity of concepts and their semantic relations. In short, IEML is not a universal ontology; it is a language with computable semantics that can express any ontology.

The World Wide Web Consortium project represents semantic relationships by means of logical formalism and its nodes are arbitrary sequences of characters (like URLs). In contrast, IEML represents semantic relationships through linguistic formalism with nodes (USLs, uniform semantic locators) that are expressions of a single regular language. Inherent to IEML expressions, semantic relationships are represented here in a much more parsimonious way.

Finally, for the Semantic Web, URLs are the last layer of (physical) addressing of data and a conceptual addressing system is impossible or even reprehensible. In contrast, IEML proposes in the long term a universal addressing system for concepts on the Internet: USLs (uniform semantic locators).

IEML and Lojban

People who are interested in artificial languages always mention Lojban as a possible competitor to IEML because it has a logical and regular grammar.

Lojban’s grammar is inspired by pre-existing logical formalisms and not by an original, flexible, effective, and general abstract algebra like IEML.
The meaning of Lojban words is solely defined by a correspondence with natural languages and not by a coincidence of signifier (syntax) and signified (semantic) paradigms as found in IEML. Moreover, in Lojban the level of morphemes does not meet the standards of a regular language. In contrast, IEML semantics are entirely generative. In short, IEML is the only artificial language that – once given its semantic primitives – defines its semantics by means of its syntax.
Finally, Lojban is made to be spoken, while IEML was designed from the beginning as a scientific writing system: a semantic metadata system capable of indexing all aspects of human activity, one that can be manipulated and “understood” by computers and that solves the problem of semantic interoperability.

Philosophical and anthropological perspective

The human species can be defined by its special ability to manipulate symbols. Each great augmentation in this ability has brought enormous economic, social, political, religious, epistemological, educational (and so on) changes.

There have been only four of these big changes. The first one is related to the invention of writing, when symbols became permanent and reified. The second one corresponds to the invention of the alphabet, Indian numerals and other small groups of symbols able to represent “almost everything” by their combination. The third one is the invention of the printing press and the subsequent invention of electronic mass media. In this case, the symbols were reproduced and transmitted by industrial machines. We are currently at the beginning of a fourth big anthropological change because the symbols can now be transformed by massively distributed automata in the digital realm. We have not yet invented the symbolic systems and cultural institutions fitting the new algorithmic medium. So my research in the past 20 years has been devoted to the invention of a symbolic system able to exploit the computational power, the capacity of memory and the ubiquity of the Internet.

This is the main motivation behind my work on IEML: I took up the challenge of inventing a symbolic system that makes the most of the new digital environment to serve human cognitive augmentation better.


Prof. Pierre Lévy, PhD., University of Montreal
Fellow of the Royal Society of Canada