
Today, artificial intelligence is divided between two major trends: symbolic and statistical. The symbolic branch corresponds to what has been successively called, over the last 70 years, semantic networks, expert systems, the semantic web and, more recently, knowledge graphs. Symbolic AI codes human knowledge in the form of networks of relationships between concepts ruled by models and ontologies which give leverage to automatic reasoning. The statistical branch of AI trains algorithms to recognize visual, linguistic or other forms from large masses of data, relying on neural models roughly imitating the learning mode of the brain. Neuro-mimetic artificial intelligence has existed since the beginnings of computer science (see the work of McCulloch and von Foerster) but has only become useful because of the increase in computing power available since 2010. In the early 2020s, these two currents are merging according to a hybrid or neuro-symbolic model which seems very promising, though many problems remain in terms of the consistency and interoperability of metadata.

Big tech companies and a growing number of scientific, economic and social sectors use knowledge graphs. Despite the availability of the World Wide Web Consortium (W3C) metadata standards for marking classifications and ontologies (RDF, OWL), the different sectors (see the slide below) do not communicate with each other and – even worse – divergent systems of categories and relationships are most often in use within the same domain. The interoperability of metadata standards – such as RDF – only addresses the compatibility of digital files. It should not be confused with true semantic interoperability, which addresses concept architectures and models. In reality, the problem of semantic interoperability has yet to be solved in 2021, and there are many causes for the opacity that plagues digital memory. Natural languages are multiple, informal, ambiguous and changing. Cultures and disciplines tend to divide reality in different ways. Finally, the numerous metadata systems in place to classify data (thesauri, documentary languages, ontologies, taxonomies, folksonomies, sets of tags or hashtags, keywords, etc.), often inherited from the age of print, are incompatible with one another.

The Conundrum of Semantic Interoperability

There is currently no way to code linguistic meaning in a uniform and computable way, the way we code images using pixels or vectors for instance. To represent meaning, we are still using natural languages which are notoriously multiple, changing and ambiguous. With the notable exception of number notation and mathematical codes, our writing systems are primarily designed to represent sounds. Their representation of categories or concepts is indirect (characters → sound → concepts) and difficult for computers to grasp. Computers can handle syntax (the regular arrangement of characters), but their handling of semantics remains imperfect and laborious. Despite the success of machine translation (DeepL, Google Translate) and automatic text generation (GPT-3), computers don’t really understand the meaning of the texts they read or write.

Now, how can we resolve the problem of semantic interoperability and progress towards a thorough automatic processing of meaning? Many advances in computer science come from the invention of a relevant coding system making the coded object (number, image, sound, etc.) easily computable. The goal of our company INTLEKT Metadata Inc. has been to make concepts, categories or linguistic meaning systematically computable. In order to solve this problem, we have designed the Information Economy MetaLanguage: IEML. This metalanguage has a compact dictionary of fewer than 5000 words. IEML words are organized by subject-oriented paradigms and visualized as keyboards. The grammar of this metalanguage is completely regular and embedded in the IEML editor. Thanks to this grammar, complex concepts and relations can be recursively constructed by combining simpler ones. It is not a super-ontology (like Cyc) but a programmable language (akin to a computable Esperanto) able to translate any ontology and to connect any possible categories. By using such a semantic code, artificial intelligence could take a giant step forward in feeding collective intelligence. Public health data from all countries would not only be able to communicate with each other, but could also harmonize with economic and social data. Occupational classifications and different international labour market statistics would automatically translate into each other. The AI of smart contracts, international e-commerce and the Internet of Things would exchange data and execute instructions based on automatic reasoning. Government statistics, national libraries, major museums and digital humanities research would feed into each other. On the machine learning side, we would reach a system of uniform and precise labels and annotations that would help AI to become more ethical, transparent, and efficient.
A common semantic code would make it finally possible to achieve a de-fragmentation of the global memory and an integration of symbolic and statistical AI. The only price to pay for reaching neuro-symbolic collective intelligence would be a concerted effort for training specialists to translate metadata into IEML.

Check our prototype: https://dev.intlekt.io/

  • Once you are on the site, you can choose between French and English at the top right.
  • “USL” (Uniform Semantic Locator) allows the search for words and paradigms in the dictionary.
  • “Tags” gives you some examples of USLs grouped by domain.
  • If you are in “USL”, the search for IEML expressions (instead of natural language translations) is done by typing * at the beginning of the query.
  • Type: choose “all”.
  • Class: filters nouns, verbs or auxiliaries.
  • Cardinality: choose “root” paradigms (big tables, or multi-table paradigms), or the (small) tables, or singular = individual words. It is recommended to explore the dictionary by “roots”.
  • When you click on a search result, the corresponding paradigm appears on the right.
  • The right panel presents certain relations according to the selected words.

IEML is patented (provisional: US 63/124,924) and belongs to INTLEKT Metadata Inc.

Vassili Kandinsky: Circles in a Circle

A Scientific Language

IEML is an acronym for Information Economy MetaLanguage. IEML is the result of many years of fundamental research under the direction of Pierre Lévy, fourteen years of which were funded by the Canadian federal government through the Canada Research Chair in Collective Intelligence at the University of Ottawa (2002-2016). In 2020, IEML is the only language that has the following three properties:

– it has the expressive power of a natural language;

– it has the syntax of a regular language;

– its semantics is unambiguous and computable, because it is aligned with its syntax.

In other words, it is a “well-formed symbolic system”, which comprises a bijection between a set of relations between signifieds, or meanings (a language) and a set of relations between signifiers (an algebra) and which can be manipulated by a set of symmetrical and automatic operations. 

On the basis of these properties, IEML can be used as a concept coding system that solves the problem of semantic interoperability in an original way, lays the foundations for a new generation of artificial intelligence and allows collective intelligence to be reflexive. IEML complies with Web standards and can be exported in RDF. IEML expressions are called USLs (Uniform Semantic Locators). They can be read and translated into any natural language. Semantic ontologies – sets of IEML expressions linked by a network of relations – are interoperable by design. IEML provides the coordinate system of a common knowledge base that feeds both automatic reasoning and statistical calculations. In sum, IEML fulfills the promise of the Semantic Web through its computable meaning and interoperable ontologies. IEML’s grammar consists of four layers: elements, words, sentences and texts. Examples of elements and words can be found at https://dev.intlekt.io/.

Elements

The semantic elements are the basic building blocks, or elementary concepts, from which all language expressions are composed. A dictionary of about 5000 elements translated into natural languages is given with IEML and shared among all its users. Semantic interoperability comes from the fact that everyone shares the same set of elements whose meanings are fixed. The dictionary is organized into tables and sub-tables related to the same theme and the elements are defined reciprocally through a network of explicit semantic relations. IEML allows the design of an unlimited variety of concepts from a limited number of elements. 

Example of an elements paradigm in the IEML dictionary

The user does not have to worry about the rules from which the elements are constructed. However, they are regularly generated from six primitive symbols forming the “layer 0” of the language, and since the generative operation is recursive, the elements are stratified on six layers above layer 0.

Words

Using the elements dictionary and grammar rules, users can freely model a field of knowledge or practice within IEML. These models can be original or translate existing classifications, ontologies or semantic metadata.

The basic unit of an IEML sentence is the word. A word is a pair composed of two small sets of elements: the radical and the inflection. The choice of radical elements is free, but inflection elements are selected from a closed list of elements tables corresponding to adverbs, prepositions, postpositions, articles, conjugations, declensions, modes, etc. (see “auxiliary morphemes” in https://dev.intlekt.io/)
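
To make the radical / inflection pair concrete, here is a minimal Python sketch. It is not the IEML implementation: the class name `Word`, the helper `make_word` and the four entries of `INFLECTION_ELEMENTS` are hypothetical, and English labels stand in for real dictionary elements. It only illustrates the constraint stated above: the radical is chosen freely, while the inflection must come from a closed list.

```python
from dataclasses import dataclass

# Hypothetical closed list of inflection elements. The real tables are the
# "auxiliary morphemes" of the IEML dictionary; these labels are placeholders.
INFLECTION_ELEMENTS = frozenset({"plural", "past", "definite", "negation"})


@dataclass(frozen=True)
class Word:
    """A word: a pair (radical, inflection) of small sets of elements."""
    radical: frozenset
    inflection: frozenset = frozenset()

    def __post_init__(self):
        # The radical is free, but it cannot be empty.
        if not self.radical:
            raise ValueError("a word needs at least one radical element")
        # The inflection may only draw on the closed list.
        unknown = self.inflection - INFLECTION_ELEMENTS
        if unknown:
            raise ValueError(f"not inflection elements: {sorted(unknown)}")


def make_word(radical, inflection=()):
    """Build a Word from any iterables of element labels."""
    return Word(frozenset(radical), frozenset(inflection))


courage = make_word({"courage"})
courages = make_word({"courage"}, {"plural"})
```

Because both parts are sets, two words built from the same elements compare equal whatever the order in which the elements were given.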

Each word or sentence corresponds to a distinct concept that can be translated, according to its author’s indications and its grammatical role, as a verb (encourage), a noun (courage), an adjective (courageous) or an adverb (bravely). 

Sentences 

The words are distributed on a grammatical tree composed of a root (verbal or nominal) and eight leaves corresponding to the roles of classical grammar: subject, object, complement of time, place, etc. 

The nine grammatical roles


The Root of the sentence can be a process (a verb), a substance, an essence, an affirmation of existence… 

The Initiator is the subject of a process, answering the question “who?”. It can also define the initial conditions, the prime mover, the first cause of the concept evoked by the root.

The Interactant corresponds to the object of classical grammar. It answers the question “what”. It also plays the role of medium in the relationship between the initiator and the recipient. 

The Recipient is the beneficiary (or the victim) of a process. It answers the questions “for whom, to whom, towards whom?”. 

The Time answers the question “when?”. It indicates the moment in the past, the present or the future and gives references as to anteriority, posteriority, duration, date and frequency. 

The Place answers the question “where?”. It indicates the location, spatial distribution, pace of movement, paths, spatial relationships and metaphors. 

The Intention answers the question of finality, purpose, motivation: “for what?”, “to what end?”. It concerns mental orientation, direction of action, pragmatic context, emotion or feeling.

The Manner answers the questions “how?” and “how much?”. It situates the root on a range of qualities or on a scale of values. It specifies quantities, gradients, measurements and sizes. It also indicates properties, genres and styles.

The Causality answers the question “why?”. It specifies logical, material and formal determinations. It describes causes that have not been specified by the initiator, the interactant or the recipient: media, instruments, effects, consequences. It also describes the units of measurement and methods. It may also specify rules, laws, reasons, points of view, conditions and contracts.

For example: Robert (initiator) offers (root-process) a gift (interactant) to Mary (recipient) today (time) in the garden (place), to please her (intention), with a smile (manner), for her birthday (causality). 
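
The nine roles can be pictured as a record with one mandatory slot (the root) and eight optional ones. The sketch below is only an illustration under that assumption: the class `Sentence` and its field names are hypothetical, and plain English strings stand in for IEML words.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Sentence:
    """One root plus eight optional leaves, one per grammatical role."""
    root: str
    initiator: Optional[str] = None
    interactant: Optional[str] = None
    recipient: Optional[str] = None
    time: Optional[str] = None
    place: Optional[str] = None
    intention: Optional[str] = None
    manner: Optional[str] = None
    causality: Optional[str] = None

    def filled_roles(self):
        """Return only the roles that carry a word, in declaration order."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


# The example sentence from the text, role by role.
example = Sentence(
    root="offers",
    initiator="Robert",
    interactant="a gift",
    recipient="Mary",
    time="today",
    place="in the garden",
    intention="to please her",
    manner="with a smile",
    causality="for her birthday",
)

# A sentence may consist of a root alone.
minimal = Sentence(root="rains")
```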

Junctions 

IEML allows the junction of several words in the same grammatical role. This can be a logical connection (and, or inclusive or exclusive), a comparison (same as, different from), an ordering (larger than, smaller than…), an antinomy (but, in spite of…), and so on.

Layers of complexity

Grammatical roles of a complex sentence

A word that plays one of the eight leaf roles at complexity layer 1 can play the role of secondary root at a complexity layer 2, and so on recursively up to layer 4.
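
The recursion can be sketched as a depth count on a nested structure. In the hypothetical representation below (not the IEML editor's own data model), a bare word is a string and a sentence is a dictionary with a "root" entry plus leaf roles; a leaf that is itself a dictionary acts as a secondary root one layer down.

```python
MAX_LAYER = 4  # the text allows recursion "up to layer 4"


def layers(node):
    """Count sentence layers: a bare word is 0, a flat sentence is 1."""
    if isinstance(node, str):
        return 0
    # The "root" entry is the root word itself, so it adds no depth.
    leaves = [value for role, value in node.items() if role != "root"]
    return 1 + max((layers(leaf) for leaf in leaves), default=0)


def check(node):
    """Return the number of layers, or raise if the limit is exceeded."""
    depth = layers(node)
    if depth > MAX_LAYER:
        raise ValueError(f"{depth} layers, but at most {MAX_LAYER} are allowed")
    return depth


# Layer 1 only: every leaf is a plain word.
flat = {"root": "offers", "initiator": "Robert", "recipient": "Mary"}

# The initiator becomes a secondary root at layer 2.
nested = {
    "root": "offers",
    "initiator": {"root": "Robert", "manner": "smiling"},
    "recipient": "Mary",
}
```

Here `check(flat)` gives 1 and `check(nested)` gives 2; a structure nested five deep is rejected.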

Literals

IEML strictly speaking enables only general categories or concepts to be expressed. It is nevertheless possible to insert numbers, units of measurement, dates, geographical positions, proper names, etc. into a sentence, provided they are categorized in IEML. For example t.u.-t.u.-‘. [23] means ‘number: 23’. Individual names, numbers, etc. are called literals in IEML.

Texts 

Relations 

A semantic relationship is a sentence in a special format that is used to link a source node (element, word, sentence) to a target node. IEML includes a query language enabling easy programming of semantic relationships on a set of nodes. 

By design, a semantic relationship makes the following four points explicit.

1. The function that connects the source node and the target node.

2. The mathematical form of the relation: equivalence relationship, order relationship, intransitive symmetrical relationship or intransitive asymmetrical relationship.

3. The kind of context or social rule that validates the relationship: syntax, law, entertainment, science, learning, etc.

4. The content of the relationship: logical, taxonomic, mereological (whole-part relationship), temporal, spatial, quantitative, causal, or other. The relation can also concern the reading order or the anaphora.
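
As a rough illustration, the four points can be carried by a small record attached to a source node and a target node. The names `Relation` and `Form` and the sample values are assumptions made for this sketch; they are not taken from the IEML query language.

```python
from dataclasses import dataclass
from enum import Enum


class Form(Enum):
    """Point 2: the mathematical form of the relation."""
    EQUIVALENCE = "equivalence"
    ORDER = "order"
    INTRANSITIVE_SYMMETRICAL = "intransitive symmetrical"
    INTRANSITIVE_ASYMMETRICAL = "intransitive asymmetrical"


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    function: str  # point 1: what connects source and target
    form: Form     # point 2: mathematical form
    context: str   # point 3: context or social rule that validates it
    content: str   # point 4: logical, taxonomic, mereological, ...

    def is_symmetrical(self):
        # Equivalence relations are symmetrical by definition.
        return self.form in (Form.EQUIVALENCE, Form.INTRANSITIVE_SYMMETRICAL)


part_of = Relation("wheel", "car", "is part of", Form.ORDER,
                   context="science", content="mereological")
synonym = Relation("car", "automobile", "means the same as", Form.EQUIVALENCE,
                   context="syntax", content="logical")
```

Making the form explicit is what lets a program decide, for instance, whether a relation may be followed in both directions.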

The (hyper) textual network

An IEML text is a network of semantic relationships. This network can describe linear successions, trees, matrices, cliques, cycles and complex subnetworks of all types.

An IEML text can be considered as a theory, an ontology, or a narrative that accounts for the dataset it is used to index.

We can define a USL as an ordered (normalized) set of triples of the form: (a source node, a target node, a relationship sentence). A set of such triples describes a semantic network or IEML text. 
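
A minimal sketch of what "ordered (normalized)" can mean in practice: duplicates are dropped and the triples are put in a fixed order, so that the same network always gets the same notation. Python's default string ordering stands in here for IEML's own ordering, and the English labels are placeholders for IEML nodes and relationship sentences.

```python
def normalize(triples):
    """Return a canonical tuple of unique (source, target, relation) triples."""
    return tuple(sorted(set(triples)))


# The same network written twice, in a different order and with a duplicate.
first = normalize([
    ("car", "wheel", "has part"),
    ("car", "vehicle", "is a"),
    ("car", "wheel", "has part"),
])
second = normalize([
    ("car", "vehicle", "is a"),
    ("car", "wheel", "has part"),
])
```

Both calls return the same tuple, which is why two users describing the same network end up with one and the same expression.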

The following special cases should be noted:

– A network may contain only one sentence.

– A sentence may contain only one root to the exclusion of other grammatical roles.

– A root may contain only one word (no junction).

– A word may contain only one element.

******* 

In short, IEML is a language with computable semantics that can be considered from three complementary points of view: linguistics, mathematics and computer science. Linguistically, it is a philological language, i.e. it can translate any natural language. Mathematically, it is a topos, that is, an algebraic structure (a category) in isomorphic relation with a topological space (a network of semantic relations). Finally, on the computer side, it functions as the indexing system of a virtual database and as a programming language for semantic networks.

More than 60% of the human population is connected to the Internet, most sectors of activity have switched to digital and software drives innovation. Yet Internet standards and protocols were invented at a time when less than one percent of the population was connected. It is time to use the data flows, the available computing power and the possibilities of interactive communication for human development… and to solve the serious problems we are facing. That is why I will soon launch a major international project – comparable to the construction of a cyclotron or a voyage to Mars – aiming at an augmentation of the Internet in the service of collective intelligence.

This project has several interrelated objectives: 

  • Decompartmentalize digital memory and ensure its semantic (linguistic, cultural and disciplinary) interoperability.
  • Open up indexing modes and maximize the diversity of interpretations of the digital memory.
  • Make communication between machines, but also between humans and machines, more fluid in order to strengthen our collective mastery of the Internet of Things, intelligent cities, robots, autonomous vehicles, etc.
  • Establish new forms of modeling and reflexive observation of human collective intelligence on the basis of our common memory.

IEML

The technical foundation of this project is IEML (Information Economy MetaLanguage), a semantic metadata system that I invented with support from the Canadian federal government. IEML has:

  • the expressive power of a natural language, 
  • the syntax of a regular language, 
  • calculable semantics aligned with its syntax.

IEML is exported in RDF and is based on Web standards. IEML concepts are called USLs (Uniform Semantic Locators). They can be read and translated into any natural language. Semantic ontologies – sets of USLs linked by a network of relationships – are interoperable by design. IEML establishes a virtual knowledge base that feeds both automatic reasoning and statistical calculations. In short, IEML fulfills the promise of the Semantic Web through its computable meaning and interoperable ontologies.


Intlekt

The URL system and the HTTP standard only become useful through a browser. Similarly, the new IEML-based semantic addressing system for the Internet requires a special application, called Intlekt, whose technical project manager is Louis van Beurden. Intlekt is a collaborative and distributed platform that supports concept editing, data curation and new forms of search, data mining and data visualization. 

Intlekt supports the editing and publishing of semantic ontologies – sets of linked concepts – related to a field of practice or knowledge. These ontologies can be original or translate existing semantic metadata such as: thesauri, documentary languages, ontologies, SKOS taxonomies, folksonomies, sets of tags or hashtags, keywords, column and row headings, etc. Published semantic ontologies augment a dictionary of concepts, which can be considered as an open meta-ontology.

Intlekt is also a data curation tool. It enables editing, indexing in IEML and publishing data collections that feed a common knowledge base. Eventually, statistical algorithms will be used to automate the semantic indexing of data.

Finally, Intlekt exploits the properties of IEML to allow new forms of search, automatic reasoning and simulation of complex systems.

Special applications can be imagined in many areas, like:

  • the preservation of cultural heritage, 
  • research in the humanities (digital humanities), 
  • education and training,
  • public health, 
  • informed democratic deliberation, 
  • commercial transactions, 
  • smart contracts, 
  • the Internet of things, 
  • and so on…

And now, what?

Where do we stand on this project in the summer of 2020? After many tests over several years, IEML’s grammar has stabilized, as has the base of about 5000 morphemes which enables any concept to be built at will. I have successfully tested the expressive possibilities of the language in several fields of the humanities and earth sciences. Nevertheless, at the time of writing, the latest state of the grammar is not yet implemented. Moreover, to obtain a version of Intlekt that enables the semantic ontology editing, data curation and data mining functions described above, a team of several programmers working for one year is needed. In the coming months, the friends of IEML will be busy pursuing this critical mass. 

Come and join us!

For more information, see: https://pierrelevyblog.com/my-research-in-a-nutshell/ and https://pierrelevyblog.com/my-research-in-a-nutshell/the-basics-of-ieml/

IEML (the Information Economy Meta Language) has four main directions of research and development in 2019: in mathematics, data science, linguistics and software development. This blog entry reviews them successively.

1- A mathematical research program

I will give here a philosophical description of the structure of IEML, the purpose of the mathematical research to come being to give a formal description and to draw from this formalisation as much useful information as possible on the calculation of relationships, distances, proximities, similarities, analogies, classes and others… as well as on the complexity of these calculations. I had already produced a formalization document in 2015 with the help of Andrew Roczniak, PhD, but this document is now (2019) overtaken by the evolution of the IEML language. The Brazilian physicist Wilson Simeoni Junior has volunteered to lead this research sub-program.

IEML Topos

The “topos” is a structure that was identified by the great mathematician Alexander Grothendieck, who “is considered as the re-founder of algebraic geometry and, as such, as one of the greatest mathematicians of the 20th century” (see Wikipedia).

Without going into technical details, a topos is a bi-directional relationship between, on the one hand, an algebraic structure, usually a “category” (intuitively a group of transformations of transformation groups) and, on the other hand, a spatial structure, which is geometric or topological. 

In IEML, thanks to a normalization of the notation, each expression of the language corresponds to an algebraic variable and only one. Symmetrically, each algebraic variable corresponds to one linguistic expression and only one. 

Topologically, each variable in IEML algebra (i.e. each expression of the language) corresponds to a “point”. But these points are arranged in different nested recursive complexity scales: primitive variables, morphemes of different layers, characters, words, sentences, super-sentences and texts. However, from the level of the morpheme, the internal structure of each point – which comes from the function(s) that generated the point – automatically determines all the semantic relationships that this point has with the other points, and these relationships are modelled as connections. There are obviously a large number of connection types, some very general (is contained in, has an intersection with, has an analogy with…), others more precise (is an instrument of, contradicts X, is logically compatible with, etc.).

The topos that matches all the expressions of the IEML language with all the semantic relationships between its expressions is called “The Semantic Sphere”.

Algebraic structure of IEML

In the case of IEML, the algebraic structure is reduced to 

  1. Six primitive variables.
  2. A non-commutative multiplication with three variables (substance, attribute and mode). The IEML multiplication is isomorphic to the triplet “departure vertex, arrival vertex, edge” which is used to describe graphs.
  3. A commutative addition that creates a set of objects.
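
These three ingredients can be mimicked with ordinary Python values, purely as an illustration. The primitive names `p1` to `p6` are placeholders (the text does not name them here), and `multiply`, `add` and `as_edge` are hypothetical helpers: a tuple models the ordered, non-commutative product, a frozenset models the commutative sum, and `as_edge` spells out the reading of a product as a graph edge.

```python
# Placeholder names for the six primitive variables.
PRIMITIVES = frozenset({"p1", "p2", "p3", "p4", "p5", "p6"})


def multiply(substance, attribute, mode):
    """Non-commutative product: the order of the three operands matters."""
    return (substance, attribute, mode)


def add(*terms):
    """Commutative sum: an unordered set of objects."""
    return frozenset(terms)


def as_edge(product):
    """Read a product as (departure vertex, arrival vertex, edge)."""
    substance, attribute, mode = product
    return {"departure": substance, "arrival": attribute, "edge": mode}


ab = multiply("p1", "p2", "p3")
ba = multiply("p2", "p1", "p3")

# Products and sums nest, since tuples and frozensets are both hashable.
compound = multiply(add("p1", "p2"), ab, "p4")
```

Swapping substance and attribute gives a different product (`ab != ba`), whereas `add("p1", "p2")` and `add("p2", "p1")` are the same object in the algebraic sense.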

This algebraic structure is used to construct the following functions and levels of variables…

1. Functions using primitive variables, called “morpheme paradigms”, have as inputs morphemes at layer n and as outputs morphemes at layer n+1. Morpheme paradigms include additions, multiplications, constants and variables and are visually presented in the form of tables in which rows and columns correspond to certain constants.

2. “Character paradigms” are complex additive functions that take morphemes as inputs and produce characters as outputs. Character paradigms include a group of constant morphemes and several groups of variables. A character is composed of 1 to 5 morphemes arranged in IEML alphabetical order. (Characters may not include more than five morphemes for cognitive management reasons).

3. IEML characters are assembled into words (a substance character, an attribute character, a mode character) by means of a multiplicative function called a “word paradigm”. A word paradigm intersects a series of characters in substance and a series of characters in attribute. The modes are chosen from predefined auxiliary character paradigms, depending on whether the word is a noun, a verb or an auxiliary. Words express subjects, keywords or hashtags. A word can be composed of only one character.

4. Sentence building functions assemble words by means of multiplication and addition, with the necessary constraints to obtain grammatical trees. Mode words describe the grammatical/semantic relationships between substance words (roots) and attribute words (leaves). Sentences express facts, proposals or events; they can take on different pragmatic and logical values.

5. Super-sentences are generated by means of multiplication and addition of sentences, with constraints to obtain grammatical trees. Mode sentences express relationships between substance sentences and attribute sentences. Super-sentences express hypotheses, theories or narratives.

6. A USL (Uniform Semantic Locator) or IEML text is an addition (a set) of words, sentences and super-sentences. 
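
The constraints on characters and words in points 2 and 3 can be sketched as follows. This is an illustration only: `make_character` and `make_word` are hypothetical helpers, Python's string ordering stands in for the IEML alphabetical order, and the morpheme labels are invented.

```python
MAX_MORPHEMES = 5  # "Characters may not include more than five morphemes"


def make_character(morphemes):
    """Build a character: 1 to 5 distinct morphemes in a fixed order."""
    ordered = tuple(sorted(set(morphemes)))
    if not 1 <= len(ordered) <= MAX_MORPHEMES:
        raise ValueError("a character holds between 1 and 5 morphemes")
    return ordered


def make_word(substance, attribute=None, mode=None):
    """Build a word as a (substance, attribute, mode) triple of characters.

    Attribute and mode are optional, since a word can be composed of
    only one character.
    """
    return (substance, attribute, mode)


light = make_character(["m2", "m1"])
single = make_word(light)
full = make_word(light, make_character(["m3"]), make_character(["aux1"]))
```

Sorting the morphemes is what gives each character a single spelling, in line with the one-to-one correspondence between expressions and algebraic variables.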

Topological structure of IEML: a semantic rhizome

Static

The notion of rhizome (a term borrowed from botany) was developed on a philosophical level by Deleuze and Guattari in the preface to Mille Plateaux (Minuit 1980). In this Deleuzo-Guattarian lineage, by rhizome I mean here a complex graph whose points or “vertices” are organized into several levels of complexity (see the algebraic structure) and whose connections intersect several regular structures such as series, tree, matrix and clique. In particular, it should be noted that some structures of the IEML rhizome combine hierarchical or genealogical relationships (in trees) with transversal or horizontal relationships between “leaves” at the same level, which therefore do not respect the “hierarchical ladder”. 

Dynamic

We can distinguish the abstract, or virtual, rhizomatic grid drawn by the grammar of the language (the sphere to be dug) and the actualisation of points and relationships by the users of the language (the dug sphere of chambers and galleries).  Characters, words, sentences, etc. are all chambers in the centre of a star of paths, and the generating functions establish galleries of “rhizomatic” relationships between them, as many paths for exploring the chambers and their contents. It is therefore the users, by creating their lexicons and using them to index their data, communicate and present themselves, who shape and grow the rhizome…

Depending on whether circuits are more or less used, on the quantity of data or on the strength of interactions, the rhizome undergoes – in addition to its topological transformations – various types of quantitative or metric transformations. 

* The point to remember is that IEML is a language with calculable semantics because it is also an algebra (in the broad sense) and a complex topological space. 

* In the long term, IEML will be able to serve as a semantic coordinate system for the information world at large.

2- A research program in data science

The person in charge of the data science research sub-program is the software engineer (Eng. ENSIMAG, France) Louis van Beurden, who also holds a master’s degree in data science and machine translation from the University of Montréal, Canada. Louis is planning to complete a PhD in computer science in order to test the hypothesis that, from a data science perspective, a semantic metadata system in IEML is more efficient than a semantic metadata system in natural language and phonetic writing. This doctoral research will make it possible to implement phases A and B of the program below and to carry out our first experiment.

Background information

The basic cycle in data science can be schematized according to the following loop:

  1. selection of raw data,
  2. pre-processing, i.e. cleaning data and metadata imposition (cataloguing and categorization) to facilitate the exploitation of the results by human users,
  3. statistical processing,
  4. visual and interactive presentation of results,
  5. exploitation of the results by human users (interpretation, storytelling) and feedback on steps 1, 2, 3.

Biases or poor quality of results may have several causes, but often come from poor pre-processing. According to the old computer adage “garbage in, garbage out”, it is the professional responsibility of data scientists to ensure the quality of the input data and therefore not to neglect the pre-processing phase where this data is organized using metadata.

Two types of metadata can be distinguished: 1) semantic metadata, which describes the content of documents or datasets, and 2) ordinary metadata, which describes authors, creation dates, file types, etc. Let us call “semantic pre-processing” the imposition of semantic metadata on data.

Hypothesis

Since IEML is a univocal language and the semantic relationships between morphemes, words, sentences, etc. are mathematically computable, we assume that a semantic metadata system in IEML is more efficient than a semantic metadata system in natural language and phonetic writing. Of course, the efficiency in question is related to a particular task: search, data analysis, knowledge extraction from data, machine learning, etc.

In other words, compared to a “tokenization” of semantic metadata in phonetic writing noting a natural language, a “tokenization” of semantic metadata in IEML would ensure better processing, better presentation of results to the user and better exploitation of results. In addition, semantic metadata in IEML would allow datasets that use different languages, classification systems or ontologies to be de-compartmentalized, merged and compared.

Design of the first experiment

The ideal way to do an experiment is to consider a multi-variable system and transform only one of the system variables, all other things being equal. In our case, it is only the semantic metadata system that must vary. This will make it easy to compare the system’s performance with one (phonetic tokens) or the other (semantic tokens) of the semantic metadata systems.

  • – The dataset of our first experience encompasses all the articles of the Sens Public scientific journal.
  • – Our ordinary metadata are the author, publication date, etc.
  • – Our semantic metadata describe the content of articles.
  •     – In phonetic tokens, using RAMEAU categories, keywords and summaries,
  •     – In IEML tokens by translating phonetic tokens.
  • – Our processes are “big data” algorithms traditionally used in natural language processing 
  •     – An algorithm for calculating the co-occurrences of keywords.
  •     – A TF-IDF (Term Frequency / Inverse Document Frequency) algorithm that works from a word / document matrix.
  •     – A clustering algorithm based on “word embeddings” of keywords in articles (documents are represented by vectors, in a space with as many dimensions as words).
  • – A user interface will offer a certain way to access the database. This interface will be obviously adapted to the user’s task (which remains to be chosen, but could be of the “data analytics” type).
  • Result 1 corresponds to the execution of the “machine task”, i.e. the establishment of a connection network on the articles (relationships, proximities, groupings, etc.). We’ll have to compare….
  •     – result 1.1 based on the use of phonetic tokens with 
  •     – result 1.2 based on the use of IEML tokens.
  • Result 2 corresponds to the execution of the selected user-task (data analytics, navigation, search, etc.). We’ll have to compare….
  •     – result 2.1, based on the use of phonetic tokens, with 
  •     – result 2.2, based on the use of IEML tokens.
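The first two processes listed above (keyword co-occurrence and TF-IDF) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `articles` dictionary is hypothetical toy data, not real Sens Public metadata, and the TF-IDF weighting used here (relative term frequency multiplied by log(N / document frequency)) is one common variant among several. The same functions could be run once on phonetic tokens and once on IEML tokens to obtain the two results to be compared.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy data: article id -> keyword tokens (not real journal metadata).
articles = {
    "a1": ["democracy", "media", "internet"],
    "a2": ["media", "internet", "literature"],
    "a3": ["literature", "poetry"],
}

def cooccurrences(docs):
    """Count how many articles contain each unordered pair of keywords."""
    counts = Counter()
    for tokens in docs.values():
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return counts

def tf_idf(docs):
    """Return {article: {token: weight}} with weight = tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of articles in which each token appears.
    df = Counter(tok for tokens in docs.values() for tok in set(tokens))
    weights = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        weights[doc_id] = {
            tok: (count / len(tokens)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
    return weights

pairs = cooccurrences(articles)
weights = tf_idf(articles)
```

In this toy dataset, “internet” and “media” co-occur in two articles, and a keyword that appears in a single article (such as “poetry”) receives a higher TF-IDF weight than one shared by several articles.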

Step A: First indexing of a database in IEML

Reminder: the data are the articles of the scientific journal, and the semantic metadata are the categories, keywords and summaries of the articles. From these categories, keywords and summaries, a glossary of the knowledge area covered by the journal is created, or of a sub-domain if the task turns out to be too difficult. It should be noted that in 2019 we do not yet have the software tools to create IEML sentences and super-phrases that allow us to express facts, propositions, theories, narratives, hypotheses, etc. Sentences and super-phrases, perhaps accessible in a year or two, will therefore have to wait for a later phase of the research.

The creation of the glossary will be the work of a project community, linked to the editors of the Sens Public journal and the Canada Research Chair in Digital Writing (led by Prof. Marcello Vitali-Rosati) at the Université de Montréal (Digital Humanities). Pierre Lévy will accompany this community and help it to identify the constants and variables of its lexicon. One of the auxiliary goals of the research is to verify whether motivated communities can appropriate IEML to categorize their data. Once we are satisfied with the IEML indexing of the article database, we will proceed to the next step.

Step B: First experimental test

  • 1. The test is designed to measure the difference between results based on phonetic tokens and results based on IEML tokens.
  • 2. All the data processing operations are carried out on the dataset with each type of token.
  • 3. The results (machine tasks and user tasks) are compared with both types of tokens.

The experiment can be repeated iteratively, if necessary, with minor modifications until satisfactory results are achieved.

If the hypothesis is confirmed, we proceed to the next step.

Step C: Towards an automation of semantic pre-processing in IEML

If the superior efficiency of IEML tokens for semantic metadata is demonstrated, then there will be a strong interest in maximizing the automation of IEML semantic pre-processing.

The algorithms used in our experiment are themselves powerful tools for data pre-processing; they can be used, according to methods to be developed, to partially automate semantic indexing in IEML. The “word embeddings” will make it possible to study how IEML words are correlated with the natural language lexical statistics of the articles and to detect anomalies. For example, we will check whether similar USLs (a USL is an IEML text) point to very different texts, or whether very similar texts have very different USLs.
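The anomaly check described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: USLs are reduced to sets of placeholder tokens (`T1`, `T2`, `T3` are not real IEML expressions), text similarity is a plain bag-of-words cosine standing in for word embeddings, and the `gap` threshold is arbitrary. The idea is simply to flag pairs of articles for which the similarity of their USLs and the similarity of their texts disagree strongly.

```python
import math
from collections import Counter
from itertools import combinations

def jaccard(tokens_a, tokens_b):
    """Similarity of two USLs, each reduced here to a set of tokens."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(text_a, text_b):
    """Bag-of-words cosine similarity (a simple stand-in for word embeddings)."""
    va, vb = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0

def find_anomalies(records, gap=0.5):
    """Flag article pairs whose USL similarity and text similarity differ by more than `gap`."""
    flagged = []
    for (id_a, (usl_a, text_a)), (id_b, (usl_b, text_b)) in combinations(
        sorted(records.items()), 2
    ):
        s_usl = jaccard(usl_a, usl_b)
        s_text = cosine(text_a, text_b)
        if abs(s_usl - s_text) > gap:
            flagged.append((id_a, id_b, round(s_usl, 2), round(s_text, 2)))
    return flagged

# Hypothetical records: article id -> (placeholder USL tokens, text).
records = {
    "a1": (["T1", "T2"], "digital writing and publishing"),
    "a2": (["T1", "T2"], "medieval poetry in translation"),
    "a3": (["T3"], "digital publishing and writing"),
}
anomalies = find_anomalies(records)
```

Here the pair (a1, a2) is flagged because identical USLs point to unrelated texts, and the pair (a1, a3) is flagged because near-identical texts carry unrelated USLs. Such flagged pairs would then be reviewed by the human indexers.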

Finally, methods will be developed to use deep learning algorithms to automatically index datasets in IEML.

Step D: Research and development perspective in Semantic Machine Learning

If step C provides the expected results, i.e. methods using AI to automate the indexing of data in IEML, then big data indexed in IEML will be available. As progress is made, semantic metadata may become increasingly similar to textual data (summary of sections, paragraphs, sentences, etc.) until translation into IEML is achieved, which remains a distant objective.

The data indexed in IEML could then be used to train artificial intelligence algorithms. The hypothesis that machines learn more easily when data is categorized in IEML could easily be validated by experiments of the same type as described above, by comparing the results obtained from training data indexed in IEML and the results obtained from the same data indexed in natural languages.

This last step paves the way for a better integration of statistical AI and symbolic AI (based on facts and rules, which can be expressed in IEML).

3 A research program in linguistics, humanities and social sciences

Introduction

The semiotic and linguistic development program has two interdependent components:

1. The development of the IEML metalanguage

2. The development of translation systems and bridges between IEML and other sign systems, in particular… 

  •     – natural languages,
  •     – logical formalisms,
  •     – pragmatic “language games” and games in general,
  •     – iconic languages,
  •     – artistic languages, etc.

This research and development agenda, particularly in its linguistic dimension, is important for the digital humanities. Indeed, IEML can serve as a system of semantic coordinates of the cultural universe, thus allowing the humanities to cross a threshold of scientific maturity that would bring their epistemological status closer to that of the natural sciences. Using IEML to index data and to formulate assumptions would result in…

  • (1) a de-siloing of the databases used by researchers in the social sciences and humanities, which would allow for the sharing and comparison of categorization systems and interpretive assumptions;
  • (2) an improved analysis of data;
  • (3) The ultimate perspective, set out in the article “The Role of the Digital Humanities in the New Political Space” (http://sens-public.org/article1369.html in French), is to aim for a reflective collective intelligence of the social sciences and humanities research community. 

But IEML’s research program in the perspective of the digital humanities – as well as its research program in data science – requires a living and dynamic semiotic and linguistic development program, some aspects of which I will outline here.

IEML and the Meaning-Text Theory

IEML’s linguistic research program is very much based on the Meaning-Text theory developed by Igor Melchuk and his school. “The main principle of this theory is to develop formal and descriptive representations of natural languages that can serve as a reliable and convenient basis for the construction of Meaning-Text models, descriptions that can be adapted to all languages, and therefore universal.” (Excerpt translated from the Wikipedia article on Igor Melchuk). Dictionaries developed by linguists in this field connect words according to universal “lexical functions” identified through the analysis of many languages. These lexical functions have been formally transposed into the very structure of IEML (see the IEML Glossary Creation Guide) so that the IEML dictionary can be organized by the same tools (e.g. Spiderlex) as those of the Meaning-Text Theory research network. Conversely, IEML could be used as a pivot language – or concept description language – between the natural language dictionaries developed by the network of researchers skilled in Meaning-Text theory.

Construction of specialized lexicons in the humanities and social sciences

A significant part of the IEML lexicon will be produced by communities having decided to use IEML to mark out their particular areas of knowledge, competence or interaction. Our research in specialized lexicon construction aims to develop the best methods to help expert communities produce IEML lexicons. One of the approaches consists in identifying the “conceptual skeleton” of a domain, namely its main constants in terms of character paradigms and word paradigms. 

The first experiment in this type of collaborative construction of specialized lexicons by experts will be conducted by Pierre Lévy in collaboration with the editorial team of the Sens Public scientific journal and the Canada Research Chair in Digital Textualities at the University of Montréal (led by Prof. Marcello Vitali-Rosati). Based on a determination of their economic and social importance, other specialized glossaries can be constructed, for example on the theme of professional skills, e-learning resources, public health prevention, etc.

Ultimately, the “digital humanities” branch of IEML will need to collaboratively develop a conceptual lexicon of the humanities to be used for indexing books and articles, but also chapters, sections and comments in documents. The same glossary should also facilitate data navigation and analysis. There is a whole program of development in digital library science here. I would particularly like to focus on the human sciences because the natural sciences have already developed a formal vocabulary that is consensual.

Construction of logical, pragmatic and narrative character-tools

Once we have a sentence and super-phrase editor, we plan to establish a correspondence between IEML, on the one hand, and propositional calculus and first-order logic, on the other. This will be done by specifying special character-tools to implement logical functions. Particular attention will be paid to formalizing the definition of rules and the declaration that “facts” are true in IEML. It should be noted in passing that, in IEML, grammatical expressions represent classes, sets or categories, but that logical individuals (proper names, numbers, etc.) or instances of classes are represented by “literals” expressed in ordinary characters (phonetic alphabets, Chinese characters, Arabic numerals, URLs, etc.).

In anticipation of practical use in communication, games, commerce, law (smart contracts), chatbots, robots, the Internet of Things, etc., we will develop a range of character-tools with illocutionary force such as “I offer”, “I buy”, “I quote”, “I give an instruction”, etc.

Finally, we will make the work of super-sentence authors easier by developing a range of character-tools implementing “narrative functions”.

4 A software development program

A software environment for the development and public use of the IEML language

Logically, the first multi-user IEML application will be dedicated to the development of the language itself. This application is composed of the following three web modules.

  • 1. A morpheme editor that also allows you to navigate in the morphemes database, or “dictionary”.
  • 2. A character and word editor that also allows navigation in the “lexicon”.
  • 3. A navigation and reading tool in the IEML library as a whole, or “IEML database” that brings together the dictionary and lexicon, with translations, synonyms and comments in French and English for the moment.

The IEML database is a “Git” database and is currently hosted by GitHub. A Git database makes it possible to record successive versions of the language, as well as to monitor and model its growth. It also allows large-scale collaboration among teams capable of developing specific branches of the lexicon independently and then integrating them into the main branch after discussion, as is done in the collaborative development of large software projects. As soon as a sub-lexicon is integrated into the main branch of the Git database, it becomes a “common” usable by everyone (according to the latest version of the General Public License).

Morpheme and word editors are actually “Git clients” that feed the IEML database. A first version of this collaborative read-write environment should be available in the fall of 2019 and then tested by real users: the editors of the scientific journal “Sens Public” as well as other participants in the University of Montréal’s IEML seminar.

The following versions of the IEML read/write environment should allow the editing of sentences and texts as well as literals that are logical individuals not translated into IEML, such as proper names, numbers, URLs, etc.

A social medium for collaborative knowledge management

A large number of applications using IEML can be considered, both commercial and non-commercial. Among all these applications, one of them seems to be particularly aligned with the public interest: a social medium dedicated to collaborative knowledge and skills management. This new “place of knowledge” could allow the online convergence of the missions of… 

  • – museums and libraries, 
  • – schools and universities, 
  • – companies and administrations (with regard to their knowledge creation and management dimension), 
  • – smart cities, employment agencies, civil society networks, NGOs, associations, etc.

According to its general philosophy, such a social medium should…

  • – be supported by an intrinsically distributed platform, 
  • – have the simplicity – or the economy of means – of Twitter,
  • – ensure the sovereignty of users over their data,
  • – promote collaborative processes.

The main functions performed by this social medium would be:

  • – data curation (referencing and categorization of web pages, editing of resource collections),
  • – teaching offers and learning demands,
  • – offers and demands for skills, or employment market.

IEML would serve as a common language for

  • – data categorization, 
  • – description of the knowledge and skills, 
  • – the expression of acts within the social medium (supply, demand, consent, publish, etc.)
  • – addressing users through their knowledge and skills.

Three levels of meaning would thus be formalized in this medium.

  • – (1) The linguistic level in IEML – including lexical and narrative functions – formalizes what is spoken about (lexicon) and what is said (sentences and super-phrases).
  • – (2) The logical – or referential – level adds to the linguistic level… 
  •     – logical functions (first order logic and propositional logic) expressed in IEML using logical character-tools,
  •     – the ability to point to references (literals, document URLs, datasets, etc.),
  •     – the means to express facts and rules in IEML and thus to feed inference engines.
  • – (3) The pragmatic level adds illocutionary functions and users to the linguistic and logical levels.
  •     – Illocutionary functions (thanks to pragmatic character-tools) allow the expression of conventional acts and rules (such as “game” rules). 
  •     – The pragmatic level obviously requires the consideration of players or users, as well as user groups.
  •     – It should be noted that there is no formal difference between logical inference and pragmatic inference but only a difference in use, one aiming at the truth of propositions according to referred states of things, the other calculating the rights, obligations, gains, etc. of users according to their actions and the rules of the games they play.
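The remark that logical and pragmatic inference differ only in use can be illustrated with a very small forward-chaining sketch. This is a hedged illustration, not a description of any IEML inference engine: facts and rules are written as plain English strings rather than IEML expressions, the rules are purely propositional (no variables), and the `forward_chain` function and the example statements are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply rules of the form (premises, conclusion) until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived

# Logical use: the conclusion concerns the truth of a proposition about a state of things.
logical = forward_chain(
    {"socrates is a man"},
    [(["socrates is a man"], "socrates is mortal")],
)

# Pragmatic use: the same mechanism computes obligations and rights
# from the users' acts and the rules of the "game" they play.
pragmatic = forward_chain(
    {"alice offers course", "bob accepts offer"},
    [
        (["alice offers course", "bob accepts offer"], "alice must teach bob"),
        (["alice must teach bob"], "bob may attend course"),
    ],
)
```

The same function serves both cases; only the interpretation of the derived statements changes.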

The semantic profiles of users and datasets will be arranged according to the three levels that have just been explained. The “place of knowledge” could be enhanced by the use of tokens or crypto-currencies to reward participation in collective intelligence. If successful, this type of medium could be generalized to other areas such as health, democratic governance, trade, etc.

Pas une pipe

This blog post offers a simple guide to the landscape of signification in language. We’ll begin by distinguishing the numerous elements that construct meaning. We’ll first look at signs: how they are everywhere in communication between living beings, and how a sign differs from a symbol. A symbol is a special kind of sign unique to humans, which divides into a signifier (a sound, an image, etc.) and a signified (a category or a concept). We’ll learn that the relationship between a signifier and a signified is conventional. A bit further, I’ll explain the workings of language, our most powerful symbolic system. I will review successively grammar: the recursive construction of sense units; semantics: the relations between these units; and pragmatics: the relations between speech, reference and social context. I’ll end this chapter by recalling some of the problems in the field of natural language processing (NLP).

Sign, symbol, language

Sign

Meaning involves at least three actors playing distinct roles. A sign (1) is a clue, a trace, an image, a message or a symbol that means something (2) for someone (3).

A sign may be an entity or an event. What makes it a sign is not its intrinsic properties but the role it plays in meaning. For example, an individual can be the subject (thing) of a conversation, the interpreter of a conversation (being) or he can be a clue in an investigation (sign).

A thing, designated by a sign, is often called the object or referent, and – again – what makes it a referent is not its intrinsic properties but the role it plays in the triadic relation.

A being is often called the subject or the interpreter. It may be a human being, a group, an animal, a machine or any other entity or process endowed with self-reference (by distinguishing self from the environment) and interpretation. The interpreter always takes the context into account when it interprets a sign. For example, a puppy (being) understands that a bite (sign) from its playful sibling is part of a game (thing) and may not be a real threat in the context.

Generally speaking, communication and signs exist for all living organisms. Cells can recognize concentrations of poison or food from afar; plants use their flowers to trick insects and birds into their reproductive processes. Animals – organisms with brains or nervous systems – practice complex semiotic games that include camouflage, dance and mimicry. They acknowledge, interpret and emit signs constantly. Their cognition is complex: the sensorimotor cycle involves categorization, feeling, and environmental mapping. They learn from experience, solve problems and communicate, and social species manifest collective intelligence. All these cognitive properties imply the emission and interpretation of signs. When a wolf growls, there is no need for a long discourse: a clear message is sent to its adversary.

Symbol

A symbol is a sign divided into two parts: the signifier and the signified. The signified (virtual) is a general category, or an abstract class, and the signifier (actual) is a tangible phenomenon that represents the signified. A signifier may be a sound, a black mark on white paper, a trace or a gesture. For example, let’s take the word “tree” as a symbol. It is made of: 1) a signifier, the sound voicing the word “tree”, and 2) a signified, the concept of the category of perennial plants with roots, a trunk, branches, and leaves. The relationship between the signifier and the signified is conventional and depends on which symbolic system the symbol belongs to (in this case, the English language). What we mean by conventional is that in most cases, there is no analogy or causal connection between the sound and the concept: for example, between the sound “crocodile” and the actual crocodile species. We use different signifiers to indicate the same signified in different languages. Furthermore, the concepts symbolized by languages depend on the environment and culture of their speakers.

The signified of the sound “tree” is ruled by the English language and not left to the choice of the interpreter. However, it is in the context of a speech act that the interlocutor understands the referent of the word: is it a syntactic tree, a palm tree, a Christmas tree…? Let’s remember this important distinction: the signified is determined by the language but the referent depends on the context.

Language

A language is a general symbolic system that allows humans to think reflexively, ask questions, tell stories, dialogue and engage in complex social interactions. English, French, Spanish, Arabic, Russian, or Mandarin are all natural languages. Each one of us is biologically equipped to speak and recognize languages. Our linguistic ability is natural, genetic, universal and embedded in our brain. By contrast, any language (like English, French, etc.) is based on a social, conventional and cultural environment; it is multiple, evolving and hybridizing. Languages mix and change according to the transformations of demographic, technological, economic, social and political contexts.

Our natural linguistic abilities multiply our cognitive faculties. They empower us with reflexive thinking, making it easy for us to learn and remember, to plan for the long term and to coordinate large-scale endeavors. Language is also the basis for knowledge transmission between generations. Animals can’t understand, grasp or use linguistic symbols to their full extent; only humans can. Even the best-trained animals can’t evaluate if a story is false or exaggerated. Koko the famous gorilla will never ask you for an appointment for the first Tuesday of next month, nor will it communicate to you where its grandfather was born. In animal cognition, the categories that organize perception and action are enacted by neural networks. In human cognition, these categories may become explicit once symbolized and move to the forefront of our awareness. Ideas become objects of reflection. With human language come arithmetic, art, religion, politics, economy, and technology. Compared to other social animal species, human collective intelligence is most powerful and creative when it is supported and augmented by its linguistic abilities. Therefore, when working in artificial intelligence or cognitive computing, it is paramount to understand and model both the functioning of neurons and neurotransmitters common to all animals and the structure and organization of language, unique to our species.

I will now describe briefly how we shape meaning through language. Firstly, we will review the grammatical units (words, sentences, etc.). Secondly, we will explore the semantic networks between these units, and thirdly, the pragmatic interactions between language and extralinguistic realities.

Grammatical units

A natural language is made of recursively nested units: a phoneme, which is an elementary sound; a word, a chain of phonemes; a syntagm, a chain of words; and a text, a chain of syntagms. A language has a finite dictionary of words and syntactic rules for the construction of texts. With its dictionary and set of syntactic rules, a language offers its users the possibility to generate – and understand – an infinity of texts.
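How a finite dictionary and a few recursive rules can yield an unbounded number of texts can be shown with a toy sketch. The lexicon and the two rules below are hypothetical and describe no real language: a sentence is either “the N V the N” or “the N says that” followed by another sentence, so each extra level of embedding produces new sentences.

```python
from itertools import product

# Hypothetical toy lexicon (finite dictionary).
NOUNS = ["lion", "gazelle"]
VERBS = ["sees", "fears"]

def sentences(depth):
    """All sentences with at most `depth` levels of embedding.

    Rules: S -> "the" N V "the" N  |  "the" N "says that" S
    """
    base = [f"the {a} {v} the {b}" for a, v, b in product(NOUNS, VERBS, NOUNS)]
    if depth == 0:
        return base
    inner = sentences(depth - 1)
    # The recursive rule embeds every shorter sentence inside a new one.
    return base + [f"the {n} says that {s}" for n in NOUNS for s in inner]

# Number of distinct sentences for 0, 1, 2 and 3 levels of embedding.
counts = [len(sentences(d)) for d in range(4)]
```

With only four words in the lexicon, the number of sentences grows at every level of embedding (8, 24, 56, 120 for depths 0 to 3) and has no upper bound as the depth increases.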

Phonemes

Human beings can’t pronounce or recognize several phonemes simultaneously. They can only pronounce one sound at a time. So languages have to obey the constraint of sequentiality. A speech is a chain of phonemes with an acoustic punctuation reflecting its grammatical organization.

Phonemes are sounds without signification¹, generally divided into consonants and vowels. Some languages also have “click” consonants (in Eastern and Southern Africa) and others (such as Mandarin Chinese) use different tones on their vowels. Despite the great diversity of sounds used to pronounce human languages, the number of conventional sounds in a language is limited: the order of magnitude is between thirty and one hundred.

Words

The first symbolic grammatical unit is the word, a signifier with a signified. By word, I mean an atomic unit of meaning. For example, “small” contains one unit of meaning. But “smallest” contains two: “small” (meaning tiny) and “est” (a superlative suffix used at the end of a word indicating the most).

Languages contain nouns depicting structures or entities, and verbs describing actions, events, and processes. Depending on the language, there are other types of words like adjectives, adverbs, prepositions or sense units that orient grammatical functions, such as gender, number, grammatical person, tense and cases.

Now, how many words does a language hold? It depends. The largest English dictionary counts 200,000 words, Latin has 50,000 words, Chinese 30,000 characters and biblical Hebrew amounts to 6,000 words. The French classical author Jean Racine was able to evoke the whole range of human passions and emotions by using only 3,700 words in 13 plays. Most linguists think that whatever the language is, an educated, refined speaker masters about 10,000 words in his or her lifetime.

Sentences

Note that a word alone cannot be true or false. Its signifier points to its signified (an abstract category) and not to a state of things. It is only when a sentence is spoken in a context describing a reality – a sentence with a referent – that it can be true or false.

A syntagm (a topic, a sentence or a super-sentence) is a sequence of words organized by grammatical relationships. When we utter a syntagm, we leave behind the abstract dictionary of a language to enter the concrete world of speech acts in contexts. We can distinguish three sub-levels of complexity in a syntagm: the topic, the sentence, and the super-sentence. Firstly, a topic is a super-word that designates a subject, a matter, an object or a process that cannot be described by just a single word, e.g., “history of linguistics”, “smartphone” or “tourism in Canada”. Different languages have diverse rules for building topics, like joining the root of a word with a grammatical case (in Latin), or agglutination of words (in German or Turkish). By relating several topics together, a sentence brings to mind an event, an action or a fact, e.g., “I bought her a smartphone for her twentieth birthday”. A sentence can be verbal, as in the previous example, or nominal, like “the leather seat of my father’s car”. Finally, a super-sentence evokes a network of relations between facts or events, as in a theory or a narrative. The relationships between sentences can be temporal (after), spatial (behind), causal (because), logical (therefore) or underline contrasts (but, despite…), and so on.

Texts

The highest grammatical unit is a text: a punctuated sequence of syntagms. The signification of a text comes from combining its signifieds according to grammatical rules. The text also has a referent inferred from its temporal, spatial and social context.

In order to construct a mental model of a referent, a reader can’t help but imagine a general intention of meaning behind a text, even when it is produced by a computer program, for instance.

Semantic relationships

When we hear a speech, we are actually transforming a chain of sounds into a semantic network, and from this network, we infer a new mental model of a situation. Conversely, we are able to transform a mental model into the corresponding semantic network and then from this network, back into a sequence of phonemes. Semantics is the back and forth translation between chains of phonemes and semantic networks. Semantic networks themselves are multi-layered and can be broken down into three levels: paradigmatic, syntagmatic and textual.

Figure: Hierarchy of grammatical units and semantic relations

Paradigmatic relationships

In linguistics, a paradigm is a set of semantic relations between words of the same language. They may be etymological, taxonomical relations, oppositions or differences. These relations may be the inflectional forms of a word, like “one apple” and “two apples”. Languages may comprise paradigms to indicate verb tenses (past, present, future) or voice (active, passive). For example, the paradigm for “go” is “go, went, gone”. The notion of paradigm also indicates a set of words which cover a particular functional or thematic area. For instance, most languages include paradigms for economic actions (buy, sell, lend, repay…), or colors (red, blue, yellow…). A speaker may transform a sentence by replacing one word from a paradigm with another from the same paradigm and get a sentence that still makes sense. In the sentence “I bought a car”, you could easily replace “bought” with “sold” because “buy” and “sell” are part of the same paradigm: they have some meaning in common. But in that sentence, you can’t replace “bought” with “yellow” for instance. Two words from the same paradigm may be opposites (if you are buying, you are not selling) but still related (buying and selling can be interchangeable).

Words can also be related when they are in taxonomic relation, like “horse” and “animal”. The English dictionary describes a horse as a particular case of animal. Some words come from ancient words (etymology) or are composed of several words: for example, the word metalanguage is built from “meta” (beyond, in ancient Greek) and “language”.

In general, the conceptual relationships between words from a dictionary may be qualified as paradigmatic.

Syntagmatic relationships

By contrast, syntagmatic relations describe the grammatical connections between words in the same sentence. In the two following sentences: “The gazelle smells the presence of the lion” and “The lion smells the presence of the gazelle”, the sets of words are identical but the words “gazelle” and “lion” do not share the same grammatical role. Since those words are swapped in the syntagmatic structure, the sentences have distinct meanings.
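The gazelle and lion example can be made concrete with a short sketch. It is a deliberately naive illustration: the `roles` function is hypothetical and only handles sentences shaped like “The X smells the presence of the Y”, where a real system would rely on a syntactic parser. It shows that the two sentences are indistinguishable as bags of words but differ once grammatical roles are taken into account.

```python
from collections import Counter

def bag_of_words(sentence):
    """Unordered word counts: the view that ignores syntagmatic relations."""
    return Counter(sentence.lower().rstrip(".").split())

def roles(sentence):
    """Naive role assignment for 'The X smells the presence of the Y'."""
    words = sentence.lower().rstrip(".").split()
    verb_index = words.index("smells")
    return {
        "subject": words[verb_index - 1],  # the word just before the verb
        "verb": "smells",
        "sensed": words[-1],               # the last word of the sentence
    }

s1 = "The gazelle smells the presence of the lion"
s2 = "The lion smells the presence of the gazelle"

same_words = bag_of_words(s1) == bag_of_words(s2)
same_roles = roles(s1) == roles(s2)
```

Here `same_words` is true while `same_roles` is false: a representation limited to word counts loses exactly the information carried by the syntagmatic structure.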

Textual relationships

At the text level, which includes several syntagms, we find semantic relations like anaphoras and isotopies. Let’s consider the super-sentence: “If a man has talent and can’t use it, he’s failed.” (Thomas Wolfe). In this quotation “it” is an anaphora for “talent” and “he”, an anaphora for “a man”. When reading a pronoun (it, he), we resolve the anaphora when we know which noun – mentioned in a previous or following sentence – it is referring to. On the other hand, isotopies are recurrences of themes that weave the unity of a text: the identity of heroes (characters), genres (love stories or historical novels), settings, etc. The notion of isotopy also encompasses repetitions that help the listener understand a text.

Pragmatic interactions

Pragmatics weaves the triadic relation between signs (symbols, speeches or texts), beings (interpreters, people or interlocutors) and things (referents, objects, reality, extra-textual context). On the pragmatic level of communication, speeches point to – and act upon – a social context. A speech act functions as a move in a game played by its speaker. So, distinct from semantic meaning, which we analyzed in the previous section, pragmatic meaning addresses questions like: what kind of act (a piece of advice, a promise, a blame, a condemnation, etc.) is carried by a speech? Is a speech spoken in a play on a stage or in a real tribunal? The pragmatic meaning of a speech also relates to the actual effects of its utterance, effects that are not always known at the moment of the enunciation. For example: “Did I convince you? Have you kept your word?”. The sense of a speech can only be fully understood after its utterance, and future events can always modify it.

A speech act is highly dependent on cultural conventions, on the identity of speakers and attendees, time and place, etc. By proclaiming: “The session is open”, I am not just announcing that an official meeting is about to start, I am actually opening the session. But I have to be someone relevant or important like the president of that assembly to do so. If I am a janitor and I say: “The session is open”, the act is not performed because I don’t have any legitimacy to open the session.

If an utterance is descriptive, it’s either true or false. In other cases, if an utterance does something instead of describing a state of things, it has a pragmatic force instead of a truth value.

Resolving ambiguities

We have just reviewed the different layers of grammatical, semantic and pragmatic complexity to better understand the meaning of a text. Now, we are going to examine the ambiguities that may arise during the reading or listening of a text in a natural language.

Semantic ambiguities

How do we go from the sound of a chain of phonemes to the understanding of a text? From a sequence of sounds, we build a multi-layered (paradigmatic, syntagmatic and textual) semantic network. When weaving the paradigmatic layer, we answer questions like: “What is this word? To what paradigm does it belong? Which one of its meanings should I consider?”. Then, we connect words together by answering: “What are the syntagmatic relations between the words in that sentence?”. Finally, we comprehend the text by recognizing the anaphoras and isotopies that connect its sentences. Our understanding of a text is based on this three-layered network of sense units.

Furthermore, ambiguities or uncertainties of meaning in languages can happen on all three levels and can multiply their effects. In the case of homophony, the same sound can point to different words like in “ate” and “eight”. And sometimes, the same word may convey several distinct meanings like in “mole”: (1) a shortsighted mouse-like animal digging underground galleries, (2) an undercover spy, or (3) a pigmented spot or mark on the skin. In the case of synonymy, the same meaning can apply to distinct words like “tiny” and “small”. Amphibologies refer to syntagmatic ambiguities as in: “Mary saw a woman on the mountain with a telescope.” Who is on the mountain? Moreover, who has the telescope? Mary or the woman? On a higher level of complexity, textual relations can be even more ambiguous than paradigmatic and syntagmatic ones because rules for anaphoras and isotopies are loosely defined.
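To make the amphibology concrete, here is a minimal Python sketch that enumerates the readings raised by the two questions above (who is on the mountain, who has the telescope). It is deliberately naive: it assumes each prepositional phrase can only describe Mary or the woman, and it ignores further readings (for instance, the telescope standing on the mountain) as well as the syntactic constraints a real parser would apply. The names `PHRASES`, `ATTACHMENTS` and `enumerate_readings` are illustrative only.

```python
from itertools import product

# The two prepositional phrases from the example sentence.
PHRASES = ("on the mountain", "with a telescope")
# Simplifying assumption: each phrase describes either the observer or the observed.
ATTACHMENTS = ("Mary", "the woman")


def enumerate_readings():
    """Return every combination of attachments as a dict mapping phrase -> referent."""
    return [
        dict(zip(PHRASES, choice))
        for choice in product(ATTACHMENTS, repeat=len(PHRASES))
    ]


readings = enumerate_readings()
for reading in readings:
    print("; ".join(f"'{phrase}' describes {who}" for phrase, who in reading.items()))
```

Even under this simplification, one short sentence yields four candidate readings, which illustrates how ambiguities multiply when they combine.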

Resolving semantic ambiguities in pragmatic contexts

Human beings don’t always correctly resolve all the semantic ambiguities of a speech, but when they do, it is often because they take into account the pragmatic (or extra-textual) context, which is generally implicit. It is in context that deictic expressions such as here, you, me, that one over there or next Tuesday take on their full meaning. Comparing the text at hand with the author’s corpus, genre and historical period also helps to discern its meaning. But some pragmatic aspects of a text may remain unknown. Ambiguities can stem from many causes: uncertainty about the precise referents of a speech or about the speaker’s social interactions, the ambivalence or concealment of the speaker’s intentions, and of course not knowing in advance the effects of an utterance.

Problems in natural language processing

Computer programs can’t understand or translate texts with dictionaries and grammars alone. They can’t engage in the pragmatic context of speeches like human beings do to disambiguate texts unless this context is made explicit. Understanding a text implies building and comparing complex and dynamic mental models of text and context.

On the other hand, natural language processing (a sub-discipline of artificial intelligence) compensates for the irregularity of natural languages with extensive statistical calculation and deep learning algorithms trained on huge corpora. Depending on its training set, an algorithm can interpret a text by choosing the most probable semantic network among those compatible with a chain of phonemes. The results still have to be validated and improved by human reviewers.
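The idea of choosing the most probable interpretation can be illustrated with a toy Python sketch. It ranks the three senses of “mole” mentioned earlier according to hand-written cue words. Everything here is an assumption made for illustration: the cue words, their weights and the function name `rank_senses` are invented, and a real system would learn such statistics from a large corpus rather than from a small table.

```python
from collections import Counter

# Invented cue-word weights per sense; a real system would estimate these from a corpus.
SENSE_CUES = {
    "animal": Counter({"garden": 3, "lawn": 2, "tunnel": 3, "dig": 2}),
    "spy": Counter({"agency": 3, "secret": 2, "leak": 2, "embassy": 2}),
    "skin mark": Counter({"skin": 3, "cheek": 2, "doctor": 2, "dermatologist": 3}),
}


def rank_senses(context_words):
    """Return (sense, probability) pairs, most probable first."""
    scores = {}
    for sense, cues in SENSE_CUES.items():
        score = 1.0
        for word in context_words:
            # Add-one smoothing: an unseen word must not zero out the whole score.
            score *= cues[word] + 1
        scores[sense] = score
    total = sum(scores.values())
    ranked = [(sense, value / total) for sense, value in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


ranking = rank_senses(["a", "mole", "ruined", "the", "lawn", "in", "my", "garden"])
best_sense, best_probability = ranking[0]
print(best_sense, round(best_probability, 2))
```

The output is only a ranking of probabilities, not an understanding of the sentence, which is why human validation remains necessary.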

I put forward in this paper a vision for a new generation of cloud-based public communication service designed to foster reflexive collective intelligence. I begin with a description of the current situation, including the huge power and social shortcomings of platforms like Google, Apple, Facebook, Amazon, Microsoft, Alibaba, Baidu, etc. Contrasting with the practice of these tech giants, I reassert the values that are direly needed at the foundation of any future global public sphere: openness, transparency and commonality. But such ethical and practical guidelines are probably not powerful enough to help us cross a new threshold in collective intelligence. Only a disruptive innovation in cognitive computing will do the trick. That’s why I introduce “deep meaning”, a new research program in artificial intelligence based on the Information Economy MetaLanguage (IEML). I conclude this paper by evoking possible bootstrapping scenarios for the new public platform.

The rise of platforms

At the end of the 20th century, one percent of the human population was connected to the Internet. In 2017, more than half the population is connected. Most users interact on social media, search for information and buy products and services online. But despite the ongoing success of digital communication, there is growing dissatisfaction with the big tech companies – the “Silicon Valley” – that dominate the new communication environment.

The big techs are the most valued companies in the world and the massive amount of data that they possess is considered the most precious good of our time. Silicon Valley owns the big computers: the network of physical centers where our personal and business data are stored and processed. Their income comes from their economic exploitation of our data for marketing purposes and from their sales of hardware, software or services. But they also derive considerable power from the knowledge of markets and public opinions that stems from their information control.

The big cloud companies master new computing techniques mimicking neurons when they learn a new behavior. These programs are marketed as deep learning or artificial intelligence even if they have no cognitive autonomy and need some intense training by humans before becoming useful. Despite their well known limitations, machine learning algorithms have effectively augmented the abilities of digital systems. Deep learning is now used in every economic sector. Chips specialized in deep learning are found in big data centers, smartphones, robots and autonomous vehicles. As Vladimir Putin rightly told young Russians in his speech for the first day of school in fall 2017: “Whoever becomes the leader in this sphere [of artificial intelligence] will become the ruler of the world”.

The tech giants control huge business ecosystems beyond their official legal borders and they can ruin or buy competitors. Unfortunately, the rivalry between the big tech companies prevents real interoperability between cloud services, even though such interoperability would be in the interest of the general public and of many smaller businesses. As if their technical and economic powers were not enough, the big tech companies are now moving onto the turf of governments. Facebook vouches for our identity and tells our family and friends that we are safe when a terrorist attack or a natural disaster occurs. Mark Zuckerberg states that one of Facebook’s missions is to ensure that the electoral process is fair and open in democratic countries. Google Earth and Google Street View are now used by several municipal authorities and governments as their primary source of information for cadastral plans and other geographical or geospatial services. Twitter has become an official global political, diplomatic and news service. Microsoft sells its digital infrastructure to public schools. The kingdom of Denmark opened an official embassy in Silicon Valley. Cryptocurrencies independent from nation states (like Bitcoin) are becoming increasingly popular. Blockchain-based smart contracts (powered by Ethereum) bypass state authentication and traditional paper bureaucracies. Some traditional functions of government are being taken over by private technological ventures.

This should not come as a surprise. The practice of writing in ancient palace-temples gave birth to government as a separate entity. Alphabet and paper allowed the emergence of merchant city-states and the expansion of literate empires. The printing press, industrial economy, motorized transportation and electronic media sustained nation-states. The digital revolution will foster new forms of government. Today, we discuss political problems in a global public space taking advantage of the web and social media, and the majority of humans live in interconnected cities and metropoles. Each urban node wants to be an accelerator of collective intelligence, a smart city. We need to think about public services in a new way. Schools, universities, public health institutions, mail services, archives, public libraries and museums should take full advantage of the internet and de-silo their datasets. But we should go further. Are current platforms doing their best to enhance collective intelligence and human development? How about giving back to the general population the data produced in social media and other cloud services, instead of just monetizing it for marketing purposes? How about giving people access to the cognitive powers unleashed by a ubiquitous algorithmic medium?

Information wants to be open, transparent and common

We need a new kind of public sphere: a platform in the cloud where data and metadata would be our common good, dedicated to the recording and collaborative exploitation of memory in the service of our collective intelligence. The core values orienting the construction of this new public sphere should be openness, transparency and commonality.

First, openness has already been tried out in the scientific community, the free software movement, Creative Commons licensing, Wikipedia and many more endeavors. It has been adopted by several big industries and governments. “Open by default” will soon be the new normal. Openness is on the rise because it maximizes the improvement of goods and services, fosters trust and supports collaborative engagement. It can be applied to data formats, operating systems, abstract models, algorithms and even hardware. Openness also applies to taxonomies, ontologies, search architectures, etc. A new open public space should encourage all participants to create, comment, categorize, assess and analyze its content.

Then, transparency is the very ground for trust and the precondition of an authentic dialogue. Data and people (including the administrators of a platform) should be traceable and auditable. Transparency should be reciprocal, without distinction between the rulers and the ruled. Such transparency will ultimately be the basis for reflexive collective intelligence, allowing teams and communities of any size to observe and compare their cognitive activity.

Commonality means that people will not have to pay to get access to this new public sphere: all will be free and public property. Commonality also means transversality: de-siloing and cross-pollination. Smart communities will interconnect and recombine all kinds of useful information: open archives of libraries and museums, free academic publications, shared learning resources, knowledge management repositories, open-source intelligence datasets, news, public legal databases…

From deep learning to deep meaning

This new public platform will be based on the web and its open standards like HTTP, URL, HTML, etc. Like all current platforms, it will take advantage of distributed computing in the cloud and it will use “deep learning”: an artificial intelligence technology that employs specialized chips and algorithms that roughly mimic the learning process of neurons. Finally, to be completely up to date, the next public platform will enable blockchain-based payments, transactions, contracts and secure records.

If a public platform offers the same technologies as the big tech (cloud, deep learning, blockchain), with the sole difference of openness, transparency and commonality, it may prove insufficient to foster a swift adoption, as is demonstrated by the relative failures of Diaspora (open Facebook) and Mastodon (open Twitter). Such a project may only succeed if it comes up with some technical advantage compared to the existing commercial platforms. Moreover, this technical advantage should have appealing political and philosophical dimensions.

No one really fancies the dream of autonomous machines, especially considering the current limitations of artificial intelligence. Instead, we want an artificial intelligence designed for the augmentation of human personal and collective intellect. That’s why, in addition to the current state of the art, the new platform will integrate the brand new deep meaning technology. Deep meaning will expand the current reach of artificial intelligence, improve the user experience of big data analytics and allow the reflexivity of personal and collective intelligence.

Language as a platform

In a nutshell, deep learning models neurons and deep meaning models language. In order to augment the human intellect, we need both! Right now, deep learning is based on the simulation of neural networks. This is enough to roughly model animal cognition (every animal species has neurons) but it is not refined enough to model human cognition. The difference between animal cognition and human cognition is the reflexive thinking that comes from language, which adds a layer of semantic addressing on top of neural connectivity. Speech production and understanding are innate properties of individual human brains. But as humanity is a social species, language is a property of human societies. Languages are conventional, shared by members of the same culture and learned by social contact. In human cognition, the categories that organize perception, action, memory and learning are expressed linguistically so they may be reflected upon and shared in conversations. A language works like the semantic addressing system of a social virtual database.

But there is a problem with natural languages (English, French, Arabic, etc.): they are irregular and do not lend themselves easily to machine understanding or machine translation. The current trend in natural language processing, an important field of artificial intelligence, is to use statistical algorithms and deep learning methods to understand and produce linguistic data. But instead of using statistics, deep meaning adopts a regular and computable metalanguage. I have designed IEML (Information Economy MetaLanguage) from the beginning to optimize semantic computing. IEML words are built from six primitive symbols and two operations: addition and multiplication. The semantic relations between IEML words follow the lines of their generative operations. The total number of words does not exceed 10,000. From its dictionary, the generative grammar of IEML allows the construction of sentences at three layers of complexity: topics are made of words, phrases (facts, events) are made of topics and super-phrases (theories, narratives) are made of phrases. The highest meaning unit, or text, is a unique set of sentences. Deep meaning technology uses IEML as the semantic addressing system of a social database.
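The layered containment described above (words, topics, phrases, super-phrases, and a text as a set of sentences) can be pictured with a small Python sketch. This is not the IEML implementation and does not use real IEML notation: the class names, the `label` field and the helper `make_text` are hypothetical, and the sketch models only the nesting of units, not the six primitives or the addition and multiplication operations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Word:
    label: str  # placeholder label, not real IEML notation


@dataclass(frozen=True)
class Topic:
    words: Tuple[Word, ...]  # topics are made of words


@dataclass(frozen=True)
class Phrase:
    topics: Tuple[Topic, ...]  # phrases (facts, events) are made of topics


@dataclass(frozen=True)
class SuperPhrase:
    phrases: Tuple[Phrase, ...]  # super-phrases (theories, narratives) are made of phrases


def make_text(*sentences):
    """Model a text as a set of sentences: duplicates collapse, order is irrelevant."""
    return frozenset(sentences)


city = Topic((Word("city"),))
memory = Topic((Word("collective"), Word("memory")))
fact = Phrase((city, memory))
theory = SuperPhrase((fact,))

text_a = make_text(city, fact, theory, fact)
text_b = make_text(theory, city, fact)
print(len(text_a), text_a == text_b)
```

Because every unit is immutable and hashable, two texts built from the same sentences compare equal regardless of order, which matches the idea of a text as a unique set of sentences.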

Given large datasets, deep meaning allows the automatic computing of semantic relations between data, semantic analysis and semantic visualizations. This new technology fosters semantic interoperability: it decompartmentalizes tags, folksonomies, taxonomies, ontologies and languages. When online communities categorize, assess and exchange semantic data, they generate explorable ecosystems of ideas that represent their collective intelligence. Note that the vision of collective intelligence proposed here is distinct from the “wisdom of the crowd” model, which assumes independent agents and excludes dialogue and reflexivity. On the contrary, deep meaning was designed from the beginning to nurture dialogue and reflexivity.

The main functions of the new public sphere

In the new public sphere, every netizen will act as an author, editor, artist, curator, critic, messenger, contractor and gamer. The next platform weaves five functions together: curation, creation, communication, transaction and immersion.

By curation I mean the collaborative creation, editing, analysis, synthesis, visualization, explanation and publication of datasets. People posting, liking and commenting on content on social media are already doing data curation, in a primitive, simple way. Active professionals in the fields of heritage preservation (libraries, museums), digital humanities, education, knowledge management, data-driven journalism or open-source intelligence practice data curation in a more systematic and mindful manner. The new platform will offer a consistent service of collaborative data curation empowered by a common semantic addressing system.

Augmented by deep meaning technology, our public sphere will include a semantic metadata editor applicable to any document format. It will work as a registration system for the works of the mind. Communication will be ensured by a global Twitter-like public posting system. But instead of the current hashtags, which are mere sequences of characters, the new semantic tags will self-translate into all natural languages and interconnect by conceptual proximity. The blockchain layer will allow any transaction to be recorded. The platform will remunerate authors and curators in collective intelligence coins, according to the public engagement generated by their work. The new public sphere will be grounded in the internet of things, smart cities, ambient intelligence and augmented reality. People will control their environment and communicate with sensors, software agents and bots of all kinds in the same immersive semantic space. Virtual worlds will simulate the collective intelligence of teams, networks and cities.

Bootstrapping

This IEML-based platform was developed between 2002 and 2017 at the University of Ottawa. A prototype is currently in a pre-alpha version, featuring the curation functionality. An alpha version will be demonstrated in the summer of 2018. How can we bridge the gap from fundamental research to a full-scale industrial platform? Such an endeavor will be much less expensive than the conquest of space and could bring a tremendous augmentation of human collective intelligence. Although the network effect obviously applies to the new public space, small communities of pioneers will benefit immediately from its early release. On the humanistic side, I have already mentioned museums and libraries, researchers in humanities and social science, collaborative learning networks, data-oriented journalists, knowledge management and business intelligence professionals, etc. On the engineering side, deep meaning opens a new sub-field of artificial intelligence that will enhance current techniques of big data analytics, machine learning, natural language processing, internet of things, augmented reality and other immersive interfaces. Because it is open source by design, the development of the new technology can be crowdsourced and shared easily among many different actors.

Let’s draw a distinction between the new public sphere, including its semantic coordinate system, and the commercial platforms that will give access to it. This distinction being made, we can imagine a consortium of big tech companies, universities and governments supporting the development of the global public service of the future. We may also imagine one of the big tech companies taking the lead in order to associate its name with the new platform and developing hardware specialized in deep meaning. Another scenario is the foundation of a company that will ensure the construction and maintenance of the new platform as a free public service while sustaining itself by offering semantic services: research, consulting, design and training. In any case, a new international school must be established around a virtual dockyard where trainees and trainers progressively build and improve the semantic coordinate system and other basic models of the new platform. Students from various organizations and backgrounds will gain experience in the field of deep meaning and will disseminate the acquired knowledge back into their communities.

Radio broadcast (French-speaking Switzerland), 25 minutes, in French.

Sémantique numérique et réseaux sociaux. Vers un service public planétaire (“Digital semantics and social networks: toward a planetary public service”), 1 h, in French.

YouTube video (in English), 1 h.

What is IEML?

  • IEML (Information Economy MetaLanguage) is an open (GPL3) and free artificial metalanguage that is simultaneously a programming language, a pivot between natural languages and a semantic coordinate system. When data are categorized in IEML, the metalanguage computes their semantic relationships and distances.
  • From a “social” point of view, online communities categorizing data in IEML generate explorable ecosystems of ideas that represent their collective intelligence.
  • GitHub.

What problems does IEML solve?

  • Decompartmentalization of tags, folksonomies, taxonomies, ontologies and languages (French and English for now).
  • Semantic search, automatic computing and visualization of semantic relations and distances between data.
  • Giving back to the users the information that they produce, enabling reflexive collective intelligence.

Who is IEML for?

Content curators

  • knowledge management
  • marketing
  • curation of open data from museums and libraries, crowdsourced curation
  • education, collaborative learning, connectivist MOOCs
  • watch, intelligence

Self-organizing online communities

  • smart cities
  • collaborative teams
  • communities of practice…

Researchers

  • artificial intelligence
  • data analytics
  • humanities and social sciences, digital humanities

What motivates people to adopt IEML?

  • IEML users participate in the leading edge of digital innovation, big data analytics and collective intelligence.
  • IEML can enhance other AI techniques like machine learning, deep learning, natural language processing and rule-based inference.

IEML tools

IEML v.0

IEML v.0 includes…

  • A dictionary of concepts whose editing is restricted to specialists but whose navigation and use are open to all.
  • A library of tags – called USLs (Uniform Semantic Locators) – whose editing, navigation and use are open to all.
  • An API allowing access to the dictionary, the library and their functionalities (semantic computing).

Intlekt v.0

Intlekt v.0 is a collaborative data curation tool that allows:

  • the categorization of data in IEML,
  • the semantic visualization of collections of data categorized in IEML,
  • the publication of these collections.

The prototype (to be released in May 2018) will be single-user but the full-blown app will be social.

Who made it?

The IEML project is designed and led by Pierre Lévy.

It has been financed by the Canada Research Chair in Collective Intelligence at the University of Ottawa (2002-2016).

At an early stage (2004-2011), Steve Newcomb and Michel Biezunski contributed to the design and implementation (parser, dictionary). Christian Desjardins implemented a second version of the dictionary. Andrew Roczniak helped with the first mathematical formalization and implemented a second version of the parser and a third version of the dictionary (2004-2016).

The 2016 version was implemented by Louis van Beurden, Hadrien Titeux (chief engineers), Candide Kemmler (project management, interface), Zakaria Soliman and Alice Ribaucourt.

The 2017 version (1.0) was implemented by Louis van Beurden (chief engineer), Eric Waldman (IEML editing interface, visualization), Sylvain Aube (Drupal), Ludovic Carré and Vincent Lefoulon (collections and tags management).

Dice sculpture by Tony Cragg

Having laid out in a previous post the principles of a cartography of collective intelligence, I now turn to human development, which is the correlate, the condition and the effect of collective intelligence. As a first step, I will square the semiotic triad sign/being/thing (star/face/cube) to obtain the nine “becomings”, which point toward the main directions of human development.

Map of the becomings

The nine paths that lead from one of the three semiotic poles to itself or to the two others are called becomings in IEML (see the semantic map M:M:. in the IEML dictionary). A becoming cannot be reduced to its starting point, to its end point, or to the sum of the two; it lies in the in-between, in the metamorphosis of the one into the other. Thus memory ultimately means “the sign becoming a thing”. Note also that each of the nine becomings can turn toward the actual as well as toward the virtual. For example, thought can take as its object sensible reality as well as its own speculations. At the other end of the spectrum, space can refer to the container of physical materiality as well as to the idealities of geometry. In the course of our exploration, we will discover that each of the nine becomings indicates a possible direction of exploration for philosophy. The nine becomings are at once conceptually distinct and really interdependent, since each of them needs the support of the others to unfold.
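The squaring of the triad can be written out as a small Python sketch: every ordered pair (substance, attribute) drawn from sign, being and thing yields one becoming. The English names and the IEML codes are taken from the sections that follow. Two points are assumptions of this sketch: space is placed at thing-to-thing by elimination, and its IEML code is left as `None` because it is not given here. The identifiers `POLES`, `BECOMINGS` and `becoming` are illustrative only.

```python
from itertools import product

POLES = ("sign", "being", "thing")

# (substance, attribute) -> (name, IEML code as given in the sections below).
BECOMINGS = {
    ("sign", "sign"): ("thought", "s."),
    ("sign", "being"): ("language", "b."),
    ("sign", "thing"): ("memory", "t."),
    ("being", "sign"): ("society", "k."),
    ("being", "being"): ("affect", "m."),
    ("being", "thing"): ("world", "n."),
    ("thing", "sign"): ("truth", "d."),
    ("thing", "being"): ("life", "f."),
    # Assumption: placed here by elimination; code not stated in this text.
    ("thing", "thing"): ("space", None),
}


def becoming(substance, attribute):
    """Return the name of the becoming that leads from substance to attribute."""
    return BECOMINGS[(substance, attribute)][0]


# Print the 3 x 3 map, one row per substance.
for substance in POLES:
    print(substance, "->", [becoming(substance, attribute) for attribute in POLES])
```

Reading a row gives the three becomings that start from the same pole; reading a column gives the three that arrive at the same pole.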

Thought

In thought – s. in IEML – both the substance (starting point) and the attribute (end point) are signs. Thought is, in a way, the sign squared. It marks the transformation of one sign into another sign, as in deduction, induction, interpretation, imagination and so on.

The concept of thought or intellection is central to the Western idealist tradition that starts with Plato and runs notably through Aristotle, the Neoplatonists, the theologians of the Middle Ages, Kant and Hegel, up to Husserl. Intellection is also at the heart of Islamic philosophy, both in Avicenna (Ibn Sina) and his successors in Iranian philosophy up to the 17th century and in the Andalusian Averroes (Ibn Rushd). It is so again for most of the great philosophies of meditative India. Human existence, and philosophical existence even more so, is necessarily immersed in reflective discursive thought. Where does this thought originate? What are its structures? How can human thought be led to its perfection? These are questions that philosophical inquiry cannot evade.

Language

Language – b. in IEML – is understood here as a code of communication (in the broadest sense of the term) that actually operates in the human universe. Language is a “becoming-being of the sign”, a transformation of the sign into intelligence, an illumination of the subject by the sign.

Some philosophies take the problems of language and communication as their starting point. Wittgenstein, for example, made his philosophy revolve largely around the problem of the limits of language. But it should be noted that he was also interested in questions of logic and in the problem of truth. In a different style, a philosopher like Peirce never stopped deepening the question of meaning and of the functioning of signs. Austin explored the theme of speech acts, and so on. This becoming thus designates the semiotic (or linguistic) moment of philosophy. The human being is a speaking being whose existence can only be realized through and in language.

Memory

In memory – t. in IEML – the sign as substance is reified in its attribute, the thing. This becoming evokes the elementary gesture of inscription or recording. The becoming-thing of the sign is considered here as the condition of possibility of memory. It governs the very notion of time.

The passage of time and its inscription – memory – was one of the favorite themes of Bergson (author, notably, of Matter and Memory). Bergson placed the thickness of life and the evolutionary surge of creation on the side of memory, as opposed to the physicalist determinism of the 19th century (“matter”) and logico-mathematical mechanism, both assigned to space. A fine analysis of the passage of time and its inscription can also be found in the philosophies of impermanence and karma, such as Buddhism. Evolutionism in general, whether cosmic, biological or cultural, rests on a dialectic between the passage of time and the retention of a coded memory. Finally, many great religious traditions are founded on sacred scriptures that belong to the same archetype of inscription. In a sense, because we are inevitably subject to temporal sequentiality, our existence is memory: the short-term memory of perception, the long-term memory of recollection and learning, the individual memory in which collective memories come back to life and flow together.

Society

In society – k. in IEML – a community of beings organizes itself by means of signs. We commit ourselves through promises and contracts. We obey the law. The members of a clan share the same totemic animal. We fight under the same flag. We exchange economic goods by agreeing on their value. We listen to music together and we share the same language. In all these cases, as in many others, a community of humans converges and creates a social unit by attaching itself to the same conventional signifying reality: so many ways of “making society”.

Sociology is known to be an offshoot of philosophy. Even before the discipline of sociology branched off from the common trunk, the social moment of philosophy was illustrated by great names: Jean-Jacques Rousseau and his theory of the contract, Auguste Comte, who made knowledge culminate in the science of societies, and Karl Marx, who made class struggle the engine of history and reduced the economy, politics and culture in general to “real social relations”. Durkheim, Mauss, Weber and their successors among sociologists and anthropologists investigated the mechanisms by which we “make society”. Man is a political animal who cannot but live in society. How can we enliven philia, the bond of friendship between members of the same community? Which are the true or the good societies? Spiritual, cosmopolitan, imperial, civic, national…? What are the best political regimes? These questions remain open.

Affect

In affect – m. in IEML – a being orients itself toward other beings, or determines its most intimate interiority. Affect is understood here as the tropism of subjectivity. Desire, love, hatred, indifference, compassion and equanimity are emotional qualities that circulate between beings.

After the poets, the devout and the actors, Freud, psychoanalysis and a good part of clinical psychology insist on the importance of affect and emotional functions for understanding human existence. The importance of “emotional intelligence” has been much emphasized recently. But this is nothing new. Philosophers have long inquired into love (see Plato’s Symposium) and the passions (Descartes himself wrote a Treatise on the Passions), even if they do not always make these the central theme of their philosophy. Existence necessarily struggles with affective problems because no human life can escape emotions, attraction and repulsion, joy and sadness. But are emotions legitimate expressions of our spontaneous nature, or “poisons of the mind” (in the forceful Buddhist expression) that must not be allowed to govern our existence? Or both? Many philosophical schools, in the East as well as in the West, have praised ataraxia, mental calm or, at the very least, moderation of the passions. But how can the passions be mastered, and how can they be mastered without being known?

World

In the world – n. in IEML – human beings (being as substance) express themselves in their physical environment (thing as attribute). They inhabit this environment, work it with tools, name its parts and objects, and assign values to them. This is how a culturally ordered world, a cosmos, is built.

Nietzsche (who gave a central role to the creation of values), like anthropological thought, bases his approach mainly on the concept of “world”, or of a cosmos organized by human culture. The all-encompassing Indian notion of dharma ultimately refers to a transcendent cosmic order that seeks to manifest itself down to the smallest details of existence. Philosophical inquiry into justice connects with this idea that human acts are in resonance or in dissonance with a universal order. But what is the “way” (the Dao of Chinese philosophy) of this order? Is its universality natural or conventional? What principles does it obey?

Truth

Truth (d. in IEML) describes a “becoming sign of the thing.” A reference (a state of affairs) manifests itself through a declarative message (a sign). A statement is true only if it contains a correct description of a state of affairs. Authenticity is said of a sign that guarantees a thing.

The logical tradition and analytic philosophy are mainly concerned with the concept of truth (in the sense of the accuracy of facts and reasoning) and with problems related to reference. Epistemology and the cognitive sciences that belong to this current place the construction of true knowledge at the foundation of their approach. But beyond these specializations, the question of truth is an obligatory passage point of philosophical inquiry. Even the most skeptical cannot renounce truth without renouncing their own skepticism. If one wants to emphasize its stability and coherence, one will derive it from the laws of logic and from rigorous procedures of empirical verification. But if one wants to emphasize its fragility and multiplicity, one will have it secreted by paradigms (in Kuhn’s sense), epistemes and social constructions of meaning, all of which vary across times and places.

Life

In life (f. in IEML), a substantial thing (the materiality of the body) takes on the attribute of being, with its quality of subjective interiority. Life thus evokes the physical incarnation of a sentient creature. When a living being eats and drinks, it transforms objectified entities into materials and fuels for the organic processes that support its subjectivity: a becoming being of the thing.

Empiricists ground knowledge in the senses. Phenomenologists analyze, in particular, the way things appear to us in perception. Biologism reduces the functioning of the mind to that of neurons or hormones. These are so many traditions and points of view that, despite their differences, converge on the human organism, its functions and its sensibility. Many great philosophers were biologists (Aristotle, Darwin) or physicians (Hippocrates, Avicenna, Maimonides…). Chinese medicine and Chinese philosophy are deeply interrelated. It is undeniable that human existence emanates from a living body and that all the events of this existence are inscribed in one way or another in this body.

Space

In space (l. in IEML), whether concrete or abstract, a thing connects to other things and manifests itself in the universe of things. Space is a system of transformation of things. It is built from topological relations and geometric proximities, from territories, envelopes, limits and paths, closures and passages. Space manifests, in a way, the superlative essence of the thing, as thought manifested that of the sign and affect that of being.

On a philosophical level, geometers, topologists, atomists, materialists and physicists ground their conceptions in space. As I pointed out above, idealist geometrism and materialist atomism converge on the foundational importance of space. Atoms are in the void, that is, in space. Human existence necessarily projects itself into the spatial multitude that it constructs and inhabits: physical or imaginary geographies, urban or rural landscapes, architectures of concrete or of concepts, geometric distances or topological connections, folds and networks without end.

Philosophies can thus be characterized according to the becoming or becomings that they take as the starting point of their approach or that constitute their favorite theme. The IEML becomings represent “obligatory passage points” of existence. From its alphabet onward, the metalanguage opens the semantic sphere to the expression of any philosophy whatsoever, exactly as a natural language does. But it is also a philosophical language, designed to avoid cognitive blind spots and the limiting reflexes of thought that come from the exclusive use of a single natural language, from the practice of a single discipline that has become second nature, or from overly exclusive philosophical points of view. It was built precisely to foster the free exploration of all semantic directions. This is why, in IEML, each philosophy appears as a combination of partial points of view on an integral semantic sphere that can accommodate them all and interweaves them in its radical circularity.

Suggested citation: « The Philosophical Concept of Algorithmic Intelligence », Spanda Journal special issue on “Collective Intelligence”, V (2), December 2014, p. 17-25. The original text is freely available online at Spanda.

“Transcending the media, airborne machines will announce the voice of the many. Still indiscernible, cloaked in the mists of the future, bathing another humanity in its murmuring, we have a rendezvous with the over-language.” Collective Intelligence, 1994, p. xxviii.

Twenty years after Collective Intelligence

This paper was written in 2014, twenty years after L’intelligence collective [the original French edition of Collective Intelligence].[2] The main purpose of Collective Intelligence was to formulate a vision of a cultural and social evolution that would be capable of making the best use of the new possibilities opened up by digital communication. Long before the success of social networks on the Web,[3] I predicted the rise of “engineering the social bond.” Eight years before the founding of Wikipedia in 2001, I imagined an online “cosmopedia” structured in hypertext links. When the digital humanities and the social media had not even been named, I was calling for an epistemological and methodological transformation of the human sciences. But above all, at a time when less than one percent of the world’s population was connected,[4] I was predicting (along with a small minority of thinkers) that the Internet would become the centre of the global public space and the main medium of communication, in particular for the collaborative production and sharing of knowledge and the dissemination of news.[5] In spite of the considerable growth of interactive digital communication over the past twenty years, we are still far from the ideal described in Collective Intelligence. It seemed to me already in 1994 that the anthropological changes under way would take root and inaugurate a new phase in the human adventure only if we invented what I then called an “over-language.” How can communication readily reach across the multiplicity of dialects and cultures? How can we map the deluge of digital data, order it around our interests and extract knowledge from it? How can we master the waves, currents and depths of the software ocean? Collective Intelligence envisaged a symbolic system capable of harnessing the immense calculating power of the new medium and making it work for our benefit. 
But the over-language I foresaw in 1994 was still in the “indiscernible” period, shrouded in “the mists of the future.” Twenty years later, the curtain of mist has been partially pierced: the over-language now has a name, IEML (acronym for Information Economy MetaLanguage), a grammar and a dictionary.[6]

Reflexive collective intelligence

Collective intelligence drives human development, and human development supports the growth of collective intelligence. By improving collective intelligence we can place ourselves in this feedback loop and orient it in the direction of a self-organizing virtuous cycle. This is the strategic intuition that has guided my research. But how can we improve collective intelligence? In 1994, the concept of digital collective intelligence was still revolutionary. In 2014, this term is commonly used by consultants, politicians, entrepreneurs, technologists, academics and educators. Crowdsourcing has become a common practice, and knowledge management is now supported by the decentralized use of social media. The interconnection of humanity through the Internet, the development of the knowledge economy, the rush to higher education and the rise of cloud computing and big data are all indicators of an increase in our cognitive power. But we have yet to cross the threshold of reflexive collective intelligence. Just as dancers can only perfect their movements by reflecting them in a mirror, just as yogis develop awareness of their inner being only through the meditative contemplation of their own mind, collective intelligence will only be able to set out on the path of purposeful learning and thus move on to a new stage in its growth by achieving reflexivity. It will therefore need to acquire a mirror that allows it to observe its own cognitive processes. Be careful! Collective intelligence does not and will not have autonomous consciousness: when I talk about reflexive collective intelligence, I mean that human individuals will have a clearer and better-shared knowledge than they have today of the collective intelligence in which they participate, a knowledge based on transparent principles and perfectible scientific methods.

The key: A complete modelling of language

But how can a mirror of collective intelligence be constructed? It is clear that the context of reflection will be the algorithmic medium or, to put it another way, the Internet, the calculating power of cloud computing, ubiquitous communication and distributed interactive mobile interfaces. Since we can only reflect collective intelligence in the algorithmic medium, we must yield to the nature of that medium and have a calculable model of our intelligence, a model that will be fed by the flows of digital data from our activities. In short, we need a mathematical (with calculable models) and empirical (based on data) science of collective intelligence. But, once again, is such a science possible? Since humanity is a species that is highly social, its intelligence is intrinsically social, or collective. If we had a mathematical and empirical science of human intelligence in general, we could no doubt derive a science of collective intelligence from it. This leads us to a major problem that has been investigated in the social sciences, the human sciences, the cognitive sciences and artificial intelligence since the twentieth century: is a mathematized science of human intelligence possible? It is language or, to put it another way, symbolic manipulation that distinguishes human cognition. We use language to categorize sensory data, to organize our memory, to think, to communicate, to carry out social actions, etc. My research has led me to the conclusion that a science of human intelligence is indeed possible, but on the condition that we solve the problem of the mathematical modelling of language. 
I am speaking here of a complete scientific modelling of language, one that would not be limited to the purely logical and syntactic aspects or to statistical correlations of corpora of texts, but would be capable of expressing semantic relationships formed between units of meaning, and doing so in an algebraic, generative mode.[7] Convinced that an algebraic model of semantics was the key to a science of intelligence, I focused my efforts on discovering such a model; the result was the invention of IEML.[8] IEML—an artificial language with calculable semantics—is the intellectual technology that will make it possible to find answers to all the above-mentioned questions. We now have a complete scientific modelling of language, including its semantic aspects. Thus, a science of human intelligence is now possible. It follows, then, that a mathematical and empirical science of collective intelligence is possible. Consequently, a reflexive collective intelligence is in turn possible. This means that the acceleration of human development is within our reach.

The scientific file: The Semantic Sphere

I have written two volumes on my project of developing the scientific framework for a reflexive collective intelligence, and I am currently writing the third. This trilogy can be read as the story of a voyage of discovery. The first volume, The Semantic Sphere 1 (2011),[9] provides the justification for my undertaking. It contains the statement of my aims, a brief intellectual autobiography and, above all, a detailed dialogue with my contemporaries and my predecessors. With a substantial bibliography,[10] that volume presents the main themes of my intellectual process, compares my thoughts with those of the philosophical and scientific tradition, engages in conversation with the research community, and finally, describes the technical, epistemological and cultural context that motivated my research. Why write more than four hundred pages to justify a program of scientific research? For one very simple reason: no one in the contemporary scientific community thought that my research program had any chance of success. What is important in computer science and artificial intelligence is logic, formal syntax, statistics and biological models. Engineers generally view social sciences such as sociology or anthropology as nothing but auxiliary disciplines limited to cosmetic functions: for example, the analysis of usage or the experience of users. In the human sciences, the situation is even more difficult. All those who have tried to mathematize language, from Leibniz to Chomsky, to mention only the greatest, have failed, achieving only partial results. Worse yet, the greatest masters, those from whom I have learned so much, from the semiologist Umberto Eco[11] to the anthropologist Lévi-Strauss,[12] have stated categorically that the mathematization of language and the human sciences is impracticable, impossible, utopian. 
The path I wanted to follow was forbidden not only by the habits of engineers and the major authorities in the human sciences but also by the nearly universal view that “meaning depends on context,”[13] unscrupulously confusing mathematization and quantification, denouncing on principle, in a “knee jerk” reaction, the “ethnocentric bias” of any universalist approach[14] and recalling the “failure” of Esperanto.[15] I have even heard some of the most agnostic speak of the curse of Babel. It is therefore not surprising that I want to make a strong case in defending the scientific nature of my undertaking: all explorers have returned empty-handed from this voyage toward mathematical language, if they returned at all.

The metalanguage: IEML

But one cannot go on forever announcing one’s departure on a voyage: one must set forth, navigate . . . and return. The second volume of my trilogy, La grammaire d’IEML,[16] contains the very technical account of my journey from algebra to language. In it, I explain how to construct sentences and texts in IEML, with many examples. But that 150-page book also contains 52 very dense pages of algorithms and mathematics that show in detail how the internal semantic networks of that artificial language can be calculated and translated automatically into natural languages. To connect a mathematical syntax to a semantics in natural languages, I had to, almost single-handed,[17] face storms on uncharted seas, to advance across the desert with no certainty that fertile land would be found beyond the horizon, to wander for twenty years in the convoluted labyrinth of meaning. But by gradually joining sign, being and thing in turn in the sense of the virtual and actual, I finally had my Ariadne’s thread, and I made a map of the labyrinth, a complicated map of the metalanguage, that “Northwest Passage”[18] where the waters of the exact sciences and the human sciences converged. I had set my course in a direction no one considered worthy of serious exploration since the crossing was thought impossible. But, against all expectations, my journey reached its goal. The IEML Grammar is the scientific proof of this. The mathematization of language is indeed possible, since here is a mathematical metalanguage. What is it exactly? IEML is an artificial language with calculable semantics that puts no limits on the possibilities for the expression of new meanings. Given a text in IEML, algorithms reconstitute the internal grammatical and semantic network of the text, translate that network into natural languages and calculate the semantic relationships between that text and the other texts in IEML. 
The metalanguage generates a huge group of symmetric transformations between semantic networks, which can be measured and navigated at will using algorithms. The IEML Grammar demonstrates the calculability of the semantic networks and presents the algorithmic workings of the metalanguage in detail. Used as a system of semantic metadata, IEML opens the way to new methods for analyzing large masses of data. It will be able to support new forms of translinguistic hypertextual communication in social media, and will make it possible for conversation networks to observe and perfect their own collective intelligence. For researchers in the human sciences, IEML will structure an open, universal encyclopedic library of multimedia data that reorganizes itself automatically around subjects and the interests of its users.

A new frontier: Algorithmic Intelligence

Having mapped the path I discovered in La grammaire d’IEML, I will now relate what I saw at the end of my journey, on the other side of the supposedly impassable territory: the new horizons of the mind that algorithmic intelligence illuminates. Because IEML is obviously not an end in itself. It is only the necessary means for the coming great digital civilization to enable the sun of human knowledge to shine more brightly. I am talking here about a future (but not so distant) state of intelligence, a state in which capacities for reflection, creation, communication, collaboration, learning, and analysis and synthesis of data will be infinitely more powerful and better distributed than they are today. With the concept of Algorithmic Intelligence, I have completed the risky work of prediction and cultural creation I undertook with Collective Intelligence twenty years ago. The contemporary algorithmic medium is already characterized by digitization of data, automated data processing in huge industrial computing centres, interactive mobile interfaces broadly distributed among the population and ubiquitous communication. We can make this the medium of a new type of knowledge—a new episteme[19]—by adding a system of semantic metadata based on IEML. The purpose of this paper is precisely to lay the philosophical and historical groundwork for this new type of knowledge.

Philosophical genealogy of algorithmic intelligence

The three ages of reflexive knowledge

Since my project here involves a reflexive collective intelligence, I would like to place the theme of reflexive knowledge in its historical and philosophical context. As a first approximation, reflexive knowledge may be defined as knowledge knowing itself. “All men by nature desire to know,” wrote Aristotle, and this knowledge implies knowledge of the self.[20] Human beings have no doubt been speculating about the forms and sources of their own knowledge since the dawn of consciousness. But the reflexivity of knowledge took a decisive step around the middle of the first millennium BCE,[21] during the period when the Buddha, Confucius, the Hebrew prophets, Socrates and Zoroaster (in alphabetical order) lived. These teachers involved the entire human race in their investigations: they reflected consciousness from a universal perspective. This first great type of systematic research on knowledge, whether philosophical or religious, almost always involved a divine ideal, or at least a certain “relation to Heaven.” Thus we may speak of a theosophical age of reflexive knowledge. I will examine the Aristotelian lineage of this theosophical consciousness, which culminated in the concept of the agent intellect. Starting in the sixteenth century in Europe—and spreading throughout the world with the rise of modernity—there was a second age of reflection on knowledge, which maintained the universal perspective of the previous period but abandoned the reference to Heaven and confined itself to human knowledge, with its recognized limits but also its rational ideal of perfectibility. This was the second age, the scientific age, of reflexive knowledge. Here, the investigation follows two intertwined paths: one path focusing on what makes knowledge possible, the other on what limits it. In both cases, knowledge must define its transcendental subject, that is, it must discover its own determinations. 
There are many signs in 2014 indicating that in the twenty-first century—around the point where half of humanity is connected to the Internet—we will experience a third stage of reflexive knowledge. This “version 3.0” will maintain the two previous versions’ ideals of universality and scientific perfectibility but will be based on the intensive use of technology to augment and reflect systematically our collective intelligence, and therefore our capacities for personal and social learning. This is the coming technological age of reflexive knowledge with its ideal of an algorithmic intelligence. The brief history of these three modalities—theosophical, scientific and technological—of reflexive knowledge can be read as a philosophical genealogy of algorithmic intelligence.

The theosophical age and its agent intellect

A few generations earlier, Socrates might have been a priest in the circle around the Pythia; he had taken the famous maxim “Know thyself” from the Temple of Apollo at Delphi. But in the fifth century BCE in Athens, Socrates extended the Delphic injunction in an unexpected way, introducing dialectical inquiry. He asked his contemporaries: What do you think? Are you consistent? Can you justify what you are saying about courage, justice or love? Could you repeat it seriously in front of a little group of intelligent or curious citizens? He thus opened the door to a new way of knowing one’s own knowledge, a rational expansion of consciousness of self. His main disciple, Plato, followed this path of rigorous questioning of the unthinking categorization of reality, and finally discovered the world of Ideas. Ideas for Plato are intellectual forms that, unlike the phenomena they categorize, do not belong to the world of Becoming. These intelligible forms are the original essences, archetypes beyond reality, which project into phenomenal time and space all those things that seem to us to be truly real because they are tangible, but that are actually only pale copies of the Ideas. We would say today that our experience is mainly determined by our way of categorizing it. Plato taught that humanity can only know itself as an intelligent species by going back to the world of Ideas and coming into contact with what explains and motivates its own knowledge. Aristotle, who was Plato’s student and Alexander the Great’s tutor, created a grand encyclopedic synthesis that would be used as a model for eighteen centuries in a multitude of cultures. In it, he integrates Plato’s discovery of Ideas with the sum of knowledge of his time. He places at the top of his hierarchical cosmos divine thought knowing itself. 
And in his Metaphysics,[22] he defines the divinity as “thought thinking itself.” This supreme self-reflexive thought was for him the “prime mover” that inspires the eternal movement of the cosmos. In De Anima,[23] his book on psychology and the theory of knowledge, he states that, under the effect of an agent intellect separate from the body, the passive intellect of the individual receives intelligible forms, a little like the way the senses receive sensory forms. In thinking these intelligible forms, the passive intellect becomes one with its objects and, in so doing, knows itself. Starting from the enigmatic propositions of Aristotle’s theology and psychology, a whole lineage of Peripatetic and Neo-Platonic philosophers—first “pagans,” then Muslims, Jews and Christians—developed the discipline of noetics, which speculates on the divine intelligence, its relation to human intelligence and the type of reflexivity characteristic of intelligence in general.[24] According to the masters of noetics, knowledge can be conceptually divided into three aspects that, in reality, are indissociable and complementary:

  • the intellect, or the knowing subject
  • the intelligence, or the operation of the subject
  • the intelligible, or what is known, or can be known, by the subject by virtue of its operation

From a theosophical perspective, everything that happens takes place in the unity of a self-reflexive divine thought, or (in the Indian tradition) in the consciousness of an omniscient Brahman or Buddha, open to infinity. In the Aristotelian tradition, Avicenna, Maimonides and Albert the Great considered that the identity of the intellect, the intelligence and the intelligible was achieved eternally in God, in the perfect reflexivity of thought thinking itself. In contrast, it was clear to our medieval theosophists that in the case of human beings, the three aspects of knowledge were neither complete nor identical. Indeed, since the passive intellect knows itself only through the intermediary of its objects, and these objects are constantly disappearing and being replaced by others, the reflexive knowledge of a finite human being can only be partial and transitory. Ultimately, human knowledge could know itself only if it simultaneously knew, completely and enduringly, all its objects. But that, obviously, is reserved only for the divinity. I should add that the “one beyond the one” of the neo-Platonist Plotinus and the transcendent deity of the Abrahamic traditions are beyond the reach of the human mind. That is why our theosophists imagined a series of mediations between transcendence and finitude. In the middle of that series, a metaphysical interface provides communication between the unimaginable and inaccessible deity and mortal humanity dispersed in time and space, whose living members can never know—or know themselves—other than partially. At this interface, we find the agent intellect, which is separate from matter in Aristotle’s psychology. The agent intellect is not limited—in the realm of time—to sending the intelligible categories that inform the human passive intellect; it also determines—in the realm of eternity—the maximum limit of what the human race can receive of the universal and perfectly reflexive knowledge of the divine. 
That is why, according to the medieval theosophists, the best a mortal intelligence can do to approach complete reflexive knowledge is to contemplate the operation in itself of the agent intellect that emanates from above and go back to the source through it. In accordance with this regulating ideal of reflexive knowledge, living humanity is structured hierarchically, because human beings are more or less turned toward the illumination of the agent intellect. At the top, prophets and theosophists receive a bright light from the agent intellect, while at the bottom, human beings turned toward coarse material appetites receive almost nothing. The influx of intellectual forms is gradually obscured as we go down the scale of degree of openness to the world above.

The scientific age and its transcendental subject

With the European Renaissance, the use of the printing press, the construction of new observation instruments, and the development of mathematics and experimental science heralded a new era. Reflection on knowledge took a critical turn with Descartes’s introduction of radical doubt and the scientific method, in accordance with the needs of educated Europe in the seventeenth century. God was still present in the Cartesian system, but He was only there, ultimately, to guarantee the validity of the efforts of human scientific thought: “God is not a deceiver.”[25] The fact remains that Cartesian philosophy rests on the self-reflexive edge, which has now moved from the divinity to the mortal human: “I think, therefore I am.”[26] In the second half of the seventeenth century, Spinoza and Leibniz received the critical scientific rationalism developed by Descartes, but they were dissatisfied with his dualism of thought (mind) and extension (matter). They therefore attempted, each in his own way, to constitute reflexive knowledge within the framework of coherent monism. For Spinoza, nature (identified with God) is a unique and infinite substance of which thought and extension are two necessary attributes among an infinity of attributes. This strict ontological monism is counterbalanced by a pluralism of expression, because the unique substance possesses an infinity of attributes, and each attribute, an infinity of modes. The summit of human freedom according to Spinoza is the intellectual love of God, that is, the most direct and intuitive possible knowledge of the necessity that moves the nature to which we belong. For Leibniz, the world is made up of monads, metaphysical entities that are closed but are capable of an inner perception in which the whole is reflected from their singular perspective. 
The consistency of this radical pluralism is ensured by the unique, infinite divine intelligence that has considered all possible worlds in order to create the best one, which corresponds to the most complex—or the richest—of the reciprocal reflections of the monads. As for human knowledge—which is necessarily finite—its perfection coincides with the clearest possible reflection of a totality that includes it but whose unity is thought only by the divine intelligence. After Leibniz and Spinoza, the eighteenth century saw the growth of scientific research, critical thought and the educational practices of the Enlightenment, in particular in France and the British Isles. The philosophy of the Enlightenment culminated with Kant, for whom the development of knowledge was now contained within the limits of human reason, without reference to the divinity, even to envelop or guarantee its reasoning. But the ideal of reflexivity and universality remained. The issue now was to acquire a “scientific” knowledge of human intelligence, which could not be done without the representation of knowledge to itself, without a model that would describe intelligence in terms of what is universal about it. This is the purpose of Kantian transcendental philosophy. Here, human intelligence, armed with its reason alone, now faces only the phenomenal world. Human intelligence and the phenomenal world presuppose each other. Intelligence is programmed to know sensory phenomena that are necessarily immersed in space and time. As for phenomena, their main dimensions (space, time, causality, etc.) correspond to ways of perceiving and understanding that are specific to human intelligence. These are forms of the transcendental subject and not intrinsic characteristics of reality. 
Since we are confined within our cognitive possibilities, it is impossible to know what things are “in themselves.” For Kant, the summit of reflexive human knowledge is in a critical awareness of the extension and the limits of our possibility of knowing. Descartes, Spinoza, Leibniz, the English and French Enlightenment, and Kant accomplished a great deal in two centuries, and paved the way for the modern philosophy of the nineteenth and twentieth centuries. A new form of reflexive knowledge grew, spread, and fragmented into the human sciences, which mushroomed with the end of the monopoly of theosophy. As this dispersion occurred, great philosophers attempted to grasp reflexive knowledge in its unity. The reflexive knowledge of the scientific era neither suppressed nor abolished reflexive knowledge of the theosophical type, but it opened up a new domain of legitimacy of knowledge, freed of the ideal of divine knowledge. This de jure separation did not prevent de facto unions, since there was no lack of religious scholars or scholarly believers. Modern scientists could be believers or non-believers. Their position in relation to the divinity was only a matter of motivation. Believers loved science because it revealed the glory of the divinity, and non-believers loved it because it explained the world without God. But neither of them used as arguments what now belonged only to their private convictions. In the human sciences, there were systematic explorations of the determinations of human existence. And since we are thinking beings, the determinations of our existence are also those of our thought. How do the technical, historical, economic, social and political conditions in which we live form, deform and set limits on our knowledge? What are the structures of our biology, our language, our symbolic systems, our communicative interactions, our psychology and our processes of subjectivation? 
Modern thought, with its scientific and critical ideal, constantly searches for the conditions and limits imposed on it, particularly those that are as yet unknown to it, that remain in the shadows of its consciousness. It seeks to discover what determines it “behind its back.” While the transcendental subject described by Kant in his Critique of Pure Reason fixed the image a great mind had of it in the late eighteenth century, modern philosophy explores a transcendental subject that is in the process of becoming, continually being re-examined and more precisely defined by the human sciences, a subject immersed in the vagaries of cultures and history, emerging from its unconscious determinations and the techno-symbolic mechanisms that drive it. I will now broadly outline the figure of the transcendental subject of the scientific era, a figure that re-examines and at the same time transforms the three complementary aspects of the agent intellect.

  • The Aristotelian intellect becomes living intelligence. This involves the effective cognitive activities of subjects, what is experienced spontaneously in time by living, mortal human beings.
  • The intelligence becomes scientific investigation. I use this term to designate all undertakings by which the living intelligence becomes scientifically intelligible, including the technical and symbolic tools, the methods and the disciplines used in those undertakings.
  • The intelligible becomes the intelligible intelligence, which is the image of the living intelligence that is produced through scientific and critical investigation.

An evolving transcendental subject emerges from this reflexive cycle in which the living intelligence contemplates its own image in the form of a scientifically intelligible intelligence. Scientific investigation here is the internal mirror of the transcendental subjectivity, the mediation through which the living intelligence observes itself. It is obviously impossible to confuse the living intelligence and its scientifically intelligible image, any more than one can confuse the map and the territory, or the experience and its description. Nor can one confuse the mirror (scientific investigation) with the being reflected in it (the living intelligence), nor with the image that appears in the mirror (the intelligible intelligence). These three aspects together form a dynamic unit that would collapse if one of them were eliminated. While the living intelligence would continue to exist without a mirror or scientific image, it would be very much diminished. It would have lost its capacity to reflect from a universal perspective. The creative paradox of the intellectual reflexivity of the scientific age may be formulated as follows. It is clear, first of all, that the living intelligence is truly transformed by scientific investigation, since the living intelligence that knows its image through a certain scientific investigation is not the same (does not have the same experience) as the one that does not know it, or that knows another image, the result of another scientific investigation. But it is just as clear, by definition, that the living intelligence reflects itself in the intelligible image presented to it through scientific knowledge. In other words, the living intelligence is equally dependent on the scientific and critical investigation that produces the intelligible image in which it is reflected. 
When we observe our physical appearance in a mirror, the image in the mirror in no way changes our physical appearance, only the mental representation we have of it. However, the living intelligence cannot discover its intelligible image without including the reflexive process itself in its experience, and without at the same time being changed. In short, a critical science that explores the limits and determinations of the knowing subject does not only reflect knowledge—it increases it. Thus the modern transcendental subject is—by its very nature—evolutionary, participating in a dynamic of growth. In line with this evolutionary view of the scientific age, which contrasts with the fixity of the previous age, the collectivity that possesses reflexive knowledge is no longer a theosophical hierarchy oriented toward the agent intellect but a republic of letters oriented toward the augmentation of human knowledge, a scientific community that is expanding demographically and is organized into academies, learned societies and universities. While the agent intellect looked out over a cosmos emanating from eternity, in analog resonance with the human microcosm, the transcendental subject explores a universe infinitely open to scientific investigation, technical mastery and political liberation.

The technological age and its algorithmic intelligence

Reflexive knowledge has, in fact, always been informed by some technology, since it cannot be exercised without symbolic tools and thus the media that support those tools. But the next age of reflexive knowledge can properly be called technological because the technical augmentation of cognition is explicitly at the centre of its project. Technology now enters the loop of reflexive consciousness as the agent of the acceleration of its own augmentation. This last point was no doubt glimpsed by a few pre-twentieth-century philosophers, such as Condorcet in the eighteenth century, in his posthumous book of 1795, Sketch for a Historical Picture of the Progress of the Human Mind. But the truly technological dimension of reflexive knowledge began to be fully thought through only in the twentieth century, with Pierre Teilhard de Chardin, Norbert Wiener and Marshall McLuhan, to whom we should also add the modest genius Douglas Engelbart.

The regulating ideal of the reflexive knowledge of the theosophical age was the agent intellect, and that of the scientific-critical age was the transcendental subject. In continuity with the two preceding periods, the reflexive knowledge of the technological age will be organized around the ideal of algorithmic intelligence, which inherits from the agent intellect its universality or, in other words, its capacity to unify humanity’s reflexive knowledge. It also inherits its power to be reflected in finite intelligences. But, in contrast with the agent intellect, instead of descending from eternity, it emerges from the multitude of human actions immersed in space and time. Like the transcendental subject, algorithmic intelligence is rational, critical, scientific, purely human, evolutionary and always in a state of learning. But the vocation of the transcendental subject was to reflexively contain the human universe. However, the human universe no longer has a recognizable face.
The “death of man” announced by Foucault[27] should be understood in the sense of the loss of figurability of the transcendental subject. The labyrinth of philosophies, methodologies, theories and data from the human sciences has become inextricably complicated. The transcendental subject has not only been dissolved in symbolic structures or anonymous complex systems, it is also fragmented in the broken mirror of the disciplines of the human sciences.

It is obvious that the technical medium of a new figure of reflexive knowledge will be the Internet, and more generally, computer science and ubiquitous communication. But how can symbol-manipulating automata be used on a large scale not only to reunify our reflexive knowledge but also to increase the clarity, precision and breadth of the teeming diversity enveloped by our knowledge? The missing link is not only technical, but also scientific. We need a science that grasps the new possibilities offered by technology in order to give collective intelligence the means to reflect itself, thus inaugurating a new form of subjectivity.

As the groundwork of this new science, which I call computational semantics, IEML makes use of the self-reflexive capacity of language without excluding any of its functions, whether they be narrative, logical, pragmatic or other. Computational semantics produces a scientific image of collective intelligence: a calculated intelligence that will be able to be explored both as a simulated world and as a distributed augmented reality in physical space. Scientific change will generate a phenomenological change,[28] since ubiquitous multimedia interaction with a holographic image of collective intelligence will reorganize the human sensorium.

The last, but not least, change is social. The community that possessed the previous figure of reflexive knowledge was a scientific community that was still distinct from society as a whole.
But in the new figure of knowledge, reflexive collective intelligence emerges from any human group. Like the previous figures—theosophical and scientific—of reflexive knowledge, algorithmic intelligence is organized in three interdependent aspects.

  • Reflexive collective intelligence represents the living intelligence, the intellect or soul of the great future digital civilization. It may be glimpsed by deciphering the signs of its approach in contemporary reality.
  • Computational semantics holds up a technical and scientific mirror to collective intelligence, which is reflected in it. Its purpose is to augment and reflect the living intelligence of the coming civilization.
  • Calculated intelligence, finally, is none other than the scientifically knowable image of the living intelligence of digital civilization. Computational semantics constructs, maintains and cultivates this image, which is that of an ecosystem of ideas coming out of the human activity in the algorithmic medium and can be explored in sensory-motor mode.

In short, in the emergent unity of algorithmic intelligence, computational semantics calculates the cognitive simulation that augments and reflects the collective intelligence of the coming civilization.

[1] Professor at the University of Ottawa.

[2] And twenty-three years after L’idéographie dynamique (Paris: La Découverte, 1991).

[3] And before the WWW itself, which would become a public phenomenon only in 1994 with the development of the first browsers such as Mosaic. At the time when the book was being written, the Web still existed only in the mind of Tim Berners-Lee.

[4] Approximately 40% in 2014 and probably more than half in 2025.

[5] I obviously do not claim to be the only “visionary” on the subject in the early 1990s. The pioneering work of Douglas Engelbart and Ted Nelson and the predictions of Howard Rheingold, Joël de Rosnay and many others should be cited.

[6] See The basics of IEML (on line at: http://wp.me/P3bDiO-9V )

[7] Beyond logic and statistics.

[8] IEML is the acronym for Information Economy MetaLanguage. See La grammaire d’IEML (On line http://wp.me/P3bDiO-9V )

[9] The Semantic Sphere 1: Computation, Cognition and Information Economy (London: ISTE, 2011; New York: Wiley, 2011).

[10] More than four hundred reference books.

[11] Umberto Eco, The Search for the Perfect Language (Oxford: Blackwell, 1995).

[12] “But more madness than genius would be required for such an enterprise”: Claude Lévi-Strauss, The Savage Mind (University of Chicago Press, 1966), p. 130.

[13] Which is obviously true, but which only defines the problem rather than forbidding the solution.

[14] But true universalism is all-inclusive, and our daily lives are structured according to a multitude of universal standards, from space-time coordinates to HTTP on the Web. I responded at length in The Semantic Sphere to the prejudices of extremist post-modernism against scientific universality.

[15] Which is still used by a large community. But the only thing that Esperanto and IEML have in common is the fact that they are artificial languages. They have neither the same form nor the same purpose, nor the same use, which invalidates criticisms of IEML based on the criticism of Esperanto.

[16] See IEML Grammar (On line http://wp.me/P3bDiO-9V ).

[17] But, fortunately, supported by the Canada Research Chairs program and by my wife, Darcia Labrosse.

[18] Michel Serres, Hermès V. Le passage du Nord-Ouest (Paris: Minuit, 1980).

[19] The concept of episteme, which is broader than the concept of paradigm, was developed in particular by Michel Foucault in The Order of Things (New York: Pantheon, 1970) and The Archaeology of Knowledge and the Discourse on Language (New York: Pantheon, 1972).

[20] At the beginning of Book A of his Metaphysics.

[21] This is the Axial Age identified by Karl Jaspers.

[22] Book Lambda, 9.

[23] In particular in Book III.

[24] See, for example, Moses Maimonides, The Guide For the Perplexed, translated into English by Michael Friedländer (New York: Cosimo Classic, 2007) (original in Arabic from the twelfth century). – Averroes (Ibn Rushd), Long Commentary on the De Anima of Aristotle, translated with introduction and notes by Richard C. Taylor (New Haven: Yale University Press, 2009) (original in Arabic from the twelfth century). – Saint Thomas Aquinas, On the Unity of the Intellect Against the Averroists (original in Latin from the thirteenth century). – Herbert A. Davidson, Alfarabi, Avicenna, and Averroes, on Intellect. Their Cosmologies, Theories of the Active Intellect, and Theories of Human Intellect (New York, Oxford: Oxford University Press, 1992). – Henri Corbin, History of Islamic Philosophy, translated by Liadain and Philip Sherrard (London: Kegan Paul, 1993). – Henri Corbin, En Islam iranien: aspects spirituels et philosophiques, 2d ed. (Paris: Gallimard, 1978), 4 vol. – Alain de Libera, Métaphysique et noétique: Albert le Grand (Paris: Vrin, 2005).

[25] In Meditations on First Philosophy, “First Meditation.”

[26] Discourse on the Method, “Part IV.”

[27] At the end of The Order of Things (New York: Pantheon Books, 1970).

[28] See, for example, Stéphane Vial, L’être et l’écran (Paris: PUF, 2013).