IEML (the Information Economy Meta Language) has four main directions of research and development in 2019: in mathematics, data science, linguistics and software development. This blog entry reviews them successively.

1 A mathematical research program

I give here a philosophical description of the structure of IEML. The purpose of the mathematical research to come is to give a formal description and to draw from this formalization as much useful information as possible on the calculation of relationships, distances, proximities, similarities, analogies, classes and so on, as well as on the complexity of these calculations. I produced a formalization document in 2015 with the help of Andrew Roczniak, PhD, but that document has since (2019) been overtaken by the evolution of the IEML language. The Brazilian physicist Wilson Simeoni Junior has volunteered to lead this research sub-program.

IEML Topos

The “topos” is a structure that was identified by the great mathematician Alexander Grothendieck, who “is considered as the re-founder of algebraic geometry and, as such, as one of the greatest mathematicians of the 20th century” (see Wikipedia).

Without going into technical details, a topos is a bi-directional relationship between, on the one hand, an algebraic structure, usually a “category” (intuitively a group of transformations of transformation groups) and, on the other hand, a spatial structure, which is geometric or topological. 

In IEML, thanks to a normalization of the notation, each expression of the language corresponds to one and only one algebraic variable. Symmetrically, each algebraic variable corresponds to one and only one linguistic expression.

Topologically, each variable in IEML algebra (i.e. each expression of the language) corresponds to a “point”. But these points are arranged in different nested recursive complexity scales: primitive variables, morphemes of different layers, characters, words, sentences, super-sentences and texts. However, from the level of the morpheme upwards, the internal structure of each point, which comes from the function(s) that generated the point, automatically determines all the semantic relationships that this point has with the other points, and these relationships are modelled as connections. There are obviously a large number of connection types, some very general (is contained in, has an intersection with, has an analogy with…), others more specific (is an instrument of, contradicts X, is logically compatible with, etc.).

The topos that matches all the expressions of the IEML language with all the semantic relationships between its expressions is called “The Semantic Sphere”.

Algebraic structure of IEML

In the case of IEML, the algebraic structure is reduced to:

  • 1. Six primitive variables 
  • 2. A non-commutative multiplication with three variables (substance, attribute and mode). The IEML multiplication is isomorphic to the triplet “departure vertex, arrival vertex, edge” that is used to describe graphs.
  • 3. A commutative addition that creates a set of objects.
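
As a purely illustrative sketch (not the IEML reference implementation), the two operations can be mimicked in Python: the multiplication as an ordered triple read as a labelled graph edge, and the addition as an unordered set. The names `Product`, `add` and `as_edge`, as well as the placeholder labels given to the six primitives, are assumptions of this example.

```python
from typing import NamedTuple

# Placeholder labels for the six primitive variables (an assumption of this sketch).
PRIMITIVES = frozenset({"virtual", "actual", "sign", "being", "thing", "empty"})


class Product(NamedTuple):
    """Non-commutative multiplication: the order of the three roles matters."""
    substance: object
    attribute: object
    mode: object

    def as_edge(self):
        # The same triple read as a graph edge:
        # (departure vertex, arrival vertex, edge label).
        return (self.substance, self.attribute, self.mode)


def add(*terms):
    """Commutative addition: the result is an unordered set of terms."""
    return frozenset(terms)


p = Product("sign", "being", "thing")
q = Product("being", "sign", "thing")  # same operands, different roles
```

Swapping substance and attribute gives a different product (`p != q`), whereas `add(p, q)` and `add(q, p)` denote the same set.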

This algebraic structure is used to construct the following functions and levels of variables…

1. Functions using primitive variables, called “morpheme paradigms”, have as inputs morphemes at layer n and as outputs morphemes at layer n+1. Morpheme paradigms include additions, multiplications, constants and variables and are visually presented in the form of tables in which rows and columns correspond to certain constants.

2. “Character paradigms” are complex additive functions that take morphemes as inputs and characters as outputs. Character paradigms include a group of constant morphemes and several groups of variables. A character is composed of 1 to 5 morphemes arranged in IEML alphabetical order. (Characters may not include more than five morphemes, for reasons of cognitive manageability.)

3. IEML characters are assembled into words (a substance character, an attribute character, a mode character) by means of a multiplicative function called a “word paradigm”. A word paradigm intersects a series of characters in substance and a series of characters in attribute. The modes are chosen from predefined auxiliary character paradigms, depending on whether the word is a noun, a verb or an auxiliary. Words express subjects, keywords or hashtags. A word can be composed of only one character.

4. Sentence-building functions assemble words by means of multiplication and addition, with the necessary constraints to obtain grammatical trees. Mode words describe the grammatical/semantic relationships between substance words (roots) and attribute words (leaves). Sentences express facts, propositions or events; they can take on different pragmatic and logical values.

5. Super-sentences are generated by means of multiplication and addition of sentences, with constraints to obtain grammatical trees. Mode sentences express relationships between substance sentences and attribute sentences. Super-sentences express hypotheses, theories or narratives.

6. A USL (Uniform Semantic Locator) or IEML text is an addition (a set) of words, sentences and super-sentences. 

Topological structure of IEML: a semantic rhizome


The notion of rhizome (a term borrowed from botany) was developed on a philosophical level by Deleuze and Guattari in the preface to Mille Plateaux (Minuit 1980). In this Deleuzo-Guattarian lineage, by rhizome I mean here a complex graph whose points or “vertices” are organized into several levels of complexity (see the algebraic structure) and whose connections intersect several regular structures such as series, tree, matrix and clique. In particular, it should be noted that some structures of the IEML rhizome combine hierarchical or genealogical relationships (in trees) with transversal or horizontal relationships between “leaves” at the same level, which therefore do not respect the “hierarchical ladder”.


We can distinguish the abstract, or virtual, rhizomatic grid drawn by the grammar of the language (the sphere to be dug) and the actualisation of points and relationships by the users of the language (the dug sphere of chambers and galleries).  Characters, words, sentences, etc. are all chambers in the centre of a star of paths, and the generating functions establish galleries of “rhizomatic” relationships between them, as many paths for exploring the chambers and their contents. It is therefore the users, by creating their lexicons and using them to index their data, communicate and present themselves, who shape and grow the rhizome…

Depending on whether circuits are more or less used, on the quantity of data or on the strength of interactions, the rhizome undergoes – in addition to its topological transformations – various types of quantitative or metric transformations. 

* The point to remember is that IEML is a language with calculable semantics because it is also an algebra (in the broad sense) and a complex topological space. 

* In the long term, IEML will be able to serve as a semantic coordinate system for the information world at large.

2 A research program in data science

The person in charge of the data science research sub-program is the software engineer (Eng. ENSIMAG, France) Louis van Beurden, who also holds a master’s degree in data science and machine translation from the University of Montréal, Canada. Louis is planning to complete a PhD in computer science in order to test the hypothesis that, from a data science perspective, a semantic metadata system in IEML is more efficient than a semantic metadata system in natural language and phonetic writing. This doctoral research will make it possible to implement phases A and B of the program below and to carry out our first experiment.

Background information

The basic cycle in data science can be schematized according to the following loop:

  • 1. selection of raw data,
  • 2. pre-processing, i.e. cleaning data and metadata imposition (cataloguing and categorization) to facilitate the exploitation of the results by human users,
  • 3. statistical processing,
  • 4. visual and interactive presentation of results,
  • 5. exploitation of the results by human users (interpretation, storytelling) and feedback on steps 1, 2, 3

Biases or poor quality of results may have several causes, but they often come from poor pre-processing. According to the old computer adage “garbage in, garbage out”, it is the professional responsibility of data scientists to ensure the quality of the input data and therefore not to neglect the pre-processing phase, where this data is organized using metadata.

Two types of metadata can be distinguished: 1) semantic metadata, which describes the content of documents or datasets, and 2) ordinary metadata, which describes authors, creation dates, file types, etc. Let us call “semantic pre-processing” the imposition of semantic metadata on data.


Since IEML is a univocal language and the semantic relationships between morphemes, words, sentences, etc. are mathematically computable, we assume that a semantic metadata system in IEML is more efficient than a semantic metadata system in natural language and phonetic writing. Of course, the efficiency in question is related to a particular task: search, data analysis, knowledge extraction from data, machine learning, etc.

In other words, compared to a “tokenization” of semantic metadata in phonetic writing that notates a natural language, a “tokenization” of semantic metadata in IEML would ensure better processing, better presentation of results to the user and better exploitation of results. In addition, semantic metadata in IEML would allow datasets that use different languages, classification systems or ontologies to be de-compartmentalized, merged and compared.

Design of the first experiment

The ideal way to do an experiment is to consider a multi-variable system and transform only one of the system variables, all other things being equal. In our case, it is only the semantic metadata system that must vary. This will make it easy to compare the system’s performance with one (phonetic tokens) or the other (semantic tokens) of the semantic metadata systems.

  • – The dataset of our first experiment encompasses all the articles of the Sens Public scientific journal.
  • – Our ordinary metadata are the author, publication date, etc.
  • – Our semantic metadata describe the content of articles.
  •     – In phonetic tokens, using RAMEAU categories, keywords and summaries,
  •     – In IEML tokens by translating phonetic tokens.
  • – Our processes are “big data” algorithms traditionally used in natural language processing 
  •     – An algorithm for calculating the co-occurrences of keywords.
  •     – A TF-IDF (Term Frequency / Inverse Document Frequency) algorithm that works from a word / document matrix.
  •     – A clustering algorithm based on “word embeddings” of keywords in articles (documents are represented by vectors, in a space with as many dimensions as words).
  • – A user interface will offer a certain way to access the database. This interface will obviously be adapted to the user’s task (which remains to be chosen, but could be of the “data analytics” type).
  • Result 1 corresponds to the execution of the “machine task”, i.e. the establishment of a connection network on the articles (relationships, proximities, groupings, etc.). We will have to compare:
  •     – result 1.1 based on the use of phonetic tokens with 
  •     – result 1.2 based on the use of IEML tokens.
  • Result 2 corresponds to the execution of the selected user task (data analytics, navigation, search, etc.). We will have to compare:
  •     – result 2.1, based on the use of phonetic tokens, with 
  •     – result 2.2, based on the use of IEML tokens.
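
A minimal sketch of the first two processes listed above (keyword co-occurrences and TF-IDF), applied to articles described by lists of metadata tokens. The toy corpus, the token strings and the function names are invented for illustration. The same code would run unchanged on phonetic tokens or on IEML tokens, which is what allows only the semantic metadata system to vary.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrences(docs):
    """Count how often each unordered pair of tokens appears in the same document."""
    counts = Counter()
    for tokens in docs.values():
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return counts


def tf_idf(docs):
    """Return {doc_id: {token: weight}} with tf = relative frequency, idf = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each token occurs.
    df = Counter(tok for tokens in docs.values() for tok in set(tokens))
    weights = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        weights[doc_id] = {
            tok: (count / len(tokens)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
    return weights


# Hypothetical toy corpus: article id -> metadata tokens.
corpus = {
    "a1": ["memory", "writing", "culture"],
    "a2": ["memory", "algorithm"],
    "a3": ["writing", "culture", "politics"],
}
pairs = cooccurrences(corpus)
weights = tf_idf(corpus)
```

In this toy corpus the pair ("culture", "writing") co-occurs in two articles, and in article "a2" the rarer token "algorithm" receives a higher TF-IDF weight than "memory".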

Step A: First indexing of a database in IEML

Reminder: the data are the articles of the scientific journal, and the semantic metadata are the categories, keywords and summaries of the articles. From the categories, keywords and article summaries, a glossary of the knowledge area covered by the journal is created, or of a sub-domain if it turns out that the task is too difficult. It should be noted that in 2019 we do not yet have the software tools to create the IEML sentences and super-sentences that allow us to express facts, propositions, theories, narratives, hypotheses, etc. Sentences and super-sentences, perhaps accessible in a year or two, will therefore have to wait for a later phase of the research.

The creation of the glossary will be the work of a project community, linked to the editors of the Sens Public journal and the Canada Research Chair in Digital Textualities (led by Prof. Marcello Vitali-Rosati) at the Université de Montréal (Digital Humanities). Pierre Lévy will accompany this community and help it to identify the constants and variables of its lexicon. One of the auxiliary goals of the research is to verify whether motivated communities can appropriate IEML to categorize their data. Once we are satisfied with the IEML indexing of the article database, we will proceed to the next step.

Step B: First experimental test

  • 1. The test is designed to measure the difference between results based on phonetic tokens and results based on IEML tokens.
  • 2. All data processing operations are carried out on the data.
  • 3. The results (machine tasks and user tasks) are compared across both types of tokens.

The experiment can, if necessary, be repeated iteratively with minor modifications until satisfactory results are achieved.

If the hypothesis is confirmed, we proceed to the next step.

Step C: Towards an automation of semantic pre-processing in IEML.

If the superior efficiency of IEML tokens for semantic metadata is demonstrated, then there will be a strong interest in maximizing the automation of IEML semantic pre-processing.

The algorithms used in our experiment are themselves powerful tools for data pre-processing. They can be used, according to methods to be developed, to partially automate semantic indexing in IEML. The “word embeddings” will make it possible to study how IEML words are correlated with the natural language lexical statistics of the articles and to detect anomalies. For example, we will check whether similar USLs (a USL is an IEML text) point to very different texts, or whether similar texts have very different USLs.
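
The anomaly check mentioned above could be sketched as follows. This is an assumption-laden illustration, not the project's method: a USL is modelled as a plain set of placeholder terms compared with Jaccard similarity, and a bag-of-words cosine similarity stands in for the word embeddings. The data, the function names and the `gap` threshold are all invented.

```python
import math
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Similarity of two USLs, each modelled here as a set of terms."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def cosine(text_a, text_b):
    """Cosine similarity of two texts represented as bag-of-words counts."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def anomalies(items, gap=0.5):
    """Yield pairs whose USL similarity and text similarity differ by more than `gap`."""
    for (id_a, (usl_a, txt_a)), (id_b, (usl_b, txt_b)) in combinations(items.items(), 2):
        s_usl, s_txt = jaccard(usl_a, usl_b), cosine(txt_a, txt_b)
        if abs(s_usl - s_txt) > gap:
            yield id_a, id_b, round(s_usl, 2), round(s_txt, 2)


# Hypothetical data: id -> (USL as a set of placeholder terms, article text).
items = {
    "a1": ({"t1", "t2"}, "memory and writing in digital culture"),
    "a2": ({"t1", "t2"}, "tax law for small companies"),
    "a3": ({"t3"}, "writing and memory in digital culture"),
}
flagged = list(anomalies(items))
```

Here the pair a1/a2 is flagged because identical USLs point to unrelated texts, and the pair a1/a3 is flagged because near-identical texts carry disjoint USLs. Both cases would call for a human review of the indexing.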

Finally, methods will be developed to use deep learning algorithms to automatically index datasets in IEML.

Step D: Research and development perspective in Semantic Machine Learning

If step C provides the expected results, i.e. methods using AI to automate the indexing of data in IEML, then big data indexed in IEML will be available. As progress is made, semantic metadata may become increasingly similar to textual data (summary of sections, paragraphs, sentences, etc.) until translation into IEML is achieved, which remains a distant objective.

The data indexed in IEML could then be used to train artificial intelligence algorithms. The hypothesis that machines learn more easily when data is categorized in IEML could easily be validated by experiments of the same type as described above, by comparing the results obtained from training data indexed in IEML and the results obtained from the same data indexed in natural languages.

This last step paves the way for a better integration of statistical AI and symbolic AI (based on facts and rules, which can be expressed in IEML).

3 A research program in linguistics, humanities and social sciences


The semiotic and linguistic development program has two interdependent components:

1. The development of the IEML metalanguage

2. The development of translation systems and bridges between IEML and other sign systems, in particular… 

  •     – natural languages,
  •     – logical formalisms,
  •     – pragmatic “language games” and games in general,
  •     – iconic languages,
  •     – artistic languages, etc.

This research and development agenda, particularly in its linguistic dimension, is important for the digital humanities. Indeed, IEML can serve as a system of semantic coordinates of the cultural universe, thus allowing the humanities to cross a threshold of scientific maturity that would bring their epistemological status closer to that of the natural sciences. Using IEML to index data and to formulate hypotheses would result in:

  • (1) a de-siloing of the databases used by researchers in the social sciences and humanities, which would allow for the sharing and comparison of categorization systems and interpretive hypotheses;
  • (2) an improved analysis of data;
  • (3) ultimately, as set out in the article “The Role of the Digital Humanities in the New Political Space” (in French), a reflective collective intelligence of the social sciences and humanities research community.

But IEML’s research program in the perspective of the digital humanities – as well as its research program in data science – requires a living and dynamic semiotic and linguistic development program, some aspects of which I will outline here.

IEML and the Meaning-Text Theory

IEML’s linguistic research program is very much based on the Meaning-Text theory developed by Igor Melchuk and his school. “The main principle of this theory is to develop formal and descriptive representations of natural languages that can serve as a reliable and convenient basis for the construction of Meaning-Text models, descriptions that can be adapted to all languages, and therefore universal.” (Excerpt translated from the Wikipedia article on Igor Melchuk.) Dictionaries developed by linguists in this field connect words according to universal “lexical functions” identified through the analysis of many languages. These lexical functions have been formally transposed into the very structure of IEML (see the IEML Glossary Creation Guide) so that the IEML dictionary can be organized by the same tools (e.g. Spiderlex) as those of the Meaning-Text Theory research network. Conversely, IEML could be used as a pivot language, or concept description language, *between* the natural language dictionaries developed by the network of researchers skilled in Meaning-Text theory.

Construction of specialized lexicons in the humanities and social sciences

A significant part of the IEML lexicon will be produced by communities having decided to use IEML to mark out their particular areas of knowledge, competence or interaction. Our research in specialized lexicon construction aims to develop the best methods to help expert communities produce IEML lexicons. One of the approaches consists in identifying the “conceptual skeleton” of a domain, namely its main constants in terms of character paradigms and word paradigms. 

The first experiment in this type of collaborative construction of specialized lexicons by experts will be conducted by Pierre Lévy in collaboration with the editorial team of the Sens Public scientific journal and the Canada Research Chair in Digital Textualities at the University of Montréal (led by Prof. Marcello Vitali-Rosati). Depending on their economic and social importance, other specialized glossaries can be constructed, for example on the themes of professional skills, e-learning resources, public health prevention, etc.

Ultimately, the “digital humanities” branch of IEML will need to collaboratively develop a conceptual lexicon of the humanities to be used for the indexing of books and articles, but also of chapters, sections and comments in documents. The same glossary should also facilitate data navigation and analysis. There is a whole program of development in digital library science here. I would particularly like to focus on the human sciences because the natural sciences have already developed a formal vocabulary on which there is a consensus.

Construction of logical, pragmatic and narrative character-tools

Once we have a sentence and super-sentence editor, we plan to establish a correspondence between IEML, on the one hand, and propositional calculus and first-order logic, on the other. This will be done by specifying special character-tools to implement logical functions. Particular attention will be paid to formalizing the definition of rules and the declaration that “facts” are true in IEML. It should be noted in passing that, in IEML, grammatical expressions represent classes, sets or categories, but that logical individuals (proper names, numbers, etc.) or instances of classes are represented by “literals” expressed in ordinary characters (phonetic alphabets, Chinese characters, Arabic numerals, URLs, etc.).

In anticipation of practical use in communication, games, commerce, law (smart contracts), chatbots, robots, the Internet of Things, etc., we will develop a range of character-tools with illocutionary force such as “I offer”, “I buy”, “I quote”, “I give an instruction”, etc.

Finally, we will make things easier for authors of super-sentences by developing a range of character-tools implementing “narrative functions”.

4 A software development program

A software environment for the development and public use of the IEML language

Logically, the first multi-user IEML application will be dedicated to the development of the language itself. This application is composed of the following three web modules.

  • 1. A morpheme editor that also allows you to navigate in the morphemes database, or “dictionary”.
  • 2. A character and word editor that also allows navigation in the “lexicon”.
  • 3. A navigation and reading tool in the IEML library as a whole, or “IEML database” that brings together the dictionary and lexicon, with translations, synonyms and comments in French and English for the moment.

The IEML database is a “Git” database and is currently hosted on GitHub. A Git database makes it possible to record successive versions of the language, as well as to monitor and model its growth. It also allows large-scale collaboration among teams that can develop specific branches of the lexicon independently and then integrate them into the main branch after discussion, as is done in the collaborative development of large software projects. As soon as a sub-lexicon is integrated into the main branch of the Git database, it becomes a “commons” usable by everyone (under the latest version of the General Public License).

Morpheme and word editors are actually “Git clients” that feed the IEML database. A first version of this collaborative read-write environment should be available in the fall of 2019 and then tested by real users: the editors of the Scientific Journal “Sens Public” as well as other participants in the University of Montréal’s IEML seminar.

The following versions of the IEML read/write environment should allow the editing of sentences and texts as well as literals that are logical individuals not translated into IEML, such as proper names, numbers, URLs, etc.

A social medium for collaborative knowledge management

A large number of applications using IEML can be considered, both commercial and non-commercial. Among all these applications, one of them seems to be particularly aligned with the public interest: a social medium dedicated to collaborative knowledge and skills management. This new “place of knowledge” could allow the online convergence of the missions of… 

  • – museums and libraries, 
  • – schools and universities, 
  • – companies and administrations (with regard to their knowledge creation and management dimension), 
  • – smart cities, employment agencies, civil society networks, NGOs, associations, etc.

According to its general philosophy, such a social medium should…

  • – be supported by an intrinsically distributed platform, 
  • – have the simplicity – or the economy of means – of Twitter,
  • – ensure the sovereignty of users over their data,
  • – promote collaborative processes.

The main functions performed by this social medium would be:

  • – data curation (referencing and categorization of web pages, editing of resource collections),
  • – teaching offers and learning demands,
  • – offers and demands for skills, or employment market.

IEML would serve as a common language for

  • – data categorization, 
  • – description of the knowledge and skills, 
  • – the expression of acts within the social medium (supply, demand, consent, publish, etc.)
  • – addressing users through their knowledge and skills.

Three levels of meaning would thus be formalized in this medium.

  • – (1) The linguistic level in IEML, including lexical and narrative functions, formalizes what is spoken about (lexicon) and what is said (sentences and super-sentences).
  • – (2) The logical, or referential, level adds to the linguistic level…
  •     – logical functions (first-order logic and propositional logic) expressed in IEML using logical character-tools,
  •     – the ability to point to references (literals, document URLs, datasets, etc.),
  •     – the means to express facts and rules in IEML and thus to feed inference engines.
  • – (3) The pragmatic level adds illocutionary functions and users to the linguistic and logical levels.
  •     – Illocutionary functions (thanks to pragmatic character-tools) allow the expression of conventional acts and rules (such as “game” rules). 
  •     – The pragmatic level obviously requires the consideration of players or users, as well as user groups.
  •     – It should be noted that there is no formal difference between logical inference and pragmatic inference but only a difference in use, one aiming at the truth of propositions according to referred states of things, the other calculating the rights, obligations, gains, etc. of users according to their actions and the rules of the games they play.
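
The remark that logical and pragmatic inference differ only in use can be illustrated with a minimal forward-chaining sketch. This is an assumption of this example, not a description of any planned IEML inference engine: facts and rules are plain English strings standing in for IEML expressions, and the function name `forward_chain` is invented. The same procedure derives a true proposition in one case and an obligation in the other.

```python
def forward_chain(facts, rules):
    """Apply rules (premises -> conclusion) until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Logical use: the conclusions are propositions held to be true.
logical = forward_chain(
    facts={"doc42 is about memory"},
    rules=[(["doc42 is about memory"], "doc42 is about cognition")],
)

# Pragmatic use: the conclusions are rights or obligations of users.
pragmatic = forward_chain(
    facts={"alice offers course", "bob accepts offer"},
    rules=[(["alice offers course", "bob accepts offer"], "alice must teach bob")],
)
```

Only the interpretation of the derived statements changes between the two calls; the engine itself is identical.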

The semantic profiles of users and datasets will be arranged according to the three levels that have just been explained. The “place of knowledge” could be enhanced by the use of tokens or crypto-currencies to reward participation in collective intelligence. If successful, this type of medium could be generalized to other areas such as health, democratic governance, trade, etc.

Ramon Lull

Le Livre Blanc d’IEML, le métalangage de l’économie de l’information. 2019.
ABSTRACT. IEML is a language with computable semantics invented by Pierre Lévy. The “White Paper” (unfinished beta version) explains the main principles, the grammar and the first applications of IEML. (about a hundred pages)

Etre et Mémoire, in the journal Sens Public, 2019
ABSTRACT. The first aim of this article is to place the object of the human sciences (culture and symbolic meaning) in continuity with the objects of the natural sciences. I hypothesize that meaning does not appear suddenly with humanity but that different layers of coding and memory (quantum, atomic, genetic, nervous and symbolic) stack up and grow progressively more complex, the symbolic stratum being only the most recent of the “writing machines”. The second aim of the text is to define the specificity and unity of the symbolic layer, and therefore the field of the human sciences. In contrast to a certain logocentric tradition, I show that symbolism, while it obviously includes language, also encompasses semiotic systems (such as cooking or music) in which the signifier/signified divide is not as relevant as it is for languages. The third aim of this essay is to show that the cultural forms and interpretive powers of humanity evolve with its writing machines. The emergence of digital technology, in particular, points to a refinement of the human sciences that extends to the calculation of semantic complexity. This attempt to redefine the human sciences in continuity with the natural sciences presupposes an ontology (or a meta-ontology, in Marcello Vitali-Rosati’s expression) for which the notions of writing and memory are central and which, breaking with the Kantian critique, accepts the full reality of natural spatiality and temporality.

Le rôle des humanités numériques dans le nouvel espace politique (The Role of the Digital Humanities in the New Political Space), in the journal Sens Public, 2019
ABSTRACT. Now that more than 50% of the world’s population is connected to the Internet, the major platforms, and Facebook in particular, have acquired enormous political power. This new situation forces us to rethink the Enlightenment project of emancipation. In this article I propose that researchers in the humanities and social sciences take up this challenge by adopting and disseminating new standards of reflective collective intelligence. The knowledge commons, open science and the sovereignty of individuals over the data they produce are unanimously accepted. But these essential principles are still insufficient. The available computing and communication power, combined with the use of IEML (a language with computable semantics), allows us to envisage making the operations that create knowledge, meaning and authority transparent. I present here the main strategic orientations for achieving these objectives. An epistemological revolution in the human sciences is within reach, and with it a new stage in the evolution of critical thinking. (about fifty pages)

La Pyramide algorithmique, in the journal Sens Public, 2017
ABSTRACT. The algorithmic medium is a communication infrastructure that augments the powers of earlier media by adding the mechanization of symbolic operations. Its emergence in the middle of the twentieth century is the result of a long scientific and technical history, which I summarize at the beginning of the article. I then recall the main stages of its development (mainframes, the Internet and PCs, the social Web, the Cloud augmented by artificial intelligence and blockchain) as well as their sociocognitive consequences. Finally, I discuss the future developments of this medium from the perspective of a reflective collective intelligence based on a new form of semantic computing.

Les opérateurs élémentaires de la réflexion, Cahiers Sens public, 2018/1 (n° 21-22), p. 75-102. The philosophy that inspired the “primitives” of IEML.
ABSTRACT. This article attempts to reduce to a minimum the fundamental concepts needed to think about meaning. Two complementary concepts, virtuality and actuality, account for the dualities of action and for the great metaphysical opposition between transcendence and immanence. The actual has a spatio-temporal address: it is located in sequential time and in three-dimensional physical space, whereas no precise spatio-temporal address can be assigned to the abstraction of the virtual. The semiotic triangle accounts for the triads of representation. The sign (1) indicates (2) a thing, an object or any referent to (3) a being or interpretant. A sign is always a sign “of” something and “for” someone. Finally, one must be able to consider an absence explicitly, including a gap in knowledge, in order to ask questions and to reflect. The six elementary operators of reflection (virtual, actual, sign, being, thing and emptiness) function interdependently and run through all fields of the humanities and social sciences; this article examines in particular their relevance in semiotics, epistemology, cosmology, religion, politics and economics.


Image: Kuo Cheng Liao.

In this short blog entry I would like to answer a few questions put to me by Turkish friends (from the site Çeviri Konusmalar) about artificial intelligence and the autonomy of machines. See here on Twitter…

One of the roles of philosophy is to categorize human experience so as to reduce illusion as much as possible or, if you prefer, to find the concepts that will allow us to understand our situation and better guide our action. This often leads philosophers to contradict common opinion. Today that opinion is propagated by journalism and fiction. Journalists and the authors of novels or TV series alike present robots or artificial intelligence as capable of autonomy and consciousness, whether right now or in the near future. In my opinion this representation is false, but it works very well because it plays…

  • either on the fear of being eliminated or enslaved by machines (sensationalism or dystopian narrative),
  • or on the hope that artificial intelligence will magically help us solve all our problems or, worse, that it represents a species more advanced than humankind (in the case of certain advertisements or naive utopias).

In both cases, hope or fear, the main driver is passion, emotion, and not an accurate understanding of what automatic information processing is and of the role it plays in human intelligence.

To reframe this question of machine autonomy, I would like to answer three questions here as simply as possible:

  1. What is human intelligence?
  2. What is computing, or what are information-processing machines?
  3. Can machines become autonomous?

What is human intelligence?

First, we must recognize that humans are animals and that animals already have capacities for memory, internal representation of situations, problem solving, learning, etc. Animals are sentient beings that feel attraction and repulsion, pleasure and pain, even empathy. The most intelligent among them are able to pass on to their offspring some of the knowledge acquired through experience, to use tools, etc. Next, animal intelligence manifests itself in a particularly striking way on a collective or social level and, for our purposes, notably among primates (the great apes), to which we belong. Primates have social structures with highly differentiated social roles and elaborate collective strategies for defending themselves, feeding themselves, controlling their territory, etc. We share, of course, all this animal intelligence. But we have symbolic manipulation in addition.

What differentiates human intelligence from animal intelligence is first and foremost the use of language and symbolic systems. A symbolic system is a means of communication and thought whose elements, the symbols, have two aspects: a sensible aspect (visible, audible) and an invisible, abstract aspect, the general category. And the relationship between the sensible signifier (the sound) and the intelligible signified (the sense) is conventional, decided by society. There is no reason other than convention and usage for the concept of reason, for example, to be represented by the two syllables of the word “raison” in French, and the proof is that it is said differently in other languages. All animals communicate, but only human beings speak, ask questions, acknowledge their ignorance, engage in dialogue and, above all, tell stories. The use of language gives humans not consciousness (which other animals already have) but reflexive consciousness. The ability to reflect on concepts is given to us by the manipulation of symbols.

With this capacity for symbolic manipulation and this reflexivity come two special characteristics of humanity: technical systems and social institutions, both highly complex and in constant historical evolution.

A huge part of human intelligence is reified in the technical environment and lived in social institutions (rituals, politics, law, religion, morality, etc.). The individual part of our intelligence is marginal but essential: it is what allows us to innovate, to progress and to improve our condition.

What is computing, or automatic information processing?

Artificial intelligence is a “marketing” expression that in fact designates the most advanced, ever-moving zone of information-processing techniques.

When I say that human intelligence has always been artificial, I do not mean that humans are robots or machines; I mean that humans have always used technical processes to augment their intelligence, whether personal or collective. Writing gave us the means to extend our individual memory and our critical capacities. Today the Internet gives us rapid access to a quantity of information that our ancestors could never have imagined. But it is not only a question of memory: we also have capacities for calculation, simulation of complex systems, automatic data analysis and even automatic reasoning that amplify the “purely biological” cognitive capacities of the first Homo sapiens. We have the same brain as prehistoric humans, with the same capacity to manipulate symbols and tell stories, but we also have a huge apparatus for recording, communicating and processing symbols that they did not have and that plugs into the purely biological part of our intelligence.

Computing, the automatic processing of data, with its advanced and moving edge called artificial intelligence, appeared in the second half of the 20th century, but it continues a centuries-old effort of cognitive augmentation that began with writing and continued with the refinement of knowledge-coding systems, positional number notation and zero, the printing press and electric media…

The nerve center of the new apparatus for the automatic processing of symbols is found today in huge data centers called the “cloud”, of which our computers and smartphones are only terminals. But in this network of machines, automatic data processing operates only on the sensible form of symbols, on the signifier reduced to zeros and ones. Computers have no access to the signified, to the meaning.
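As a minimal sketch of this point, the Python snippet below reduces two signifiers to zeros and ones. The helper name `to_bits` and the choice of UTF-8 are assumptions made for illustration; the words “reason” and “raison” are taken as two signifiers of the same concept.

```python
def to_bits(signifier: str) -> str:
    """Return the UTF-8 encoding of a string as groups of eight bits."""
    return " ".join(f"{byte:08b}" for byte in signifier.encode("utf-8"))


# Two signifiers (English and French) for the same signified.
english = to_bits("reason")
french = to_bits("raison")

print(english)
print(french)
# The machine can compare the two bit patterns, but nothing in the
# patterns themselves says that both strings point to the same concept.
print(english == french)
```

The comparison operates on the form alone: the two patterns differ even though a human reader assigns them the same signified.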

Since I am asked about machine learning: yes, among all the computing techniques used today by software engineers, machine learning, and deep learning, which is a special case of it, have been growing rapidly for about ten years. But we must be careful not to attribute to machine learning more than it can give. It essentially consists of statistical processing algorithms that are fed huge masses of data as input and that produce as output models for pattern recognition or action that are “learned” from the data. Now, not only does machine learning depend on algorithms that are programmed and continually debugged by humans, but its output also depends on the masses of data supplied to it as input. And it is again humans who choose the data, filter them, classify them, categorize them, organize them, interpret them, etc. Both the logical and the statistical approaches to artificial intelligence condense human knowledge and human purposes into software and hardware machines. Their autonomy, if there is any, can only be local and momentary. In the medium and long term, machines can only evolve with us and vice versa: we can only evolve with them.

The question of machine autonomy

Automatic data processing extends the whole contemporary technical system and is immersed in it. It is totally dependent on energy production, electricity distribution, the production of materials, etc. The contemporary technical system is absolutely unimaginable without computing, but so is computing without all this technical infrastructure.

In the same way, the technical system would quickly collapse if humans disappeared. Our technical environment is designed, built, used, maintained, repaired and interpreted by humans: it has no autonomy of any kind.

Technology *appears* autonomous to us because we project onto it the emergent effects of social interactions and socio-technical inertia that we cannot control at the individual scale. We tend to reify the effects of many aggregated human decisions and actions in machines and to attribute a will of their own to the machines. But this is an illusion. An illusion that relieves us of our personal and collective responsibilities: “it’s the machine’s fault”.

Use a pseudo-human interface or android robots as much as you like, but this is an artifice, a stage set. The robot or machine can always be switched off or unplugged; as for its software in the cloud, it must be constantly debugged, and new versions must be downloaded periodically. For me, this idea of the autonomous machine is a form of fetishism: we give a personality to a device that is not a sentient being and that was, once again, designed, manufactured, marketed, sold, used and repaired, and that will finally be thrown in the trash in favor of a new model.

We have machines capable of automatic symbol processing. And we have had them for less than a century. On the scale of historical evolution, three or four generations is almost nothing. At the end of the 20th century, 1% of the human population had access to the Internet and machine learning was confined to scientific laboratories. Today more than 60% of the population is connected and machine learning is applied on a large scale to data stored in the cloud. Faced with such a rapid mutation, we have a responsibility to steer technical, social and cultural development as far as possible. Rather than getting lost in the fantasy of the machine that takes power, for better or for worse, it seems to me far more interesting to use machines to augment human intelligence, an intelligence both personal and collective. This is the direction in which we should work, because it is the only one that is useful and reasonable. And it is, moreover, what the main industrial players in the sector are quietly doing, even if the “singularity” draws more attention from the crowds.

If you aim for the divine, or for transcendence, do not try to replace humans with a supposedly conscious machine, and do not fear such a replacement either, because it is impossible. What may be possible, on the other hand, is a state of technology and civilization in which human collective intelligence will be able to observe itself scientifically, and to deploy and cultivate its inexhaustible complexity in the digital mirror. Putting machines to work on the emergence of a reflexive collective intelligence, one step after another…

On this 17th of April, it was still winter in Ottawa. But I forgot about the cold wind and the snow when I entered the National Arts Centre to attend the IBM “government reinvent” event, devoted to digital transformation in government (#IBMGovReinvent).

Like the last time I attended an IBM event, I was struck by the spirit of innovation that was blowing through the various talks, workshops and information sessions. As Alex Benay, the CIO of the Canadian federal government, reminded us, government services must be citizen-centered and available on all platforms and devices, including of course mobile devices. Moreover, we should not imagine digital transformation as an event that takes place once and for all but rather as an endless journey. The first step is to solve some urgent and painful problems in order to feel the very possibility of improvement. Then, deliberately and thoughtfully, the government should continue to improve services to citizens towards greater security, confidentiality, transparency, accessibility and so on.

Image: Alex Benay, Chief Information Officer of the Canadian Federal Government

Contrary to what its name might suggest, digital transformation is primarily a problem of culture and leadership, a “human” phenomenon. It implies working in a transdisciplinary way, collaborating beyond the usual borders and above all, as in any organizational transformation process, creating trust: trust between engineers and humanists, between government departments, between citizens and government, etc. And since there are no miracles, trust is built over time and through experience, and it can only come from listening, honesty, competence, and transparency.

Technically, the data produced and collected must be designed from the outset to lend themselves optimally to automatic analysis, and the systems must be built so that they can evolve easily. As for the famous ethical dimension of algorithms, we should not forget that the data feeding machine learning must be examined from a critical point of view by competent humans and not blindly believed. In my understanding, a good part of the “ethical” problems related to AI comes from two essential points being forgotten:
– The tools must be designed from the user’s point of view (design thinking).
– The quality of training data determines the quality of program performance (garbage in garbage out).
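
The second point can be illustrated with a deliberately tiny learner, sketched here in Python. The word-voting scheme, the function names `train` and `predict`, and the four training examples are all assumptions made for illustration; this is not how any particular product works.

```python
from collections import Counter, defaultdict


def train(examples):
    """Learn, for each word, the label it was most often seen with."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in text.lower().split():
            counts[word][label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def predict(model, text, default="unknown"):
    """Vote over the labels of the known words in the text."""
    votes = Counter(model[w] for w in text.lower().split() if w in model)
    return votes.most_common(1)[0][0] if votes else default


# Hypothetical training data, labeled carefully by a human.
clean = [("great service", "positive"), ("great help", "positive"),
         ("slow service", "negative"), ("slow answer", "negative")]
# The same texts with carelessly swapped labels.
noisy = [(text, "negative" if label == "positive" else "positive")
         for text, label in clean]

print(predict(train(clean), "great help today"))
print(predict(train(noisy), "great help today"))
```

The algorithm is identical in both runs; only the human-supplied labels differ, and the answer follows the labels.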

Many topics were treated in this event, with numerous concrete examples of the use of new technologies: security, the blockchain, cloud, micro-services, Internet of things, etc. The theme that caught my attention the most was chatbots animated by natural language processing and deep learning. In the coming years, more and more services (governmental or commercial) will use chatbots. The public already knows Siri by Apple or Alexa by Amazon. Rather than thinking of these programs as autonomous “artificial intelligence”, it would be more useful to conceive them as media for distributing expertise, user-friendly interfaces for exploring huge databases, knowledge management tools and cognitive augmentation. Chatbots are able to understand questions in natural languages (not just recognize keywords), they can learn from their interactions, they are able to suggest relevant ideas according to context and provide a “pleasant” experience to their users.

The highlight of the day was the talk by Tanmay Bakshi, a 14-year-old genius programmer who has already created many applications from the micro-services provided by Watson (IBM’s artificial intelligence in the cloud). His enthusiasm for programming useful mental health applications and for engaging young men and women in careers in AI was very inspirational!

Image: Tanmay Bakshi (14-year-old genius programmer), “deep learning ambassador” for IBM.

Not a pipe

This blog post offers a simple guide to the landscape of signification in language. We’ll begin by distinguishing the elements that construct meaning. We’ll look at signs, at how pervasive they are in communication between living beings, and at how a sign differs from a symbol. A symbol is a special kind of sign, unique to humans, that divides into a signifier (a sound, an image, etc.) and a signified (a category or a concept). We’ll see that the relationship between a signifier and a signified is conventional. A bit further, I’ll explain the workings of language, our most powerful symbolic system. I will review successively grammar (the recursive construction of sense units), semantics (the relations between these units) and pragmatics (the relations between speech, reference and social context). I’ll end this chapter by recalling some of the problems in the field of natural language processing (NLP).

Sign, symbol, language


Meaning involves at least three actors playing distinct roles. A sign (1), whether a clue, a trace, an image, a message or a symbol, means something (2) for someone (3).

A sign may be an entity or an event. What makes it a sign is not its intrinsic properties but the role it plays in meaning. For example, an individual can be the subject of a conversation (thing), the interpreter of a conversation (being) or a clue in an investigation (sign).

A thing, designated by a sign, is often called the object or referent and, again, what makes it a referent is not its intrinsic properties but the role it plays in the triadic relation.

A being is often called the subject or the interpreter. It may be a human being, a group, an animal, a machine or whatever entity or process endowed with self-reference (by distinguishing self from the environment) and interpretation. The interpreter always takes the context into account when it interprets a sign. For example, a puppy (being) understands that a bite (sign) from its playful sibling is part of a game (thing) and may not be a real threat in the context.

Generally speaking, communication and signs exist for all living organisms. Cells can recognize concentrations of poison or food from afar, and plants use their flowers to trick insects and birds into their reproductive processes. Animals (organisms with brains or nervous systems) practice complex semiotic games that include camouflage, dance and mimicry. They acknowledge, interpret and emit signs constantly. Their cognition is complex: the sensorimotor cycle involves categorization, feeling, and environmental mapping. They learn from experience, solve problems and communicate, and social species manifest collective intelligence. All these cognitive properties imply the emission and interpretation of signs. When a wolf growls, there is no need to add a long discourse: a clear message is sent to its adversary.


A symbol is a sign divided into two parts: the signifier and the signified. The signified (virtual) is a general category, or an abstract class, and the signifier (actual) is a tangible phenomenon that represents the signified. A signifier may be a sound, a black mark on white paper, a trace or a gesture. For example, let’s take the word “tree” as a symbol. It is made of: 1) a signifier, the sound of the spoken word “tree”, and 2) a signified, the concept of a perennial plant with roots, trunk, branches, and leaves. The relationship between the signifier and the signified is conventional and depends on which symbolic system the symbol belongs to (in this case, the English language). What we mean by conventional is that in most cases, there is no analogy or causal connection between the sound and the concept: for example, between the sound “crocodile” and the actual crocodile species. We use different signifiers to indicate the same signified in different languages. Furthermore, the concepts symbolized by languages depend on the environment and culture of their speakers.

The signified of the sound “tree” is ruled by the English language and not left to the choice of the interpreter. However, it is in the context of a speech act that the interlocutor understands the referent of the word: is it a syntactic tree, a palm tree, a Christmas tree…? Let’s remember this important distinction: the signified is determined by the language but the referent depends on the context.
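
This distinction can be sketched as a small data structure. Everything below is a hypothetical illustration: the class name `Symbol`, the concept label `"TREE"`, the French entry “arbre” and the three contexts are assumptions, not part of any real lexicon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    language: str
    signifier: str   # the tangible form (here, a spelling)
    signified: str   # the general category, fixed by the language


# Hypothetical lexicon entries: two signifiers, one signified.
TREE_EN = Symbol("English", "tree", "TREE")
TREE_FR = Symbol("French", "arbre", "TREE")

# Hypothetical contexts: each speech situation supplies its own referent.
REFERENTS = {
    ("TREE", "linguistics class"): "a syntactic tree on the blackboard",
    ("TREE", "beach"): "a palm tree",
    ("TREE", "December living room"): "a Christmas tree",
}


def referent(symbol: Symbol, context: str) -> str:
    """Look up what the symbol points to in a given context."""
    return REFERENTS.get((symbol.signified, context), "unresolved referent")


print(TREE_EN.signified == TREE_FR.signified)
print(referent(TREE_EN, "beach"))
print(referent(TREE_EN, "linguistics class"))
```

The signified is stored with the symbol, because the language fixes it, while the referent is only available once a context is passed in.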


A language is a general symbolic system that allows humans to think reflexively, ask questions, tell stories, dialogue and engage in complex social interactions. English, French, Spanish, Arabic, Russian, or Mandarin are all natural languages. Each one of us is biologically equipped to speak and recognize languages. Our linguistic ability is natural, genetic, universal and embedded in our brain. By contrast, any language (like English, French, etc.) is based on a social, conventional and cultural environment; it is multiple, evolving and hybridizing. Languages mix and change according to the transformations of demographic, technological, economic, social and political contexts.

Our natural linguistic abilities multiply our cognitive faculties. They empower us with reflexive thinking, making it easy for us to learn and remember, to plan in the long-term and to coordinate large-scale endeavors. Language is also the basis for knowledge transmission between generations. Animals can’t understand, grasp or use linguistic symbols to their full extent, only humans can. Even the best-trained animals can’t evaluate if a story is false or exaggerated. Koko the famous gorilla will never ask you for an appointment for the first Tuesday of next month, nor will it communicate to you where its grandfather was born. In animal cognition, the categories that organize perception and action are enacted by neural networks. In human cognition, these categories may become explicit once symbolized and move to the forefront of our awareness. Ideas become objects of reflection. With human language comes arithmetic, art, religion, politics, economy, and technology. Compared to other social animal species, human collective intelligence is most powerful and creative when it is supported and augmented by its linguistic abilities. Therefore, when working in artificial intelligence or cognitive computing, it would be paramount to understand and model the functioning of neurons and neurotransmitters common to all animals, as well as the structure and organization of language, unique to our species.

I will now describe briefly how we shape meaning through language. Firstly, we will review the grammatical units (words, sentences, etc.). Secondly, we will explore the semantic networks between these units, and thirdly, the pragmatic interactions between language and extralinguistic realities.

Grammatical units

A natural language is made of recursively nested units: a phoneme, which is an elementary sound; a word, a chain of phonemes; a syntagm, a chain of words; and a text, a chain of syntagms. A language has a finite dictionary of words and syntactic rules for the construction of texts. With its dictionary and set of syntactic rules, a language offers its users the possibility to generate, and understand, an infinity of texts.
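
How a finite dictionary and a few rules can yield an unbounded set of texts may be sketched with a toy generator. The rule table below is a hypothetical miniature, not a model of English, and the depth cap is an assumption added only so that the program terminates.

```python
import random

# Hypothetical toy dictionary and syntactic rules.
RULES = {
    "TEXT": [["SENTENCE"], ["SENTENCE", "and", "TEXT"]],  # second rule is recursive
    "SENTENCE": [["NOUN", "VERB", "NOUN"]],
    "NOUN": [["the lion"], ["the gazelle"]],
    "VERB": [["sees"], ["smells"]],
}


def generate(symbol, rng, depth=0, max_depth=4):
    """Expand a symbol into a list of words by applying the rules."""
    if symbol not in RULES:
        return [symbol]                  # a terminal: an entry of the dictionary
    options = RULES[symbol]
    if symbol == "TEXT" and depth >= max_depth:
        options = options[:1]            # stop recursing past max_depth
    words = []
    for part in rng.choice(options):
        words.extend(generate(part, rng, depth + 1, max_depth))
    return words


rng = random.Random(0)
for _ in range(3):
    print(" ".join(generate("TEXT", rng)))
```

Without the cap, the recursive rule for `TEXT` could be applied any number of times, which is what makes the set of possible texts infinite even though the dictionary holds only a handful of entries.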


Human beings can’t pronounce or recognize several phonemes simultaneously. They can only pronounce one sound at a time. So languages have to obey the constraint of sequentiality. Speech is a chain of phonemes with an acoustic punctuation reflecting its grammatical organization.

Phonemes are sounds that carry no signification of their own; they are generally divided into consonants and vowels. Some languages also have “click” consonants (in Eastern and Southern Africa) and others (such as Mandarin Chinese) use different tones on their vowels. Despite the great diversity of sounds used to pronounce human languages, the number of conventional sounds in a language is limited: the order of magnitude is between thirty and one hundred.


The first symbolic grammatical unit is the word, a signifier with a signified. By word, I mean an atomic unit of meaning. For example, “small” contains one unit of meaning. But “smallest” contains two: “small” (meaning tiny) and “est” (a superlative suffix used at the end of a word indicating the most).
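
The “small” / “smallest” example can be sketched as a naive splitter. The stem and suffix inventories and the function name `sense_units` are assumptions for illustration; a real morphological analyser needs far more than suffix stripping.

```python
# Hypothetical mini-inventory of sense units.
STEMS = {"small", "tall", "slow"}
SUFFIXES = {"est": "superlative", "er": "comparative"}


def sense_units(form: str) -> list[str]:
    """Split a written form into a known stem plus an optional known suffix."""
    if form in STEMS:
        return [form]
    for suffix in SUFFIXES:
        stem = form[: -len(suffix)]
        if form.endswith(suffix) and stem in STEMS:
            return [stem, suffix]
    return [form]   # unknown form: treat it as a single unit


print(sense_units("small"))
print(sense_units("smallest"))
print(sense_units("forest"))
```

The last call shows why the stem check matters: “forest” ends in “est” but is not split, because “for” is not a stem in this inventory.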

Languages contain nouns depicting structures or entities, and verbs describing actions, events, and processes. Depending on the language, there are other types of words like adjectives, adverbs, prepositions or sense units that orient grammatical functions, such as gender, number, grammatical person, tense and cases.

How many words does a language hold? It depends. The largest English dictionary counts 200,000 words, Latin has 50,000 words, Chinese 30,000 characters and biblical Hebrew amounts to 6,000 words. The French classical author Jean Racine was able to evoke the whole range of human passions and emotions by using only 3,700 words in 13 plays. Most linguists think that whatever the language is, an educated, refined speaker masters about 10,000 words in his or her lifetime.


Note that a word alone cannot be true or false. Its signifier points to its signified (an abstract category) and not to a state of things. It is only when a sentence is spoken in a context describing a reality – a sentence with a referent – that it can be true or false.

A syntagm (a topic, a sentence or a super-sentence) is a sequence of words organized by grammatical relationships. When we utter a syntagm, we leave behind the abstract dictionary of a language to enter the concrete world of speech acts in contexts. We can distinguish three sub-levels of complexity in a syntagm: the topic, the sentence, and the super-sentence. Firstly, a topic is a super-word that designates a subject, a matter, an object or a process that cannot be described by just a single word, e.g., “history of linguistics”, “smartphone” or “tourism in Canada”. Different languages have diverse rules for building topics, like joining the root of a word with a grammatical case (in Latin), or agglutination of words (in German or Turkish). By relating several topics together, a sentence brings to mind an event, an action or a fact, e.g., “I bought her a smartphone for her twentieth birthday”. A sentence can be verbal, as in the previous example, or nominal, like “the leather seat of my father’s car”. Finally, a super-sentence evokes a network of relations between facts or events, as in a theory or a narrative. The relationships between sentences can be temporal (after), spatial (behind), causal (because), logical (therefore) or underline contrasts (but, despite…), and so on.


The highest grammatical unit is a text: a punctuated sequence of syntagms. The signification of a text comes from the application of grammatical rules by combining its signifieds. The text also has a referent inferred from its temporal, spatial and social context.

In order to construct a mental model of a referent, a reader can’t help but imagine a general intention of meaning behind a text, even when it is produced by a computer program, for instance.

Semantic relationships

When we hear a speech, we are actually transforming a chain of sounds into a semantic network, and from this network, we infer a new mental model of a situation. Conversely, we are able to transform a mental model into the corresponding semantic network and then from this network, back into a sequence of phonemes. Semantics is the back and forth translation between chains of phonemes and semantic networks. Semantic networks themselves are multi-layered and can be broken down into three levels: paradigmatic, syntagmatic and textual.


Figure: Hierarchy of grammatical units and semantic relations

Paradigmatic relationships

In linguistics, a paradigm is a set of semantic relations between words of the same language. They may be etymological, taxonomical relations, oppositions or differences. These relations may be the inflectional forms of a word, like “one apple” and “two apples”. Languages may comprise paradigms to indicate verb tenses (past, present, future) or voice (active, passive). For example, the paradigm for “go” is “go, went, gone”. The notion of paradigm also indicates a set of words which cover a particular functional or thematic area. For instance, most languages include paradigms for economic actions (buy, sell, lend, repay…), or colors (red, blue, yellow…). A speaker may transform a sentence by replacing one word from a paradigm with another from the same paradigm and get a sentence that still makes sense. In the sentence “I bought a car”, you could easily replace “bought” with “sold” because “buy” and “sell” are part of the same paradigm: they have some meaning in common. But in that sentence, you can’t replace “bought” with “yellow”, for instance. Two words from the same paradigm may be opposites (if you are buying, you are not selling) but still related (buying and selling can be interchangeable).
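
The substitution test described above can be sketched as follows. The paradigm table is a hypothetical miniature (with past-tense forms, so that substitutions stay grammatical), and `paradigm_of` and `substitute` are names invented for this illustration.

```python
# Hypothetical paradigms built from the examples in the text.
PARADIGMS = {
    "economic actions": {"bought", "sold", "lent", "repaid"},
    "colors": {"red", "blue", "yellow"},
}


def paradigm_of(word):
    """Return the name of the paradigm a word belongs to, or None."""
    for name, members in PARADIGMS.items():
        if word in members:
            return name
    return None


def substitute(sentence, old, new):
    """Replace `old` with `new` only if both belong to the same paradigm."""
    if paradigm_of(old) is None or paradigm_of(old) != paradigm_of(new):
        raise ValueError(f"{old!r} and {new!r} do not share a paradigm")
    return " ".join(new if w == old else w for w in sentence.split())


print(substitute("I bought a car", "bought", "sold"))
try:
    substitute("I bought a car", "bought", "yellow")
except ValueError as error:
    print(error)
```

The check is purely a membership test in a hand-written table; it says nothing about how speakers actually acquire or store paradigms.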

Words can also be related when they are in taxonomic relation, like “horse” and “animal”. The English dictionary describes a horse as a particular case of animal. Some words come from ancient words (etymology) or are composed of several words: for example, the word metalanguage is built from “meta” (beyond, in ancient Greek) and “language”.
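
A taxonomic relation of this kind is often represented as a chain of “is a” links. In the sketch below the intermediate category “mammal” and the entry for “gazelle” are assumptions added for illustration; only “horse” and “animal” come from the example above.

```python
# Hypothetical taxonomy: each word points to its immediate broader category.
IS_A = {"horse": "mammal", "gazelle": "mammal", "mammal": "animal"}


def is_a(word: str, category: str) -> bool:
    """Follow the chain of broader categories upward from `word`."""
    while word in IS_A:
        word = IS_A[word]
        if word == category:
            return True
    return False


print(is_a("horse", "animal"))
print(is_a("animal", "horse"))
```

The relation is directional: a horse is a particular case of animal, but not the reverse.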

In general, the conceptual relationships between words from a dictionary may be qualified as paradigmatic.

Syntagmatic relationships

By contrast, syntagmatic relations describe the grammatical connections between words in the same sentence. In the two following sentences, “The gazelle smells the presence of the lion” and “The lion smells the presence of the gazelle”, the sets of words are identical but the words “gazelle” and “lion” do not share the same grammatical role. Since those words are swapped in the syntagmatic structure, the sentences have distinct meanings.
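
The gazelle and lion example can be checked mechanically. In this sketch the role assignments are written by hand (no parser is involved, which is an assumption of the illustration), and the role names `subject` and `complement` are chosen for convenience.

```python
from collections import Counter

# Hand-written role assignments for the two sentences.
s1 = {"text": "The gazelle smells the presence of the lion",
      "roles": {"subject": "gazelle", "verb": "smells", "complement": "lion"}}
s2 = {"text": "The lion smells the presence of the gazelle",
      "roles": {"subject": "lion", "verb": "smells", "complement": "gazelle"}}


def bag_of_words(text):
    """Count the words of a text, ignoring their order."""
    return Counter(text.lower().split())


print(bag_of_words(s1["text"]) == bag_of_words(s2["text"]))
print(s1["roles"] == s2["roles"])
```

A representation that only counts words cannot tell the two sentences apart; the difference lives entirely in the syntagmatic structure.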

Textual relationships

At the text level, which includes several syntagms, we find semantic relations like anaphoras and isotopies. Let’s consider the super-sentence: “If a man has talent and can’t use it, he’s failed.” (Thomas Wolfe). In this quotation “it” is an anaphora for “talent” and “he”, an anaphora for “a man”. When reading a pronoun (it, he), we resolve the anaphora when we know which noun – mentioned in a previous or following sentence – it is referring to. On the other hand, isotopies are recurrences of themes that weave the unity of a text: the identity of heroes (characters), genres (love stories or historical novels), settings, etc. The notion of isotopy also encompasses repetitions that help the listener understand a text.

Pragmatic interactions

Pragmatics weaves the triadic relation between signs (symbols, speeches or texts), beings (interpreters, people or interlocutors) and things (referents, objects, reality, extra-textual context). On the pragmatic level of communication, speeches point to, and act upon, a social context. A speech act functions as a move in a game played by its speaker. So, as distinct from semantic meaning, which we analyzed in the previous section, pragmatic meaning addresses questions like: what kind of act (a piece of advice, a promise, a blame, a condemnation, etc.) is carried by a speech? Is a speech spoken in a play on a stage or in a real tribunal? The pragmatic meaning of a speech also relates to the actual effects of its utterance, effects that are not always known at the moment of the enunciation. For example: “Did I convince you? Have you kept your word?”. The full sense of a speech can only be understood after its utterance, and future events can always modify it.

A speech act is highly dependent on cultural conventions, on the identity of speakers and attendees, time and place, etc. By proclaiming: “The session is open”, I am not just announcing that an official meeting is about to start, I am actually opening the session. But I have to be someone relevant or important like the president of that assembly to do so. If I am a janitor and I say: “The session is open”, the act is not performed because I don’t have any legitimacy to open the session.

If an utterance is descriptive, it’s either true or false. In other cases, if an utterance does something instead of describing a state of things, it has a pragmatic force instead of a truth value.
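
The “session is open” example can be sketched as a state change that succeeds only for a legitimate speaker. The class `Assembly`, the function `declare_open` and the names of the speakers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assembly:
    president: str
    in_session: bool = False


def declare_open(assembly: Assembly, speaker: str) -> bool:
    """Perform the act only if the speaker holds the required role."""
    if speaker == assembly.president:
        assembly.in_session = True   # the utterance changes the social state
        return True
    return False                     # same words, but no act is performed


assembly = Assembly(president="Alice")
print(declare_open(assembly, "Bob"), assembly.in_session)
print(declare_open(assembly, "Alice"), assembly.in_session)
```

The function returns whether the act succeeded, not whether a sentence is true: the same words have pragmatic force for one speaker and none for the other.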

Resolving ambiguities

We have just reviewed the different layers of grammatical, semantic and pragmatic complexity to better understand the meaning of a text. Now, we are going to examine the ambiguities that may arise during the reading or listening of a text in a natural language.

Semantic ambiguities

How do we go from the sound of a chain of phonemes to the understanding of a text? From a sequence of sounds, we build a multi-layered (paradigmatic, syntagmatic and textual) semantic network. When weaving the paradigmatic layer, we answer questions like: “What is this word? To what paradigm does it belong? Which one of its meanings should I consider?”. Then, we connect words together by answering: “What are the syntagmatic relations between the words in that sentence?”. Finally, we comprehend the text by recognizing the anaphoras and isotopies that connect its sentences. Our understanding of a text is based on this three-layered network of sense units.

Furthermore, ambiguities or uncertainties of meaning in languages can happen on all three levels and can multiply their effects. In the case of homophony, the same sound can point to different words, as in “ate” and “eight”. And sometimes, the same word may convey several distinct meanings, as in “mole”: (1) a shortsighted mouse-like animal digging underground galleries, (2) an undercover spy, or (3) a pigmented spot or mark on the skin. In the case of synonymy, the same meaning can apply to distinct words like “tiny” and “small”. Amphibologies are syntagmatic ambiguities, as in: “Mary saw a woman on the mountain with a telescope.” Who is on the mountain? Moreover, who has the telescope? Mary or the woman? On a higher level of complexity, textual relations can be even more ambiguous than paradigmatic and syntagmatic ones because rules for anaphoras and isotopies are loosely defined.
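To see how ambiguities at different levels multiply, here is a minimal Python sketch. It rests on simplifying assumptions: the names `word_senses`, `attachments` and `readings` are illustrative, the two attachment questions are treated as independent, and the word “mole” is imagined to occur in the same hypothetical sentence as the telescope. Under those assumptions, the number of candidate readings is simply the product of the options at each ambiguous point.

```python
from itertools import product

# Toy inventory of ambiguities, taken from the examples in the text.
# Paradigmatic level: one word, several senses.
word_senses = {
    "mole": ["burrowing animal", "undercover spy", "spot on the skin"],
}

# Syntagmatic level: who is on the mountain, and who has the telescope?
attachments = {
    "on the mountain": ["Mary", "the woman"],
    "with a telescope": ["Mary", "the woman"],
}

def readings(*ambiguity_tables):
    """Enumerate every combination of choices across the given tables."""
    keys, options = [], []
    for table in ambiguity_tables:
        for key, choices in table.items():
            keys.append(key)
            options.append(choices)
    return [dict(zip(keys, combo)) for combo in product(*options)]

# The amphibology alone yields 2 x 2 = 4 readings.
telescope_readings = readings(attachments)

# Adding a three-way lexical ambiguity multiplies the count: 3 x 4 = 12.
all_readings = readings(word_senses, attachments)
print(len(telescope_readings), len(all_readings))
```

Each entry of `all_readings` is one fully disambiguated interpretation; a reader (or a program) still has to decide which of them was intended.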

Resolving semantic ambiguities in pragmatic contexts

Human beings don’t always correctly resolve all the semantic ambiguities of a speech, but when they do, it is often because they take into account the pragmatic (or extra-textual) context, which is generally implicit. It is in context that deictic expressions like “here”, “you”, “me”, “that one over there” or “next Tuesday” take on their full meaning. Let’s add that comparing the text at hand with its author’s corpus, genre and historical period helps to better discern its meaning. But some pragmatic aspects of a text may remain unknown. Ambiguities can stem from many causes: uncertainty about the precise referents of a speech or about the speaker’s social interactions, the ambivalence or concealment of the speaker’s intentions, and of course not knowing in advance the effects of an utterance.

Problems in natural language processing

Computer programs can’t understand or translate texts with dictionaries and grammars alone. They can’t engage in the pragmatic context of speeches like human beings do to disambiguate texts unless this context is made explicit. Understanding a text implies building and comparing complex and dynamic mental models of text and context.

On the other hand, natural language processing (a sub-discipline of artificial intelligence) compensates for the irregularity of natural languages by using a lot of statistical calculations and deep learning algorithms that have been trained on huge corpora. Depending on its training set, an algorithm can interpret a text by choosing the most probable semantic network amongst those compatible with a chain of phonemes. The results must still be validated and improved by human reviewers.
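The statistical approach described above can be caricatured in a few lines of Python. This is a deliberately tiny sketch, not how production systems work: the co-occurrence counts are invented for illustration, the sound label `"EY T"` is just an ARPAbet-style placeholder, and `pick_word` is a hypothetical helper that chooses, among the words compatible with the same sound, the one most often seen next to a neighbouring word in a made-up training set.

```python
# Invented co-occurrence counts standing in for a training corpus:
# how often each candidate word was seen next to a context word.
cooccurrence = {
    ("ate", "pizza"): 40,
    ("eight", "pizza"): 2,
    ("ate", "o'clock"): 1,
    ("eight", "o'clock"): 55,
}

# Words compatible with the same chain of phonemes (homophones).
homophones = {"EY T": ["ate", "eight"]}

def pick_word(sound, context_word):
    """Return the most probable candidate and its estimated probability."""
    candidates = homophones[sound]
    # Add-one smoothing so unseen pairs still get a non-zero score.
    scores = {w: cooccurrence.get((w, context_word), 0) + 1 for w in candidates}
    total = sum(scores.values())
    probabilities = {w: s / total for w, s in scores.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]

print(pick_word("EY T", "pizza"))
print(pick_word("EY T", "o'clock"))
```

Changing the counts changes the answer, which is the point made in the text: the interpretation depends on the training set, and nothing in the program knows what a pizza or a clock is.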


I was happily surprised to be chosen as an “IBM influencer” and invited to the innovation and disruption Forum organized in Toronto on the 16th of November to celebrate the 100th anniversary of IBM in Canada. With a handful of other people, I had the privilege of meeting Bryson Koehler, the CTO of the IBM Cloud and Watson (Watson is the name given to IBM’s artificial intelligence). That meeting was of great interest to me: I learned a lot about the current state of cloud computing and artificial intelligence.


Image: Demonstration of a robot at the IBM innovation and disruption forum in Toronto

Unlike other big tech companies, IBM already existed when I was born in 1956. The company was in the business of computing even before the transistor. IBM adapted itself to electronics and dominated the industry in the era of big central mainframes. It survived the PC revolution when Microsoft and Apple were kings. It navigated the turbulent waters of the social Web despite the might of Google, Facebook, and Amazon. IBM is today one of the biggest players in the market for cloud computing, artificial intelligence and business consulting.

The transitions and transformations in IBM’s history were not only technological but also cultural. In the seventies, when I was a young philosopher and new technology enthusiast, IBM was the epitome of the grey suit, blue tie, black attaché-case corporate America. Now, every IBM employee – from the CEO Dino Trevisani to the salesman – wears jeans. IBM used to be the “anti-Apple” company but now everybody has a Mac laptop. Instead of proprietary technology, IBM promotes open-source software. IBM posters advertise an all-inclusive and diverse “you” across the spectrum of gender, race, and age. Its official management and engineering philosophy is design thinking and, along with the innovative spirit, the greatest of IBM’s virtues is the ability to listen!

Toronto’s Forum was all about innovation and disruption. Innovation is mainly about entrepreneurship: self-confidence, audacity, tenacity, resilience and market orientation. Today’s innovation is “agile”: implement a little bit, test, listen to the clients, learn from your errors, re-implement, etc. As for the disruption, it is inevitable, not only because of the speed of digital transformation but also because of the cultural shifts and the sheer succession of generations. So their argument is fairly simple: instead of being disrupted, be the disruptor! The overall atmosphere of the Forum was positive and inspirational and it was a pleasure to participate.

There were two kinds of general presentations: by IBM clients and by IBM strategists and leaders. In addition, many stands, product demonstrations and informative mini-talks on various subjects enabled the attendees to learn about current issues like e-health and hospital applications, robotics, data management, social marketing, blockchain and so on. One of the highlights of the day was the interview of Arlene Dickinson (a well-known Canadian TV personality, entrepreneur, and investor) by Dino Trevisani, the CEO of IBM Canada himself. Their conversation about innovation in Canada today was both instructive and entertaining.

From my point of view as a philosopher specialized in computing, Bryson Koehler (CTO for IBM cloud and Watson) gave a wonderful presentation, imbued with simplicity and clarity, yet full of interesting content. Before becoming an IBMer, Bryson worked for the Weather Channel, so he was familiar with handling exabytes of data! According to Bryson Koehler, the future lies not only in the cloud, that is to say, infrastructure and software as a service, but also in “cloud-native architecture”, where a lot of loosely connected mini-services can be easily assembled like Lego blocks and on top of which you can build agile and resilient applications. Bryson is convinced that all businesses are going to become “cloud natives” because they need the flexibility and security that this architecture provides. To illustrate this, I learned that Watson is no longer a standalone monolithic “artificial intelligence” but is now divided into several mini-services, each one with its own API, and is part of the IBM cloud offering alongside other services like blockchain, video storage, weather forecasting, etc.

Image: Bryson Koehler at the IBM innovation and disruption Forum in Toronto

Bryson Koehler recognizes that the techniques of artificial intelligence, the famous deep learning algorithms in particular, are all the same amongst the big competitors (Amazon, Google, Microsoft and IBM) in the cloud business. These algorithms are now taught in universities and implemented in open source programs. So what makes the difference in AI today is not the technique but the quality and quantity of the datasets used to train the algorithms. Since every big player has access to the public data on the web and to the syndicated data (on markets, news, finance, etc.) sold by specialized companies, what makes a real difference is the *private data* that lies behind the firewall of businesses. So what is the competitive advantage of IBM? Bryson Koehler sees it in the trust that the company inspires in its clients, and in their willingness to entrust their data to its cloud. IBM is “secure by design” and will never use a client’s dataset to train algorithms used by this client’s competitors. Everything boils down to confidence.

At lunchtime, with a dozen other influencers, I had a conversation with researchers at Watson. I was impressed by what I learned about cognitive computing, one of IBM’s leitmotifs. Their idea is that value is created not by replicating the human mind in a computer but by amplifying human cognition in real-world situations. In other words, Big Blue (IBM’s nickname) does not entertain the myth of the singularity. It does not want to replace people with machines but to help its clients make better decisions in the workplace. There is a growing flow of data from which we can learn about ourselves and the world. Therefore we have no other choice than to automate the process of selecting the relevant information, synthesizing its content and predicting, as much as possible, our environment. IBM’s philosophy is grounded in intellectual humility. In this process of cognitive augmentation, nothing is perfect or definitive: people make errors, machines too, and there is always room for improvement of our models. Let’s not forget that only humans have goals, ask questions and can be satisfied. Machines are just here to help.

Once the forum was over, I walked along the shore of Lake Ontario and thought about the similarity between philosophy and computer engineering: aren’t both building cognitive tools?

Image: walking meditation along Lake Ontario after the IBM innovation and disruption Forum in Toronto

I put forward in this paper a vision for a new generation of cloud-based public communication service designed to foster reflexive collective intelligence. I begin with a description of the current situation, including the huge power and social shortcomings of platforms like Google, Apple, Facebook, Amazon, Microsoft, Alibaba, Baidu, etc. Contrasting with the practice of these tech giants, I reassert the values that are direly needed at the foundation of any future global public sphere: openness, transparency and commonality. But such ethical and practical guidelines are probably not powerful enough to help us cross a new threshold in collective intelligence. Only a disruptive innovation in cognitive computing will do the trick. That’s why I introduce “deep meaning”, a new research program in artificial intelligence, based on the Information Economy MetaLanguage (IEML). I conclude this paper by evoking possible bootstrapping scenarios for the new public platform.

The rise of platforms

At the end of the 20th century, one percent of the human population was connected to the Internet. In 2017, more than half the population is connected. Most users interact on social media, search for information and buy products and services online. But despite the ongoing success of digital communication, there is a growing dissatisfaction with the big tech companies – the “Silicon Valley” – that dominate the new communication environment.

The big techs are the most valued companies in the world and the massive amount of data that they possess is considered the most precious good of our time. Silicon Valley owns the big computers: the network of physical centers where our personal and business data are stored and processed. Their income comes from their economic exploitation of our data for marketing purposes and from their sales of hardware, software or services. But they also derive considerable power from the knowledge of markets and public opinions that stems from their information control.

The big cloud companies master new computing techniques that mimic the way neurons learn a new behavior. These programs are marketed as deep learning or artificial intelligence even if they have no cognitive autonomy and need intense training by humans before becoming useful. Despite their well-known limitations, machine learning algorithms have effectively augmented the abilities of digital systems. Deep learning is now used in every economic sector. Chips specialized in deep learning are found in big data centers, smartphones, robots and autonomous vehicles. As Vladimir Putin rightly told young Russians in his speech for the first day of school in fall 2017: “Whoever becomes the leader in this sphere [of artificial intelligence] will become the ruler of the world”.

The tech giants control huge business ecosystems beyond their official legal borders and they can ruin or buy competitors. Unfortunately, the rivalry between the big tech companies prevents real interoperability between cloud services, even if such interoperability would be in the interest of the general public and of many smaller businesses. As if their technical and economic powers were not enough, the big tech companies are now moving onto the turf of governments. Facebook vouches for our identity and tells our family and friends that we are safe when a terrorist attack or a natural disaster occurs. Mark Zuckerberg states that one of Facebook’s missions is to ensure that the electoral process is fair and open in democratic countries. Google Earth and Google Street View are now used by several municipal bodies and governments as their primary source of information for cadastral plans and other geographical or geospatial services. Twitter has become an official global political, diplomatic and news service. Microsoft sells its digital infrastructure to public schools. The kingdom of Denmark opened an official embassy in Silicon Valley. Cryptocurrencies independent from nation states (like Bitcoin) are becoming increasingly popular. Blockchain-based smart contracts (powered by Ethereum) bypass state authentication and traditional paper bureaucracies. Some traditional functions of government are being taken over by private technological ventures.

This should not come as a surprise. The practice of writing in ancient palace-temples gave birth to government as a separate entity. Alphabet and paper allowed the emergence of merchant city-states and the expansion of literate empires. The printing press, industrial economy, motorized transportation and electronic media sustained nation-states. The digital revolution will foster new forms of government. Today, we discuss political problems in a global public space that takes advantage of the web and social media, and the majority of humans live in interconnected cities and metropolises. Each urban node wants to be an accelerator of collective intelligence, a smart city. We need to think about public services in a new way. Schools, universities, public health institutions, mail services, archives, public libraries and museums should take full advantage of the internet and de-silo their datasets. But we should go further. Are current platforms doing their best to enhance collective intelligence and human development? How about giving back to the general population the data produced in social media and other cloud services, instead of just monetizing it for marketing purposes? How about giving people access to the cognitive powers unleashed by a ubiquitous algorithmic medium?

Information wants to be open, transparent and common

We need a new kind of public sphere: a platform in the cloud where data and metadata would be our common good, dedicated to the recording and collaborative exploitation of memory in the service of our collective intelligence. The core values orienting the construction of this new public sphere should be openness, transparency and commonality.

First, openness has already been tried out in the scientific community, the free software movement, Creative Commons licensing, Wikipedia and many other endeavors. It has been adopted by several big industries and governments. “Open by default” will soon be the new normal. Openness is on the rise because it maximizes the improvement of goods and services, fosters trust and supports collaborative engagement. It can be applied to data formats, operating systems, abstract models, algorithms and even hardware. Openness also applies to taxonomies, ontologies, search architectures, etc. A new open public space should encourage all participants to create, comment, categorize, assess and analyze its content.

Second, transparency is the very ground for trust and the precondition of an authentic dialogue. Data and people (including the administrators of a platform) should be traceable and auditable. Transparency should be reciprocal, without distinction between the rulers and the ruled. Such transparency will ultimately be the basis for reflexive collective intelligence, allowing teams and communities of any size to observe and compare their cognitive activity.

Finally, commonality means that people will not have to pay to get access to this new public sphere: all will be free and public property. Commonality also means transversality: de-siloing and cross-pollination. Smart communities will interconnect and recombine all kinds of useful information: open archives of libraries and museums, free academic publications, shared learning resources, knowledge management repositories, open-source intelligence datasets, news, public legal databases…

From deep learning to deep meaning

This new public platform will be based on the web and its open standards like HTTP, URL, HTML, etc. Like all current platforms, it will take advantage of distributed computing in the cloud and it will use “deep learning”: an artificial intelligence technology that employs specialized chips and algorithms that roughly mimic the learning process of neurons. Finally, to be completely up to date, the next public platform will enable blockchain-based payments, transactions, contracts and secure records.

If a public platform offers the same technologies as the big tech (cloud, deep learning, blockchain), with the sole difference of openness, transparency and commonality, it may prove insufficient to foster a swift adoption, as is demonstrated by the relative failures of Diaspora (open Facebook) and Mastodon (open Twitter). Such a project may only succeed if it comes up with some technical advantage compared to the existing commercial platforms. Moreover, this technical advantage should have appealing political and philosophical dimensions.

No one really fancies the dream of autonomous machines, especially considering the current limitations of artificial intelligence. Instead, we want an artificial intelligence designed for the augmentation of human personal and collective intellect. That’s why, in addition to the current state of the art, the new platform will integrate the brand new deep meaning technology. Deep meaning will expand the current reach of artificial intelligence, improve the user experience of big data analytics and allow the reflexivity of personal and collective intelligence.

Language as a platform

In a nutshell, deep learning models neurons and deep meaning models language. In order to augment the human intellect, we need both! Right now deep learning is based on the simulation of neural networks. This is enough to roughly model animal cognition (every animal species has neurons) but it is not refined enough to model human cognition. The difference between animal cognition and human cognition is the reflexive thinking that comes from language, which adds a layer of semantic addressing on top of neural connectivity. Speech production and understanding is an innate property of individual human brains. But as humanity is a social species, language is a property of human societies. Languages are conventional, shared by members of the same culture and learned by social contact. In human cognition, the categories that organize perception, action, memory and learning are expressed linguistically so they may be reflected upon and shared in conversations. A language works like the semantic addressing system of a social virtual database.

But there is a problem with natural languages (English, French, Arabic, etc.): they are irregular and do not lend themselves easily to machine understanding or machine translation. The current trend in natural language processing, an important field of artificial intelligence, is to use statistical algorithms and deep learning methods to understand and produce linguistic data. But instead of using statistics, deep meaning adopts a regular and computable metalanguage. I have designed IEML (Information Economy MetaLanguage) from the beginning to optimize semantic computing. IEML words are built from six primitive symbols and two operations: addition and multiplication. The semantic relations between IEML words follow the lines of their generative operations. The total number of words does not exceed 10,000. From its dictionary, the generative grammar of IEML allows the construction of sentences at three layers of complexity: topics are made of words, phrases (facts, events) are made of topics and super-phrases (theories, narratives) are made of phrases. The highest meaning unit, or text, is a unique set of sentences. Deep meaning technology uses IEML as the semantic addressing system of a social database.
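The layered structure just described can be sketched as a small data model. This is only an illustration in Python under loud assumptions: the class names, the placeholder word labels and the use of plain tuples and sets are mine, not the actual IEML implementation, and the real operations of addition and multiplication on the six primitives are not modelled at all. The sketch only shows the nesting (words, topics, phrases, super-phrases, text) and the idea that a text is a set of sentences, so order and repetition do not matter.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union

@dataclass(frozen=True)
class Word:
    label: str  # placeholder only; real IEML words are built from primitives

@dataclass(frozen=True)
class Topic:
    words: Tuple[Word, ...]        # topics are made of words

@dataclass(frozen=True)
class Phrase:
    topics: Tuple[Topic, ...]      # phrases (facts, events) are made of topics

@dataclass(frozen=True)
class SuperPhrase:
    phrases: Tuple[Phrase, ...]    # super-phrases are made of phrases

# A sentence is an expression at any of the three layers.
Sentence = Union[Topic, Phrase, SuperPhrase]

def make_text(*sentences: Sentence) -> FrozenSet[Sentence]:
    """Model a text as a set of sentences: order and repetition are ignored."""
    return frozenset(sentences)

# Hypothetical example content.
cat, sleeps = Word("cat"), Word("sleeps")
topic_a = Topic((cat,))
topic_b = Topic((sleeps,))
fact = Phrase((topic_a, topic_b))
story = SuperPhrase((fact,))

text_1 = make_text(topic_a, fact, story)
text_2 = make_text(story, fact, topic_a, fact)  # same sentences, other order
print(text_1 == text_2)
```

Because every layer is an immutable value, two texts built from the same sentences compare equal, which is one simple way to read “a unique set of sentences”.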

Given large datasets, deep meaning allows the automatic computing of semantic relations between data, semantic analysis and semantic visualizations. This new technology fosters semantic interoperability: it decompartmentalizes tags, folksonomies, taxonomies, ontologies and languages. When online communities categorize, assess and exchange semantic data, they generate explorable ecosystems of ideas that represent their collective intelligence. Note that the vision of collective intelligence proposed here is distinct from the “wisdom of the crowd” model, which assumes independent agents and excludes dialogue and reflexivity. Just the opposite: deep meaning was designed from the beginning to nurture dialogue and reflexivity.

The main functions of the new public sphere


In the new public sphere, every netizen will act as an author, editor, artist, curator, critic, messenger, contractor and gamer. The next platform weaves five functions together: curation, creation, communication, transaction and immersion.

By curation I mean the collaborative creation, editing, analysis, synthesis, visualization, explanation and publication of datasets. People posting, liking and commenting on content on social media are already doing data curation, in a primitive, simple way. Active professionals in the fields of heritage preservation (libraries, museums), digital humanities, education, knowledge management, data-driven journalism or open-source intelligence practice data curation in a more systematic and mindful manner. The new platform will offer a consistent service of collaborative data curation empowered by a common semantic addressing system.

Augmented by deep meaning technology, our public sphere will include a semantic metadata editor applicable to any document format. It will work as a registration system for the works of the mind. Communication will be ensured by a global Twitter-like public posting system. But instead of the current hashtags that are mere sequences of characters, the new semantic tags will self-translate in all natural languages and interconnect by conceptual proximity. The blockchain layer will allow any transaction to be recorded. The platform will remunerate authors and curators in collective intelligence coins, according to the public engagement generated by their work. The new public sphere will be grounded in the internet of things, smart cities, ambient intelligence and augmented reality. People will control their environment and communicate with sensors, software agents and bots of all kinds in the same immersive semantic space. Virtual worlds will simulate the collective intelligence of teams, networks and cities.
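The contrast between a hashtag and a semantic tag can be made concrete with a toy model. The sketch below is written in Python and is entirely hypothetical: the `SemanticTag` class, its `components` and `labels` fields, and the use of Jaccard overlap as a stand-in for “conceptual proximity” are assumptions made for illustration, not the platform’s actual design. It shows one tag being rendered in several languages and tags being ranked by how much conceptual content they share.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

@dataclass
class SemanticTag:
    components: FrozenSet[str]   # hypothetical conceptual building blocks
    labels: Dict[str, str]       # language code -> display label

    def render(self, language: str) -> str:
        """Show the same tag in the reader's language."""
        return self.labels[language]

def proximity(a: SemanticTag, b: SemanticTag) -> float:
    """Jaccard overlap of components: 1.0 = identical, 0.0 = unrelated."""
    union = a.components | b.components
    if not union:
        return 0.0
    return len(a.components & b.components) / len(union)

school = SemanticTag(frozenset({"learning", "institution"}),
                     {"en": "school", "fr": "école"})
university = SemanticTag(frozenset({"learning", "institution", "research"}),
                         {"en": "university", "fr": "université"})
museum = SemanticTag(frozenset({"heritage", "institution"}),
                     {"en": "museum", "fr": "musée"})

print(school.render("fr"))
print(proximity(school, university) > proximity(school, museum))
```

A plain hashtag such as `#school` offers neither property: it is tied to one language, and its only relation to `#university` is that both are strings.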


This IEML-based platform was developed between 2002 and 2017 at the University of Ottawa. A prototype is currently in a pre-alpha version, featuring the curation functionality. An alpha version will be demonstrated in the summer of 2018. How can we bridge the gap from fundamental research to a full-scale industrial platform? Such an endeavor will be much less expensive than the conquest of space and could bring a tremendous augmentation of human collective intelligence. Even if the network effect obviously applies to the new public space, small communities of pioneers will benefit immediately from its early release. On the humanistic side, I have already mentioned museums and libraries, researchers in humanities and social science, collaborative learning networks, data-oriented journalists, knowledge management and business intelligence professionals, etc. On the engineering side, deep meaning opens a new sub-field of artificial intelligence that will enhance current techniques of big data analytics, machine learning, natural language processing, internet of things, augmented reality and other immersive interfaces. Because it is open source by design, the development of the new technology can be crowdsourced and shared easily among many different actors.

Let’s draw a distinction between the new public sphere, including its semantic coordinate system, and the commercial platforms that will give access to it. This distinction being made, we can imagine a consortium of big tech companies, universities and governments supporting the development of the global public service of the future. We may also imagine one of the big tech companies taking the lead in order to associate its name with the new platform and developing some hardware specialized in deep meaning. Another scenario is the foundation of a company that will ensure the construction and maintenance of the new platform as a free public service while sustaining itself by offering semantic services: research, consulting, design and training. In any case, a new international school must be established around a virtual dockyard where trainees and trainers progressively build and improve the semantic coordinate system and other basic models of the new platform. Students from various organizations and backgrounds will gain experience in the field of deep meaning and will disseminate the acquired knowledge back into their communities.

Radio broadcast (French-speaking Switzerland), 25 minutes, in French.

“Sémantique numérique et réseaux sociaux. Vers un service public planétaire” (Digital semantics and social networks: towards a planetary public service), 1 hour, in French.

YouTube video (in English), 1 hour.



Paul and Pierre, seen from behind

Paul, my cousin, my brother, my friend,

We were born a year apart, almost on the same date, in the mid-1950s, in the Jewish community of Béja, in Tunisia. My father Henri and his sister Nicole, your mother, loved each other tenderly. Our fathers were partners in the same shop and we played like brothers in the back room.

When we were very young, history tossed us onto the other shore of the Mediterranean and we landed in Toulouse. That is where our destinies parted. While your parents held firm and built a stable home, I was swept far from Occitania by the whirlpools of a family shipwreck. But when I came back to the Pink City to visit my father for the Easter holidays, my beloved aunt Nicole welcomed me into her house and was a true mother to me. Do you remember when we went to the library together, or when you played for me on the piano a piece of music you had just learned? We found fun in a word, a sound, a gesture, in everything and nothing. I still have the echo of our laughter in my ears…

Paul and Pierre at the restaurant

While you were studying medicine, you were also taking philosophy courses at the university, hidden from your parents. But I was in on the secret. In those days we had Homeric discussions about the great philosophers. When we started working and raising families, we lost touch a little. But what a celebration, what a joy, whenever we had the chance to see each other again! Paul, you were my reference point, another self, a different version of my destiny. Our two lives ran in parallel; they rhymed like Pierre and Paul.

To me you were a kind of hero: you helped mothers bring their babies into the world! A doctor on call, up through the night, you operated in emergencies to save lives. Conscientious and responsible, you always kept up with the latest developments in your specialty. I, uprooted four times over, envied the Toulouse doctor, honorably known in his city, loved by his patients and their families.

I loved to wander for hours in your library, the library of a great humanist. Clear-sighted, you were wary everywhere of the temptation of a self-satisfied good conscience. You were open, curious about others, yet never renounced your identity. You did not stop at herd opinion. You were funny, likeable, fond of good living and generous, but also upright, honest and authentic to the point of roughness. I loved you, Paul. Who did not love you? Your humanity showed at once in your smile and in your gestures.

Paul and Pierre on Shabbat

Everyone has their own Paul Boubli: the son, the brother, the husband, the father, the doctor, the friend, the colleague… My own Paul is the karmic twin, the alter ego, the soul mate. Paul! Our dialogue lasted sixty years. My heart breaks a thousand times at the thought of never seeing you again… Nothing erases the pain of losing you. But with your dear wife Véronique you brought into the world and raised four wonderful children who remain with us: Zacharie, Esther, Joseph and Samson. And you leave a legacy: the good you did around you, the sparks you sowed in our lives. Through the wound of my broken heart, I gather these sparks in my memory. To understand, to help, to heal, to give, to light up the world around oneself: that is the example of courage you show each of us. You, prince of a secret Andalusian nobility, now, from the other side of tears, from the other side of time, you pass the torch on to us.

Bricologie & Sérendipité

We have complex problems to solve, in Edgar Morin’s sense: energy, food, climate disruption, etc., which we find “interlocked” in the field of transportation. Individually, many people perceive the stakes and have identified solutions. But collectively, the organizations in which they work remain stuck in decision processes and patterns, with no real capacity to evolve and transform themselves to the extent required. One way to explain this paradox lies in the mechanisms of collective intelligence.

Collective intelligence is a property of living beings that manifests itself when several people interact with a common goal: finding a solution, developing a product, creating a work of art or playing a sport. A band, a football team or a department of a company carry out different coordinated actions depending on their collective intelligence, with more or less success.

Indeed, the latter…



“Au pays de Numérix” by Alexandre Moatti dates from 2015, but it is more topical than ever, now that Mounir Mahjoubi has just been appointed Secretary of State for Digital Affairs in Edouard Philippe’s government. Many people expect the new President of the French Republic, young and reputedly a modernizer, to set a “new course” for digital policy in France. One cannot recommend this book too highly to his entourage.

In form, it is a short, easy-to-read work that cultivates a measured and rational tone. It mostly deals with subjects the author knows first-hand, which does no harm. Frankly in favor of the cognitive uses of the network and of “the Internet of knowledge”, the author has himself worked in the field of digital libraries, has created several scholarly websites and actively contributes to the French-language Wikipedia. Even though he does not explicitly cite these philosophers, one senses that he is opposed to the hysterical anti-GAFA (Google, Apple, Facebook, Amazon) diatribes of Bernard Stiegler or Eric Sadin, as well as to Alain Finkielkraut’s sweeping negative judgments on the Internet. But he also takes care to point out certain negative or unfortunate aspects of the contemporary Internet and to distance himself from the apocalyptic transhumanism of a Raymond Kurzweil or the uncritical lyricism of a Pierre Lévy…

A good part of the book is devoted to the French and European responses to the Google Books project around 2005. Originally, Google wanted to use its data centers and its search algorithm to build a modern-day Library of Alexandria: all books available to everyone on the Internet! France and Europe felt bound to take up the American challenge. But the author shows that their responses were driven by grandstanding, by the logic of political announcements and communication, and by the strategies of various institutions seeking power and public funds, ultimately yielding negligible concrete results. For my part, I note that even though Google Books exists and provides (free) services to the public and to researchers, the initial project was shattered against copyright law, as this recent Wired article explains well. All this helps explain the success of illegal but popular ventures such as Library Genesis.

In the land of Numérix, there is a great deal of anti-American and anti-capitalist ideology… but the author shows that the state, balkanized by competing ministerial and institutional fiefdoms, in fact works in the service of sectoral or private interests instead of putting France's technical capabilities and taxpayers' money at the service of the public. The record is damning: project after project, the lessons of failure are never learned and the same mistakes are repeated. As if, faced with Silicon Valley's dominance, it were enough to be indignant and to throw millions of euros out the window for Europe or France to find (or regain) their place in the world.

Beyond the various European digital library projects, Alexandre Moatti shows how the collaboration of scholars, the dissemination of knowledge, and the influence of high culture are blocked on the Internet. Three culprits work hand in hand: contemporary copyright law, inept public policies, and the rapacity of the big European scientific publishers (Elsevier, Springer). The common-sense arguments Moatti puts forward are not new. They largely take up the ideas of the international open data movement in general and of open science in particular. But the indictment is very well articulated. It also ties in with contemporary thinking on the necessary reinvention of scientific publishing (see, for example, the recent article by Marcello Vitali-Rosati).

As I closed the book, I could not help thinking that even if the French state were led by people aware of the crucial importance of the Internet in the service of knowledge, and eager to reform the administration's bad habits in this respect, their action would not necessarily be crowned with success. For it would take a deep change in mentalities, and teachers, journalists, and senior civil servants would have to be convinced. Society as a whole would have to realize that the great digital transformation is not only technical or industrial, but also, and above all, concerns knowledge and culture. We would have to recognize that the civilization of the future is yet to be invented, and that this is achieved not through fear and resentment, but through courage, imagination, and experimentation.