Archives for posts with tag: artificial intelligence

Sign, symbol, language

Sign

Meaning involves at least three elements playing distinct roles. A sign (1) means something (2) for some being (3). The sign may be any entity or event. What makes it a “sign” is not an intrinsic property but the role it plays in meaning. The thing indicated by the sign is often called the object or the referent and, again, what makes it the referent of the sign is not an intrinsic property but rather its role in the triadic relation. As for the being, it is often called the subject or the interpreter. It can be a human being, a group, an animal, a machine or any entity or process endowed with self-reference (the self/environment distinction) and interpretation. The interpreter always takes the context into account when interpreting the sign. For example, I (the interpreter) smell some smoke (the sign) and I infer that it comes from a fire (a referent that is part of the context).

Pas une pipe

Communication and signs clearly exist at the level of all living organisms. Cells recognize concentrations of poison or food from afar, plants use their flowers to trick insects into taking part in their reproductive processes, and animal species practice complex semiotic games, including camouflage and mimicry. Animals – organisms with brains – constantly recognize, interpret and emit signs. Their cognition is already complex: it follows the sensorimotor cycle and involves categorization, feelings, and environment mapping. Animals learn by experience, solve problems and communicate, and the social species manifest collective intelligence. All these cognitive properties imply the emission and interpretation of signs. When a wolf growls, there is no need for a long discourse: a clear message is sent to its adversary.

Symbol

A symbol is a special kind of sign that is split in two: the signifier and the signified. The signified (virtual) is a general category or abstract class, and the signifier (actual) is a sensible phenomenon that represents the signified. The signifier may be, for example, a sound, a black mark on white paper or a gesture. The word “tree” is a symbol. It is made of a signifier sound and a signified category: the family of plants with root, trunk, branches, and leaves. The relation between the signifier and the signified is conventional and belongs to the symbolic system (here, the English language) of which the symbol is a part. By a conventional relation between signified and signifier, we mean that, in the majority of cases, there is no analogy or causal connection between them, for example between the sound “crocodile” and the crocodile species. Different languages use different signifiers to indicate the same signified. Moreover, languages carve up reality – or define their categories – in their own way, depending on the environment and on the social games of their speakers. In our example, it is the English language that decides what the signified of “tree” is. The signified is not left to the choice of the interpreter. What the interpreter does decide is the meaning of the word in the particular context of a speech act: is the referent of the word a syntactic tree, a palm tree, a Christmas tree…?

Language

By language, I mean a complete language, a general symbolic system that allows people to think reflexively, ask questions, tell stories, dialogue and engage in complex social interaction. English, French, Spanish, Arabic, Russian, Mandarin Chinese and Esperanto are languages. Every human being is biologically equipped to speak and recognize languages. The linguistic ability is natural, genetic, embedded in our brains and universal. In contrast, languages (like English, French, etc.) are social, conventional, cultural, multiple, evolving and hybridizing. They mix and change according to transformations of their demographic, technological, economic, social and political contexts.

Our natural linguistic ability multiplies the cognitive faculties that we share with other social animals. It empowers reflexive thought, lasting and precise memory, fast learning, long-term planning, large-scale complex coordination and cultural evolution. Animals cannot understand and use linguistic symbols to their full extent; only humans can. Even the best-trained gorilla will never claim that another gorilla’s story is false or exaggerated. It will neither ask you for an appointment on the first Tuesday of next month nor tell you where its grandfather was born.

In animal cognition, the categories that organize perception and action are enacted by neural networks. In human cognition, these categories become explicit thanks to symbols and move to the forefront of our awareness. Ideas become objects of reflection. With language come arithmetic, art, religion, politics, economics, and technology. Compared to other social species, human collective intelligence is more powerful and creative because it is supported and augmented by our linguistic ability. Therefore, if we work in data science, artificial intelligence or cognitive computing, it would be useful to understand – and model – not only the functioning of neurons and neurotransmitters, common to all animals, but also the structure and organization of language, unique to our species.

Natural languages contain the possibility of logical reasoning and arithmetic computing, but they cannot be reduced to these features. In this sense, programming languages like Python, JavaScript or C++ are too specialized to be considered complete languages. Their basic units are empty syntactic containers. No grandmother can tell a story in Python to her grandchildren, and there are no words in OWL to say “butter” or “crocodile”.

Grammar

A natural language is made of recursively nested units: a phoneme is an elementary sound, a word is a chain of phonemes, a sentence is a chain of words and a text is a chain of sentences. A language has a finite dictionary of words and syntactic rules for the construction of texts. From its dictionary and set of syntactic rules, a language offers its users the ability to generate – and understand! – an infinity of texts.
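The generative power of a finite dictionary combined with concatenation can be made concrete with a back-of-the-envelope count. A minimal Python sketch (the toy dictionary and the fixed chain length are my own illustrative assumptions, not from the text):

```python
# Toy illustration of recursive nesting: a finite dictionary plus a
# concatenation rule already generates an enormous number of word chains.

dictionary = ["the", "wolf", "growls", "sees", "gazelle"]  # invented mini-lexicon

def count_chains(vocab_size: int, length: int) -> int:
    """Number of distinct word chains of a given length."""
    return vocab_size ** length

# With only 5 words, the 10-word chains already number in the millions.
print(count_chains(len(dictionary), 10))  # 9765625
```

This crude count ignores syntactic rules; a real grammar rules out most of these chains, yet still generates unboundedly many well-formed texts, since there is no upper limit on their length.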

Phonemes

Humans cannot pronounce or recognize several phonemes simultaneously. They can only pronounce one sound at a time. So languages have to obey the constraint of sequentiality. Speech unfolds as a temporal chain of phonemes, with an acoustic punctuation reflecting its grammatical organization.

Phonemes are generally divided into consonants and vowels. Some languages (in East and Southern Africa) have “click” consonants, and others (like Chinese) have tones on their vowels. Despite the great diversity of sounds used to pronounce human languages, the number of conventional sounds for a given language is limited: the order of magnitude is between thirty and one hundred.

Words

Phonemes are meaningless sounds, with no signified attached to them. The first symbolic unit, a signifier related to a signified, is the word. By “word” I mean an atomic sense unit. For example, technically, the expression “smallest” contains two words: “small” (meaning tiny) and “est” (meaning the most).

How many words does a language contain? The biggest English dictionary counts 200,000 words, Latin has 50,000 words, Chinese has 30,000 characters, and biblical Hebrew amounts to 6,000 words. The French classical author Jean Racine was able to evoke the whole range of human passions with only 3,700 words across his 13 plays. Most linguists think that, whatever the language, a skillful and cultivated speaker masters on the order of 10,000 words.

All languages contain nouns depicting structures or entities and verbs describing actions, events, and processes. Depending on the language, there are other types of words, like adjectives, adverbs and prepositions, or sense units marking grammatical function, gender, number, person, tense, etc.

Note that a word cannot be true or false. As part of a language, its signifier points to a signified, an abstract category, and not to a state of things. Only a sentence that is spoken in context and purports to describe a reality – a sentence that has a referent – can be true or false.

Sentences

At the level of the sentence, we leave the abstract dictionary of a language and enter the concrete world of speech acts in context. First, let’s distinguish three sub-levels of complexity at the sentence level: the topic, the phrase, and the super-phrase. A topic is a super-word indicating a subject, a matter, an object or a process that cannot be described by one single word, e.g., “history of linguistics”, “smartphone” or “tourism in Canada”. Different languages have diverse rules for building topics, such as joining root words to case words or the straightforward agglutination of words. By relating several topics, a phrase brings to mind an event, an action or a fact, e.g., “I bought her a new smartphone for her twentieth birthday”. A phrase can be verbal, like the previous example, or nominal, like “the blue seat of my father’s car”. Finally, a super-phrase evokes a network of relations between facts or events, like a theory or a narrative. The relationships between phrases can be temporal (after), spatial (behind), causal (because) or logical (therefore), they can underline contrasts (but, despite…), and so on.

Texts

The highest linguistic unit, the text, results from a punctuated sequence of sentences. A text has a signified resulting from the syntactic rules applied to the signifieds of its words. It also has a referent in the mind of its speaker, a referent that is inferred by its listeners from the signified of the text and from the temporal, spatial and social contexts of its utterance. Even when the text is in fact produced by a computer program, the listener cannot help imagining a speaker’s intention to mean something and constructing the mental model of a referent.

Semantics

When we listen to a speech, we transform a chain of sounds into a semantic network and we infer from this network a new mental model of our situation. Conversely, we are able to transform a mental model into the corresponding semantic network and then this network into a train of phonemes. Semantics is the back-and-forth translation between chains of phonemes and semantic networks. The semantic networks themselves are multi-layered and can be broken down into three levels: paradigmatic, syntagmatic and textual.

Paradigmatic relations

In any language dictionary, words are generally arranged in paradigms. A paradigm is a set of mutually exclusive words that cover a particular functional or thematic zone. For example, languages may comprise paradigms to indicate the tense (past, present, future) and voice (active, passive) of verbs. Most languages include paradigms for economic actions (buy, sell, lend, repay) or colors (red, blue, yellow…). Thus, a speaker may replace a word from a paradigm with another word from the same paradigm and still make sense. In the sentence “I bought a car” you can replace bought with sold because buy and sell are part of the same paradigm. But you cannot replace bought with yellow. Two words from the same paradigm are both opposed (they don’t have the same meaning) and related (they are exchangeable).
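The substitution test described above can be sketched in a few lines of Python (the paradigm sets and the sentence are my own toy examples, not a real linguistic resource):

```python
# Toy paradigms: sets of mutually exclusive yet exchangeable words.
paradigms = [
    {"bought", "sold", "lent", "repaid"},  # economic actions
    {"red", "blue", "yellow"},             # colors
]

def same_paradigm(a: str, b: str) -> bool:
    """Two words are exchangeable if some paradigm contains both."""
    return any(a in p and b in p for p in paradigms)

def substitute(sentence: list, old: str, new: str) -> list:
    """Replace a word, but only by a member of the same paradigm."""
    if not same_paradigm(old, new):
        raise ValueError(f"{new!r} is not in the paradigm of {old!r}")
    return [new if w == old else w for w in sentence]

print(substitute(["I", "bought", "a", "car"], "bought", "sold"))
# ['I', 'sold', 'a', 'car'] -- still makes sense
# substitute(..., "bought", "yellow") would raise ValueError: wrong paradigm
```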

Words can also be related because they stand in a taxonomic relation, like horse and animal: the English dictionary indicates that a horse is a particular case of an animal. Words can also be composed of smaller words; for example, “metalanguage” comes from meta (beyond, second order) and language.

I will not write down here a complete list of all the relations that can be found between the words of a dictionary. The main point is that the words of a language are not isolated but inter-related by a dense network of semantic connections. In dictionaries, words are always defined and explained by way of other words. Let’s call “paradigmatic” – in a very general sense – the relations between the words of a language. When we hear a sentence using the word “sold”, we know, implicitly, that “sold” is a verb, that it is opposed to “bought”, that it is not “lent”, and that it is the past tense of “sell”.

Syntagmatic relations

At a particular moment in time and for a definite community of speakers, the relations between words in a language’s dictionary are constant. But in speech, the relations between words change according to their syntagmatic – or grammatical – roles. In the two sentences “The gazelle smells the presence of the lion” and “The lion smells the presence of the gazelle”, the words “gazelle” and “lion” do not play the same grammatical role, so they are not connected in the same syntagmatic network, and therefore the sentences have distinct meanings. Syntagmatic networks can generally be reduced to a grammatical tree of verbal and nominal phrases (search for “syntactic tree” on Google Images).
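A toy sketch of this point in Python (the role labels “agent” and “patient” are my own drastic simplification of a full syntactic tree):

```python
# Same vocabulary, different syntagmatic roles, different meanings.

def network(agent: str, action: str, patient: str) -> dict:
    """A minimal syntagmatic network: who does what to whom."""
    return {"agent": agent, "action": action, "patient": patient}

s1 = network("gazelle", "smells", "lion")  # the gazelle smells the lion
s2 = network("lion", "smells", "gazelle")  # the lion smells the gazelle

# The two sentences use exactly the same words...
assert sorted(s1.values()) == sorted(s2.values())
# ...but the networks, hence the meanings, are distinct.
assert s1 != s2
```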

Textual relations

At the grammatical level, a text is just a recognizable chain of sounds. But at the semantic level, texts are interconnected by relations like linguistic anaphoras and isotopies.

The anaphoric links of a text relate pronouns, conjunctions, etc., to words or sentences elsewhere in the text. For example, when we read a pronoun, we know which noun – mentioned in a previous or following sentence – it refers to.

On the other hand, isotopies are recurrences of themes that weave the unity of a text: the identity of heroes (characters), genres (love stories or historical novels), places, etc. These redundancies essentially concern words, paradigms, sentences and sentence structures. Iso-topia means “the same topic” in Greek. The notion of isotopy also encompasses all kinds of phonetic, prosodic, syntactic and narrative repetitions that help the listener understand the text. From a mere sequence of sentences, isotopies guide us to the construction of an intra-textual semantic network.

Ambiguities

What does it mean to understand a train of phonemes at the semantic level? It means that, from the sequence of sounds, we build a multi-layered semantic network: paradigmatic, syntagmatic and textual. When weaving the paradigmatic layer, we answer questions like: “What is this word? To what paradigms does it belong? Which of its senses should I consider?” Then we connect words by answering questions such as: “What are the syntagmatic relations between the words in this sentence?” Finally, we interlace texts by recognizing the anaphoras and isotopies that interconnect their sentences. Our understanding of a text is this three-layered network of sense units.

Ambiguities can happen at all three levels and multiply their effects. In case of homophony, the same sound can point to two different words, like “ate” and “eight”. Sometimes one word may convey several distinct meanings, like “mole (1)”, a burrowing animal, and “mole (2)”, a deep undercover spy. In case of synonymy, the same meaning can be represented by distinct words, like “tiny” and “small”. Amphibologies are syntagmatic ambiguities, as in “Mary saw the woman on the mountain with a telescope”. Who is on the mountain, Mary or the woman? Moreover, is it the woman or the mountain that has the telescope? Textual relations are even more ambiguous than paradigmatic and syntagmatic ones because the rules of anaphora and isotopy are loosely defined. Text understanding goes beyond grammar and vocabulary. It implies the building and comparison of complex and dynamic mental models. Human beings do not always resolve all the ambiguities of speech correctly and, when they do, it is often by taking into account the pragmatic (or extra-textual) context, which is generally implicit… and out of the reach of computers.

Computers cannot understand or translate texts with only the help of a dictionary and a grammar, because the dictionaries and grammars of natural languages like English or Arabic have local variants, are fuzzy and evolve constantly. Moreover, textual rules change with social contexts, language games, and literary genres. Finally, computers cannot engage in the pragmatic context of speech – as human beings do – to disambiguate texts. Natural language processing (a sub-discipline of artificial intelligence) compensates for the irregularity of natural languages with a great deal of statistical calculation and “deep learning” algorithms. Depending on its training set, an algorithm interprets a text by choosing the most probable semantic network. The results of these algorithms have to be validated and improved by human reviewers.
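As a caricature of this statistical approach, here is a toy word-sense disambiguator in Python. The co-occurrence counts are entirely invented, and real systems rely on large corpora and learned models rather than hand-written tables:

```python
# Toy statistical disambiguation: pick the most probable sense of
# "mole" from (invented) co-occurrence counts with context words.

training_counts = {
    "mole/animal": {"garden": 8, "dig": 6, "tunnel": 5},
    "mole/spy":    {"agency": 7, "secret": 9, "intelligence": 6},
}

def most_probable_sense(context: list) -> str:
    """Score each sense by its summed co-occurrence with the context."""
    def score(sense: str) -> int:
        counts = training_counts[sense]
        return sum(counts.get(word, 0) for word in context)
    return max(training_counts, key=score)

print(most_probable_sense(["the", "secret", "agency"]))     # mole/spy
print(most_probable_sense(["dig", "in", "the", "garden"]))  # mole/animal
```

The hard cases are exactly those noted above: when the deciding context is pragmatic and extra-textual, no table of textual co-occurrences can settle the question.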

Pragmatics

The word “pragmatics” comes from the ancient Greek pragma: “deed, act”. In their pragmatic sense, speeches are “acts” or performances. They do something. A speech may be descriptive and, in this case, it can be true or false. But a speech may also perform many other social functions: ordering, praying, judging, promising, etc. A speech act functions as a move in a game played by its speaker. So, distinct from the semantic meaning that we analyzed in the previous section, the pragmatic meaning of a text is related to the kind of social game played by the interlocutors. For example, is the text pronounced on a stage in a play or in a real courtroom? The pragmatic meaning is also related to the real effects of the utterance, effects that are unknown at the moment of pronunciation. For example: did I convince you? Have you kept your word? In the case of meaning as “real effect”, the sense of a speech can only be known after its utterance, and future events can always modify it. The pragmatic ambiguity of a speech act comes from ignorance of the time and place of the utterance, from ignorance of the precise referents of the speech, from uncertainty about the social game played by the speaker, from the ambivalence or concealment of the speaker’s intentions and, of course, from the impossibility of knowing in advance the effects of an utterance.

Pragmatics is all about the triadic relation between symbols (speeches or texts), interpreters (people or interlocutors) and referents (objects, reality, the extra-textual context). At the pragmatic level, any speech points to – and acts on – a referential context that is common to the interlocutors. The pragmatic context is used to disambiguate a text’s semantics and to actualize its deictic symbols (like here, you, me, that one there, or next Tuesday). Indeed, specialists of natural language processing often view the pragmatic context exclusively from the angle of disambiguation. But in the dynamics of communication, the pragmatic context is not only a tool for disambiguation but also – and more importantly – the common object at stake for the participants. The pragmatic context works like a shared and synchronized memory where interlocutors “write” and “read” their speeches – or other symbolic acts – in order to transform a real social situation.

 

I was happily surprised to be chosen as an “IBM influencer” and invited to the Innovation and Disruption Forum organized in Toronto on November 16 to celebrate the 100th anniversary of IBM in Canada. With a handful of other people, I had the privilege to meet Bryson Koehler, the CTO of IBM Cloud and Watson (Watson is the name of IBM’s artificial intelligence). That meeting was of great interest to me: I learned a lot about the current state of cloud computing and artificial intelligence.


Image: Demonstration of a robot at the IBM innovation and disruption forum in Toronto

Unlike other big tech companies, IBM already existed when I was born in 1956. The company was in the business of computing even before transistors. IBM adapted to electronics and dominated the industry in the era of big central mainframes. It survived the PC revolution when Microsoft and Apple were kings. It navigated the turbulent waters of the social Web despite the might of Google, Facebook, and Amazon. IBM is today one of the biggest players in cloud computing, artificial intelligence and business consulting.

The transitions and transformations in IBM’s history were not only technological but also cultural. In the seventies, when I was a young philosopher and new-technology enthusiast, IBM was the epitome of grey-suit, blue-tie, black-attaché-case corporate America. Now every IBM employee – from CEO Dino Trevisani to the sales force – wears jeans. IBM used to be the “anti-Apple” company, but now everybody has a Mac laptop. Instead of proprietary technology, IBM promotes open-source software. IBM posters advertise an all-inclusive and diverse “you” across the spectrum of gender, race, and age. Its official management and engineering philosophy is design thinking and, along with the innovative spirit, the greatest of IBM’s virtues is the ability to listen!

Toronto’s Forum was all about innovation and disruption. Innovation is mainly about entrepreneurship: self-confidence, audacity, tenacity, resilience and market orientation. Today’s innovation is “agile”: implement a little, test, listen to the clients, learn from your errors, re-implement, and so on. As for disruption, it is inevitable, not only because of the speed of digital transformation but also because of cultural shifts and the sheer succession of generations. So the argument is fairly simple: instead of being disrupted, be the disruptor! The overall atmosphere of the Forum was positive and inspirational and it was a pleasure to participate.

There were two kinds of general presentations: by IBM clients and by IBM strategists and leaders. In addition, many stands, product demonstrations and informative mini-talks enabled attendees to learn about current issues like e-health and hospital applications, robotics, data management, social marketing, blockchain and so on. One of the highlights of the day was the interview of Arlene Dickinson (a well-known Canadian TV personality, entrepreneur, and investor) by Dino Trevisani, the CEO of IBM Canada himself. Their conversation about innovation in Canada today was both instructive and entertaining.

From my point of view as a philosopher specialized in computing, Bryson Koehler (CTO of IBM Cloud and Watson) made a wonderful presentation, imbued with simplicity and clarity, yet full of interesting content. Before becoming an IBMer, Bryson worked for the Weather Channel, so he was familiar with handling exabytes of data! According to Bryson Koehler, the future lies not only in the cloud – that is to say, infrastructure and software as a service – but also in “cloud-native architecture”, where many loosely coupled mini-services can easily be assembled like Lego blocks and on top of which you can build agile and resilient applications. Bryson is convinced that all businesses are going to become “cloud natives” because they need the flexibility and security that the cloud provides. As an illustration, I learned that Watson is no longer a standalone, monolithic “artificial intelligence” but is now divided into several mini-services, each with its own API, offered as part of the IBM cloud alongside other services like blockchain, video storage and weather forecasts.

Image: Bryson Koehler at the IBM innovation and disruption Forum in Toronto

Bryson Koehler recognizes that the techniques of artificial intelligence – the famous deep learning algorithms in particular – are essentially the same among the big competitors in the cloud business (Amazon, Google, Microsoft and IBM). These algorithms are now taught in universities and implemented in open-source programs. So what makes the difference in AI today is not the technique but the quality and quantity of the datasets used to train the algorithms. Since every big player has access to the public data on the web and to the syndicated data (on markets, news, finance, etc.) sold by specialized companies, what makes a real difference is the *private data* that lies behind the firewalls of businesses. So what is the competitive advantage of IBM? Bryson Koehler sees it in the trust that the company inspires in its clients, and in their willingness to entrust their data to its cloud. IBM is “secure by design” and will never use a client’s dataset to train algorithms for that client’s competitors. Everything boils down to confidence.

At lunchtime, with a dozen other influencers, I had a conversation with Watson researchers. I was impressed by what I learned about cognitive computing, one of IBM’s leitmotifs. The idea is that value is created not by replicating the human mind in a computer but by amplifying human cognition in real-world situations. In other words, Big Blue (IBM’s nickname) does not entertain the myth of the singularity. It does not want to replace people with machines but to help its clients make better decisions in the workplace. There is a growing flow of data from which we can learn about ourselves and the world. Therefore we have no choice but to automate the selection of relevant information, the synthesis of its content and, as far as possible, the prediction of our environment. IBM’s philosophy is grounded in intellectual humility. In this process of cognitive augmentation, nothing is perfect or definitive: people make errors, machines do too, and there is always room to improve our models. Let’s not forget that only humans have goals, ask questions and can be satisfied. Machines are just here to help.

Once the forum was over, I walked along Lake Ontario and thought about the similarity between philosophy and computer engineering: don’t both build cognitive tools?

Image: Walking meditation in front of Lake Ontario after the IBM innovation and disruption Forum in Toronto