We already know what our voice assistant SaraEye - one of the subprojects of our artificial intelligence project SaraAI - will look like.
As we wrote earlier, the funding we received has let us speed up considerably.
The SaraEye project is about taking voice assistants to a higher level by adding sight and intelligence.
You can find more information on the project website SaraAI.com/SaraEye, and here I would like to present our journey from the model to the final look.
The idea of creating Sara was born a long time ago, when the Internet was in its infancy, speech recognition didn't work and there was no access to open knowledge bases. Fortunately, those limitations are behind us now, which allowed us to return to the project and start testing the assumptions we had worked out earlier. In one of our first published videos you can see our first prototype assistant, made from a regular IP camera, where we show some aspects of the assistant that we would like to develop further. This video, only a minute and a half long, is older and amateurish, but it shows some key ideas, such as establishing a kind of bond with the device and continuity of dialogue, which seem crucial to us and which we already described in another article, "We are looking for Artificial Intelligence, and we get.... a speaker."
After the initial tests, seeing the limitations of using standard IP cameras, we further developed our assistant by adding a more powerful processor, a set of 6 microphones and fast motors, so that the camera could keep up with fast movement. The next hybrid version of SaraEye was born:
At the same time, we also made our first video showing some of the functionality we want to include in the commercial version of SaraEye:
In late 2020, thanks to the funding we received for SaraEye and our collaboration with MindSailors Design Studio, we are finally working out the final shape and functionality of SaraEye. We will soon present it in action, and for now we can already reveal its design:
How do you like it?
Man has always looked at the stars and asked whether he was alone. In the most pessimistic variant of the Drake equation there are 250,000 highly developed civilizations somewhere in the infinite universe that are able to visit us.
But we also know that the chance to get in touch with such highly developed intelligence is close to zero, and maybe that's why we'd like to create our own artificial intelligence.
Artificial intelligence has been with us for a long time, actually from the beginning ... of cinema. Most often it is shown as an ominous robot or an animal-like creature. Why? Because we can't imagine something we have never seen and that does not resemble anything that already exists. This is a huge limitation of our brain, and it means that our evolution is not rapid but rather slow.
We live in a time when science and knowledge seem to be exploding. According to Moore's law, the number of transistors on a chip, and with it roughly the performance of computers, has doubled about every two years since the 1960s. By this law, in a few years the memory of your smartphone will be counted in terabytes, and a dozen or so years later in units so large that most of us have never had to use them.
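To make the doubling arithmetic concrete, here is a minimal sketch. The starting capacity of 256 GB and the fixed two-year doubling period are assumptions picked for illustration, not measurements, and the function names are hypothetical.

```python
import math


def capacity_after(start_gb: float, years: float, doubling_years: float = 2.0) -> float:
    """Capacity after `years`, assuming it doubles every `doubling_years`."""
    return start_gb * 2 ** (years / doubling_years)


def years_until(start_gb: float, target_gb: float, doubling_years: float = 2.0) -> float:
    """Years needed to grow from `start_gb` to `target_gb` under the same assumption."""
    return doubling_years * math.log2(target_gb / start_gb)


if __name__ == "__main__":
    start = 256.0  # hypothetical smartphone storage today, in GB
    # Two doublings (4 years) take 256 GB to 1024 GB, i.e. one terabyte.
    print(capacity_after(start, 4))
    # Years until the same phone would reach 1024 * 1024 GB.
    print(years_until(start, 1024.0 * 1024.0))
```

The point of the sketch is only that steady doubling compounds: every extra factor of 1024 costs ten doublings, whatever the starting value.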
According to all these laws, the performance of smartphones will soon be greater than that of our brains, so I ask: what's going on?
It is 2019. We have powerful computers with unbelievable memory and computing power, we have powerful IT companies with budgets in the billions, and what do we get in 2019?
A talking loudspeaker, an encyclopedia in a loudspeaker, a talking clock with absolutely zero intelligence.
While waiting for intelligence, it might be a good idea to hibernate for a few years.
Why did we get a speaker? Why are the products of Boston Dynamics, the maker of incredibly capable robots, in fact just ordinary remote-controlled devices?
There is one "product", and maybe you can guess which, that I think is worth a closer look.
The "product" does not have a very good speech synthesizer: it only produces some strange sounds, usually at bad moments. It leaks terribly, especially in the early stage of operation. It has practically no knowledge base: it will not tell you who the president of the United States is; in fact, it will not tell you anything.
It carries out only a few voice commands, and yet the dog, because that is what this is all about, is man's best friend.
Why does such a "simple" being stir such strong emotions in us? Why can we talk to it for hours, even though it never really answers?
Why are we so excited about the "loudspeaker with AI" at first, and why does our enthusiasm fade the longer we have it, while an ordinary dog grows on us the longer it is with us?
I will tell you: because of contact, invisible threads of understanding that are nonverbal but very strong. One of the most important elements of this understanding is eye contact (eyes, faces and head movements can often show more than words).
Can't we do that now? Is it really enough to put out a "speaker" and hope that people will love it?
Well, we are able to do it. In our Sara AI project we give Sara a personality, senses and an identity, but most importantly we give her intelligence. Only a little at the beginning, as much as a dog has, or maybe a child of a few years. Is that not enough? Isn't a dog's intelligence enough for us to spend hours with it? And remember that we give her the intelligence of a dog or a child, but with the knowledge of the entire world's databases.
Without intelligence, even minimal, no natural language processing system will ever be able to pass for even slightly intelligent, and it will always be just a talking speaker.
We give it the minimum: contact, a thread of understanding, surprise, unpredictability. Not three ready-made answers to previously programmed questions. Not that way.
You get simple human answers to simple questions. If you share your impressions on a given topic, you can expect some kind of interaction, not an encyclopedic answer.
You get eye contact, a non-verbal way of communicating, and you don't have to use a wake word at the beginning of each sentence. You talk to Sara as you would to a human, so you don't have to say "Hey, Sara", wait for her to activate and only then keep talking. To make this possible, Sara has eyes (cameras, of course), and she also shakes her head and thinks.
On June 10th 2019, in the WeWork space at the European Hotel, a meeting of the Program Council took place before the autumn edition of #AIBigData2019, in which we took part.
Half a year is a long time in the world of new technologies. The next edition of the congress therefore promises to be fascinating, and we have begun work on its shape.
The most important issues raised during the June meeting:
We invite you to watch the report from the meeting.
I think most of us have seen Metalhead (episode 5 of the fourth series of Black Mirror). In this black-and-white episode, dog-like machines take control of the Earth, terrorising people with their ruthlessness.
After watching this episode, probably every IT fan is immediately reminded of what Boston Dynamics creates. The similarity of the film's killers to the real robots built by this company is striking. Interestingly, Boston Dynamics is going to start selling these nice dogs in 2019, at first 100-1000 units per year.
Let's also look at the evolution taking place before our eyes in robotics, where Boston Dynamics is undoubtedly the leader.
Imagine that these "dogs" are sold in large numbers in the following years and suddenly, due to a software bug or a hacker attack, the apocalypse from the Black Mirror episode begins ...
Let me reassure everyone. NO - these are not yet "terminators".
Robots of this type will be sold with a remote control that we use to steer where they go. For now, they are just modern "remotes". They do have an autonomous mode, but it still only plots a simple path from point A to point B.
It's good that artificial intelligence is not yet available ... ("it was not available so far" - Sara AI is going to think to herself soon).
Browsing the internet, we see "artificial intelligence" practically everywhere, but do we really? Do we see artificial intelligence, or only two attractive marketing words?
We live in times when the label AI is attached to many products, from sharp knives to voice assistants. Is this artificial intelligence?
If we look at various pages describing what AI actually is, we find descriptions general enough that anything can fit. I have the impression that whenever a large company spends a few billion dollars on a new product, or even on an extra feature in a phone, it adds another clause to the definition of AI, so that the product can be promoted as "AI powered".
I also have the impression that most people nevertheless understand intuitively what real AI should be; Hollywood films probably have a huge influence here.
But why is it still impossible to talk with even a simple AI, despite such powerful computers and billions of dollars spent on research? Why do the best voice assistants become boring after a moment of use? Well, there are several "small" problems that have not been solved so far.
First of all, for programmers to write something they need to understand what to write, and unfortunately our knowledge of how the human brain works, in the sense needed for AI, is close to none. We know how neurons work and how they communicate, we know which parts of the brain are responsible for which activities, and from the psychological side we already understand our behavior quite well, but we cannot combine this knowledge into an understanding of the whole, let alone describe it and copy it.
The second "small problem" is that computers are really blind and deaf, and have no sense of touch, smell or taste. Imagine a child born without any senses: what chance does it have of becoming intelligent to any degree? This is obviously an extreme case, but it is enough for a child to be born blind. Blind children develop well, but they start to talk and understand much later. Hearing and touch can quickly sharpen and support the development of intelligence, but it takes much longer than in people with working vision, one of our most important senses for exploring the outside world.
Some of you are probably thinking now: but computers have cameras and microphones. They do, but ...
Even the best image recognition systems available to everyone, such as Google Vision, take a very long time to analyze an image, see little and make thousands of basic mistakes, while a child in every second of its life is watching what is virtually a 3D movie, recording dozens of frames per second for many hours a day!
Microphones - here the progress is greatest. A computer can capture the direction, loudness and frequency of a sound, but speech recognition systems still limp along, unable to pick up a voice well and recognize it in a room full of other sounds. Remember that even at 90% accuracy, on average every 10th word is lost or turned into another. Try to communicate well with someone while one word in 10 is replaced by a random one unrelated to the topic ...
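A small sketch shows how quickly a 90% per-word accuracy adds up. It assumes, purely for illustration, that each word is recognized independently of the others; the function names and the list of noise words are hypothetical.

```python
import random


def p_sentence_intact(word_accuracy: float, n_words: int) -> float:
    """Chance that every word of an n-word sentence is recognized correctly,
    assuming each word is an independent trial."""
    return word_accuracy ** n_words


def garble(sentence: str, word_accuracy: float, noise_words: list[str],
           rng: random.Random) -> str:
    """Keep each word with probability `word_accuracy`,
    otherwise replace it with a random unrelated word."""
    heard = []
    for word in sentence.split():
        if rng.random() < word_accuracy:
            heard.append(word)
        else:
            heard.append(rng.choice(noise_words))
    return " ".join(heard)


if __name__ == "__main__":
    # Under the independence assumption, a 10-word sentence survives
    # untouched only about a third of the time.
    print(round(p_sentence_intact(0.9, 10), 3))

    rng = random.Random(2019)  # fixed seed so repeated runs look the same
    noise = ["banana", "tuesday", "ladder", "purple"]  # made-up unrelated words
    print(garble("please turn off the light in the kitchen at nine tonight",
                 0.9, noise, rng))
```

Under that assumption the chance of a fully correct ten-word sentence is 0.9 to the power of 10, roughly 35%, which is why a seemingly high accuracy still makes conversation tiring.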
Now an explanation of what, in my opinion, artificial intelligence is and what it is not.
It would seem that Elon Musk's autonomous Tesla, which is able to take us home from work, is an example of the development of artificial intelligence. No. It is a brilliant invention and the future of motoring, but there is no more artificial intelligence in it than in any phone, i.e. none at all. These are simply extended algorithms that work by executing programmed conditions such as: when the light is red, stop the car. Of course it's a simplification, but that is essentially how it works. You do not really want the car to make decisions based on its own experience, learning from past events, because then we would not be able to predict its behavior. It is better to write a set of rules into it than to wonder why the car suddenly turned left because it came up with such a brilliant idea. After all, we learn from mistakes, and we do not let children drive cars precisely because mistakes behind the wheel could end tragically.
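The "programmed conditions" idea can be sketched as a fixed list of rules checked in order. This is a toy illustration of the simplification above, not a description of how any real car's software is built; the `Situation` fields, the thresholds and the action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    traffic_light: str          # "red", "yellow" or "green"
    obstacle_distance_m: float  # distance to the nearest obstacle ahead
    speed_kmh: float
    speed_limit_kmh: float


def decide(s: Situation) -> str:
    """Return an action by checking hand-written rules in a fixed order."""
    if s.obstacle_distance_m < 10.0:   # made-up safety threshold
        return "brake"
    if s.traffic_light == "red":       # "when the light is red, stop the car"
        return "stop"
    if s.speed_kmh > s.speed_limit_kmh:
        return "slow_down"
    return "continue"


if __name__ == "__main__":
    print(decide(Situation("red", 100.0, 30.0, 50.0)))    # stop
    print(decide(Situation("green", 100.0, 70.0, 50.0)))  # slow_down
    print(decide(Situation("green", 5.0, 30.0, 50.0)))    # brake
```

The same input always gives the same action, which is the predictability argued for above: nothing in `decide` changes with experience.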
Voice assistants - ask one a simple question about an action whose effect is known to every child, e.g. "can I get into the fire?"
Voice assistants have zero IQ. How they work, and why there is no AI in them, I will explain.
There are more advanced systems that can, for example, summarize a text they have read. It would seem that to summarize a text one must understand it, know what it is about and know the context, and only then can it be summarized. Nothing could be more wrong - it's just statistics and enormous knowledge bases.
So how does it all work today, and how does it fool us by pretending to be AI?
Our speech is understood by systems based on Deep Learning, which are the basis of NLP (Natural Language Processing) systems. This is not a scientific article, so I will only say briefly that there are many better or worse methods (POS tagging, Parsing, Named-Entity Recognition, Semantic Role Labeling, Sentiment Classification, Question Answering, Dialogue Systems, Contextualized Embeddings). Put very roughly, they analyze big knowledge bases, e.g. a database of Twitter dialogs, find the most common words and assign values to different expressions; the greater the value, e.g. the positive one, the more strongly the sentence is classified as positive in that sense. Other manipulations of words, sounds or characters are also used.
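The idea of giving expressions values and summing them can be sketched in a few lines. The word list and its scores below are invented for illustration; a real system would derive such values from a large corpus rather than from a hand-written table, and the function names are hypothetical.

```python
# Hypothetical hand-made lexicon: positive words get positive values,
# negative words get negative values.
LEXICON = {
    "love": 2.0, "great": 1.5, "good": 1.0,
    "boring": -1.0, "bad": -1.5, "hate": -2.0,
}


def sentiment_score(sentence: str) -> float:
    """Sum the values of all known words; unknown words count as 0."""
    words = (w.strip(".,!?") for w in sentence.lower().split())
    return sum(LEXICON.get(w, 0.0) for w in words)


def classify(sentence: str) -> str:
    """Label a sentence by the sign of its total score."""
    score = sentiment_score(sentence)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify("I love this great speaker!"))     # positive
    print(classify("What a boring, bad assistant."))  # negative
    print(classify("The clock is on the table"))      # neutral
```

Nothing here understands the sentence: the label comes only from adding up numbers attached to words.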
It's all one big statistical exercise that can really fool us a little.
The simplest way to understand how it works is predicting the completion of a sentence: "Hungry like ..." - "a wolf", "once upon ..." - "a time", etc. I know that I have simplified a great deal, but Google search is the same kind of statistics. Enter a word and look at the suggestions - it's not AI that suggests the words, it's just statistics.
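Sentence completion by pure counting can be sketched as follows. This is a minimal bigram model, meaning it looks only at the single previous word; the four-sentence corpus is made up, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def train(corpus: list[str]) -> dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    following: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def complete(model: dict[str, Counter], prompt: str, n_words: int = 2) -> str:
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(n_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a follower
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


if __name__ == "__main__":
    corpus = [  # tiny made-up "knowledge base"
        "hungry like a wolf",
        "hungry like a wolf",
        "once upon a time",
        "brave like a lion",
    ]
    model = train(corpus)
    print(complete(model, "Hungry like"))  # hungry like a wolf
    print(complete(model, "once upon"))    # once upon a wolf
```

The second result is the telling one: because "wolf" follows "a" more often than "time" in this toy corpus, the model happily produces "once upon a wolf". Longer contexts and bigger corpora hide such slips, but the mechanism stays the same counting.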
If you want to know more from the technical side about NLP, read this article.
When it comes to voice assistants it is even worse: I have the impression that there is a staff of people sitting there who write three different answers to each of the statistically most frequently asked questions.
This is the wrong way!
Identifying words in sentences, detecting context and predicting statistical answers is by no means AI.
Real AI should work on completely different principles, in which NLP is not the goal, but the means to the goal. The method of solving the problems described above to create a real AI will be described in the next article.