SaraEye in action - no more wake words!
We already know what our voice assistant SaraEye - one of our artificial intelligence subprojects SaraAI - will look like.
As we wrote earlier, thanks to the received funding we have strongly accelerated.
SaraEye project is about upgrading voice assistants to a higher level by adding sight and intelligence.
You can find more information on the project website SaraAI.com/SaraEye, and here I would like to present our journey from the model to the final look.
The idea of creating Sara was born a long time ago, in the times when the Internet was in its infancy, speech recognition didn't work and there was no access to open knowledge bases. Fortunately, those limitations are behind us now, which allowed us to return to the project and start the first tests of the previously thought out assumptions. In one of our first published videos you can see our first prototype assistant made from a regular IP camera, where we show some aspects of the assistant that we would like to develop more. This only one and a half minute video, although older and amateurish, shows some key solutions, like establishing a kind of bond with the device or continuity of dialogue, which seems to us to be crucial and which we already described in another article "We are looking for Artificial Intelligence, and we get.... a speaker."
After the initial tests, seeing the limitations of using standard IP cameras, we further developed our assistant by adding a more powerful processor, a set of 6 microphones and fast motors, so that the camera could keep up with fast movement. The next hybrid version of SaraEye was born:
At the same time, we also made our first video showing some of the functionality we want to do in the already commercial version of SaraEye:
In late 2020, thanks to the funding we received for SaraEye and our collaboration with MindSailors Design Studio, we are finally creating the final shape and functionalities of SaraEye, which we will soon present in action, and at the moment we can already reveal its design:
How do you like it?
Man has always looked at the stars, asked if he was alone. In the most pessimistic option of the Drake equation there are 250,000 highly developed civilizations somewhere in the infinite universe that are able to visit us.
But we also know that the chance to get in touch with such highly developed intelligence is close to zero, and maybe that's why we'd like to create our own artificial intelligence.
Artificial intelligence has been with us for a long time, actually from the beginning ... of cinema. Most often it is shown as an ominous robot or animal-like creature. Why? Because we can't imagine something we've never seen, which is not like something that already exists. This is a huge limitation of our brain and it means that our evolution is not rapid but rather slow.
We live in a time of science and knowledge that seems to be exploding. According to Moore's law, the performance of computers has doubled every two years since the 1960s. Due to this right in a few years, the memory on your smartphone will be calculated in terabytes, and a dozen or so years later in something that does not even have a name because such large numbers have not even been used so far.
According to all these laws, soon the performance of smartphones will be greater than our brains, so I ask, what's going on?
We have 2019, powerful computers with unbelievable memories and computing power, we have powerful IT companies with billions of budgets and what do we get in 2019?
Talking loudspeaker, encyclopedia in loudspeaker, talking clock watch with completely zero intelligence.
Waiting for intelligence,it is a good idea to hibernate for a few years.
Why did we get a speaker? Why are the products of Boston Dynamics, the producer of incredibly efficient robots, in fact, ordinary remote controls?
There is one "product", maybe you can guess what, I think it's worth taking a closer look.
The "product" does not have a very good speech synthesizer, it only produces some strange voices usually in bad moments, it leaks terribly especially at the initial stage of operation, there is practically no knowledge base, you will not know from it who the president of the United States is, in fact you know nothing from it.
He performs only a few voice commands, but the dog, as it is all about him, is however the greatest friend of man.
Why such a "simple" being causes such great emotions in a person, why can we talk to it for hours, even though it does not really answer?
Why are we so excited about the "loudspeaker with AI", why, the longer we have it, the more our enthusiasm decreases, and why an ordinary dog, the longer it is with us, the more we like it?
I will tell you, because of contact, invisible threads of agreement, nonverbal, but very strong, and in this agreement one of the most important features is eye contact (eyes, faces, head movements can often show more than words).
Can't we do that now? Is it really enough to put a "speaker" and hope that people will love it?
Well, we are able to do it, in our Sara AI project, we give personality to Sara, we give senses, identity, but most importantly, we give intelligence, at the beginning a little, as much as a dog, maybe a child of several years, is that not enough? Isn't dog intelligence enough to spend hours with it? We also remember that we give intelligence of a dog or a child, but with the knowledge of the entire world database.
Without intelligence, albeit minimal, no natural language processing systems will ever be able to pretend to be the least intelligent and will always be just talking speakers.
We give it a minimum, contact, a thread of understanding, surprise, unpredictability. Not ready 3 answers to previously programmed questions. Not that way.
You get simple human answers to simple questions. If you share your impressions on a given topic, you can expect any interaction, not encyclopedic answers.
You get eye contact, a non-verbal way of communicating, you don't have to use the calling word at the beginning of each sentence. You talk to Sara as a human, so you don't have to say "Hey, Sara" to her, then wait for her to activate and keep talking. To achieve it, Sara has eyes (of course cameras), and also shakes her head and thinks.