Amazon has made some heavy investments in artificial intelligence and has never shied away from saying that Alexa is still a work in progress. In Alexa FAQs, the company has specifically said, “The more data we use to train these systems, the better Alexa works, and training Alexa with voice recordings from a diverse range of customers helps ensure Alexa works well for everyone.”
However, there is something that Amazon also keeps quiet about. According to an investigative report by Bloomberg, the best and only way to make Alexa better is to have humans listen to voice recordings of the requests customers make. Amazon does mention this in its product and service terms, but very few people would bother to read those closely.
In the past, the company has downplayed the privacy concerns about having cameras and microphones in so many houses across the globe. And it has never really spoken about how it trains the AI to work effectively for its millions of users.
It is worth mentioning that this process is known as data annotation, and it has become the foundation of the machine-learning revolution, enabling AI to process language, recognise images and objects, and do much more.
However, AI algorithms only improve when the data they use is categorised. There are times when Alexa, or any other digital assistant, doesn't understand what you are asking for. This can happen for a number of reasons, including the use of a different language or regional slang. In such cases, humans help the AI understand what the user needs by listening to the recording and adding the correct data label. This is called supervised learning. Apple, Google and Facebook use supervised and semi-supervised learning in similar ways for their own digital assistants.
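To illustrate what supervised learning looks like in this context, here is a minimal, made-up sketch in Python: human annotators attach an intent label to each transcript, and a toy bag-of-words model learns from those labels. The data, function names and scoring scheme are illustrative assumptions, not Amazon's actual pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical labelled data: annotators listen to recordings and attach
# an intent label to each transcript (the "supervised" signal).
labeled = [
    ("play some jazz music", "play_music"),
    ("put on my workout playlist", "play_music"),
    ("what's the weather like today", "get_weather"),
    ("will it rain tomorrow", "get_weather"),
    ("set a timer for ten minutes", "set_timer"),
    ("start a five minute timer", "set_timer"),
]

def train(examples):
    """Count word frequencies per intent (a toy bag-of-words model)."""
    counts = defaultdict(Counter)
    for text, intent in examples:
        counts[intent].update(text.split())
    return counts

def predict(model, text):
    """Pick the intent whose training words overlap the utterance most."""
    words = set(text.split())
    return max(model, key=lambda intent: sum(model[intent][w] for w in words))

model = train(labeled)
print(predict(model, "play relaxing music"))  # -> play_music
print(predict(model, "set a timer please"))   # -> set_timer
```

Real assistants use far larger models, but the principle is the same: each correction a human annotator makes becomes one more labelled example for the next round of training.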
Coming back to Amazon, Bloomberg has shed light on the fact that thousands of employees at the technology giant are involved in parsing Alexa recordings in order to enhance the performance of the digital assistant. We should mention that the workforce involved in this task includes not only full-time workers of the company but some contractors too. While the practice cannot be termed shocking, considering that other technology companies do it as well, the report correctly points out that customers often don't realise this is happening. Furthermore, there is no word on how long these recordings are stored. So, there is always a risk of the information being stolen or misused by a rogue employee of the company.
In a statement, Amazon has said to Bloomberg, “We only annotate an extremely small sample of Alexa voice recordings in order [sic] improve the customer experience. For example, this information helps us train our speech recognition and natural language understanding systems, so Alexa can better understand your requests, and ensure the service works well for everyone.”
Amazon has also added that it has “strict technical and operational safeguards, and have a zero tolerance policy for the abuse of our system.” The company has also said that its employees do not have access to identify the people talking to Alexa, and that all personal information of users is “treated with high confidentiality,” and protected by “multi-factor authentication to restrict access, service encryption, and audits of our control environment.”
However, we should point out that there have been several instances in the past when Alexa has sent recordings to the wrong people. And by wrong people, we basically mean other users.
Currently, Amazon is looking for ways to move away from this kind of supervised learning because it requires a lot of transcribing and annotation. According to a report that surfaced in Wired last year, the company is using new techniques to cut down on error rates and to expand Alexa’s knowledge base.
In an article called ‘How Alexa Learns’ that was published in Scientific American earlier this month, Ruhi Sarikaya, Alexa’s director of applied science, said, “In recent AI research, supervised learning has predominated. But today, commercial AI systems generate far more customer interactions than we could begin to label by hand. The only way to continue the torrid rate of improvement that commercial AI has delivered so far is to reorient ourselves toward semi-supervised, weakly supervised, and unsupervised learning. Our systems need to learn how to improve themselves.”
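One common semi-supervised technique of the kind Sarikaya describes is self-training: a model trained on a small hand-labelled set assigns its own labels to unlabelled interactions it is confident about, and those pseudo-labels are folded back into training. The following is a toy sketch under that assumption; the seed data, confidence margin and threshold are all invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny hand-labelled seed set plus unlabelled utterances (all made up).
labeled = [
    ("play some jazz", "play_music"),
    ("what is the weather", "get_weather"),
]
unlabeled = [
    "play some rock",
    "play jazz for me",
    "what is the forecast",
    "weather for tomorrow",
]

def train(examples):
    """Count word frequencies per intent (toy bag-of-words model)."""
    counts = defaultdict(Counter)
    for text, intent in examples:
        counts[intent].update(text.split())
    return counts

def score(model, text):
    """Return (best_intent, margin); the margin is a crude confidence."""
    words = set(text.split())
    totals = {i: sum(model[i][w] for w in words) for i in model}
    best = max(totals, key=totals.get)
    others = [v for i, v in totals.items() if i != best]
    return best, totals[best] - max(others, default=0)

# Self-training loop: pseudo-label confident examples, then retrain on
# the seed set plus the pseudo-labelled data. Low-confidence utterances
# (margin below 2) are left out rather than risking a wrong label.
model = train(labeled)
pseudo = [(t, score(model, t)[0]) for t in unlabeled if score(model, t)[1] >= 2]
model = train(labeled + pseudo)
```

The appeal is that no human ever transcribes the pseudo-labelled utterances; the trade-off is that a confidently wrong label gets reinforced, which is why real systems gate pseudo-labels behind confidence thresholds like the margin above.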