Google’s translation feature works pretty well and is a decent way of translating text and speech. However, what is typically lost in translation is the way the words were originally spoken. Since listeners infer a lot from a speaker’s tone and the speed at which they talk, this can sometimes lead to confusion.


However, Google thinks it might be able to solve these problems with a new translation model called Translatotron. It attempts to take the tone and cadence of the person talking and apply them to the translation as well, which will hopefully result in slightly more natural-sounding speech.

According to Google, “By incorporating a speaker encoder network, Translatotron is also able to retain the original speaker’s vocal characteristics in the translated speech, which makes the translated speech sound more natural and less jarring.” That being said, from the audio samples shared on Google’s blog, it is still quite obvious that a computer is speaking back to you.

This is in contrast to some of Google’s other AI efforts, such as Duplex, which has fooled many into thinking that they were talking to an actual human being. However, Translatotron is still very much in the works, so we imagine it will improve over time, and for now it does seem promising.

Source: ai.googleblog

