The system converts the speech in a video recording into text, then transmits that text via sonar, since radio waves are rapidly absorbed by salt water. On the surface, the text is used to drive a ‘video synthesis’, in which it is converted back into video, with the speaker’s voice and lip movements simulated to approximate a traditional video call.
“The video then features a synthetic voice that is mapped to the voice of the person who is speaking, so that it sounds like the voice of that person,” explained research lead Professor Alex Waibel, an expert in speech translation at Karlsruhe Institute of Technology (KIT) and Carnegie Mellon University (CMU). “In addition, the video synthesis is controlled in such a way that the lips of the speaker move in sync with the sound.
“Transmitting data from a depth of four kilometres through salt water without any loss is extremely difficult.”
While video calls from four kilometres below the sea are unlikely to become commonplace, Waibel believes the system could be used to facilitate communications in extreme conditions where only low bandwidth is available.
What’s unclear at this point is the latency involved in such communications, though it is likely to be significant, making two-way conversation difficult. The system appears to be designed primarily with one-way video conferencing in mind, similar to other projects that Waibel has worked on in the past. Previous technologies the professor has developed include the “Lecture Translator”, in use at KIT, which automatically records the lecturer’s speech and translates it in real time into written English text, allowing students to follow the lecture on their laptop, smartphone, or tablet.