Microsoft planning voice cloning tool for Teams meetings
Microsoft has announced plans to introduce voice cloning for Teams, allowing users to simulate their voices in multiple languages during meetings.
Revealed at Microsoft Ignite 2024, the new feature, called Interpreter in Teams, promises real-time speech-to-speech interpretation in nine languages: English, French, German, Italian, Japanese, Korean, Portuguese, Mandarin Chinese, and Spanish. The feature will roll out to Microsoft 365 subscribers in early 2025.
Microsoft highlighted the tool's potential to enhance multilingual communication in business environments. It said that Interpreter does not store biometric data and does not add sentiment beyond what is naturally present in the speaker's voice. Users will have the option to disable the tool via Teams settings.
Balancing innovation and security concerns
Voice cloning technology is gaining traction across the tech industry. Meta, for example, recently piloted a translation tool for Instagram Reels, while companies like ElevenLabs provide multilingual speech generation platforms. However, the rise of such technology has been accompanied by significant security risks.
Deepfakes, synthetic media that closely mimic real voices or images, have become a growing concern and are often used to spread disinformation or execute scams. This year, scammers reportedly used voice cloning in a Teams meeting to deceive a company into wiring $25 million.
The Federal Trade Commission (FTC) in the US estimates that impersonation scams, often involving fake voices, caused over $1 billion in losses last year. The potential misuse of AI-powered tools like Interpreter raises similar concerns. Critics argue that cloned voices could be exploited to conduct fraudulent communications in different languages, further complicating efforts to combat disinformation.
Despite concerns, Interpreter in Teams represents a relatively narrow application of voice cloning, aimed at bridging language barriers in professional settings. Microsoft's assurances about data security and user control may alleviate some fears, though experts remain cautious about the broader implications.
AI translations, while cost-effective, often lack the nuance and cultural understanding of human interpreters. Colloquialisms and analogies can be lost in machine translations, potentially causing miscommunication. Nonetheless, the global market for natural language processing technologies is projected to reach $35.1 billion by 2026, reflecting growing demand for tools like Interpreter.