Revolutionizing Multilingual Video Content: Hakuhodo’s AI-Powered “KOTOBATON” Brings Authentic Voice Translation to Life

Producing engaging video content that crosses language barriers remains a challenge for businesses operating in global markets. Hakuhodo Products, based in Tokyo’s Koto Ward, has unveiled an AI-driven service called KOTOBATON that aims to change how companies produce multilingual videos. By combining several artificial intelligence technologies, KOTOBATON translates recorded video footage into more than 30 languages, including English, Chinese, and Thai, while preserving the original speaker’s voice. What sets the service apart is that it also reproduces the speaker’s lip movements, so the translated video looks as if the person is speaking the foreign language themselves. This opens up possibilities for international PR campaigns and training videos for foreign employees, where natural delivery makes communication across borders more effective.

Summary of the Original

Hakuhodo Products has launched an AI-based video translation service named KOTOBATON. This technology uses a combination of multiple AI tools to translate existing recorded videos into more than 30 languages. Unlike typical dubbing or subtitles, KOTOBATON maintains the speaker’s actual voice while generating synchronized lip movements that match the translated speech. This creates highly realistic videos where the original speaker seems to be directly communicating in the target language. The system supports a wide range of languages, including English, Chinese, and Thai, among others. Target markets include overseas PR content and training materials for foreign staff, where natural and convincing communication is crucial. By eliminating the need for reshooting or manual dubbing, KOTOBATON promises significant cost and time savings.

What Undercode Says:

The launch of KOTOBATON marks a notable step forward in AI-assisted video localization. Traditional approaches to multilingual video content often involve costly reshoots, awkward voice-overs, or distracting subtitles that can reduce viewer engagement. Hakuhodo’s solution tackles these issues by combining natural language translation, speech synthesis, and video generation to produce multilingual presentations that still feel personal and authentic.
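
As a rough illustration of how such a chain of components might fit together, the following Python sketch passes a clip through translation, voice synthesis, and lip-sync stages in order. It is not KOTOBATON’s actual implementation, whose internals the source does not describe. Every name here (`Clip`, `translate_text`, `synthesize_voice`, `sync_lips`, `localize`) is hypothetical, each stage is a placeholder that only records what a real model would do, and the sketch assumes the source speech has already been transcribed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Clip:
    """Hypothetical container for one video moving through the pipeline."""
    transcript: str   # text of the speech (assumed to be transcribed already)
    language: str     # language code of the current transcript
    speaker_id: str   # identifies whose voice the new audio should keep
    steps: List[str] = field(default_factory=list)  # record of applied stages


# A stage takes a clip and a target language code and returns the updated clip.
Stage = Callable[[Clip, str], Clip]


def translate_text(clip: Clip, target: str) -> Clip:
    # Placeholder: a real system would call a machine-translation model here.
    clip.transcript = f"[{target}] {clip.transcript}"
    clip.language = target
    clip.steps.append("translate")
    return clip


def synthesize_voice(clip: Clip, target: str) -> Clip:
    # Placeholder: speech synthesis conditioned on the original speaker's voice.
    clip.steps.append(f"synthesize:{clip.speaker_id}")
    return clip


def sync_lips(clip: Clip, target: str) -> Clip:
    # Placeholder: video generation that matches mouth movement to the new audio.
    clip.steps.append("lip_sync")
    return clip


# Assumed stage order: text first, then audio, then video.
PIPELINE: List[Stage] = [translate_text, synthesize_voice, sync_lips]


def localize(clip: Clip, target: str) -> Clip:
    """Run every stage in order for a single target language."""
    for stage in PIPELINE:
        clip = stage(clip, target)
    return clip


if __name__ == "__main__":
    result = localize(Clip("Welcome to the team", "ja", "speaker-1"), "en")
    print(result.language, result.steps)
```

Calling `localize` once per target language would produce one localized clip per market from a single recording. Replacing a placeholder with a real model would not change the surrounding control flow.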

From a business perspective, this technology addresses the growing demand for localized content in an era where companies must communicate across diverse markets instantly and at scale. The ability to maintain the speaker’s original voice tone and emotion—while synchronizing lip movement in a natural way—raises the bar for what is possible in global marketing and internal corporate communication.

Moreover, KOTOBATON offers immense utility beyond marketing. For international companies with multicultural workforces, the ability to provide training videos and internal communications in employees’ native languages without losing the original speaker’s presence could enhance comprehension and engagement dramatically.

However, challenges remain. Although AI has made huge strides, fully natural lip sync and emotional nuance in synthesized speech are still being refined. Early adopters should balance enthusiasm with realistic expectations about current AI limitations. In addition, altering real people’s voices and faces raises ethical questions that call for clear transparency and consent policies.

In sum, Hakuhodo’s KOTOBATON combines technological innovation with practical business use cases, heralding a new era in multilingual video content creation that is faster, more authentic, and more cost-effective.

Fact Checker Results ✅

Hakuhodo Products has officially launched an AI-powered multilingual video translation service named KOTOBATON.
The service supports over 30 languages, including English, Chinese, and Thai, and reproduces the speaker’s original voice.
The AI technology synchronizes lip movements to the translated audio, making it appear as if the speaker is naturally talking in the foreign language.

📊 Prediction: The Future of AI-Driven Multilingual Video Content

As AI continues to evolve rapidly, services like KOTOBATON will likely become standard tools for global businesses. We can expect deeper integration with real-time translation for live broadcasts, more refined emotional expression in synthesized voices, and broader language support. This technology could democratize content localization, enabling even small companies to reach international audiences with authentic, culturally relevant video content. However, transparency and ethical guidelines will be critical to maintaining user trust as AI-generated video becomes harder to distinguish from reality. Over the next three to five years, the line between original and AI-translated video content may blur entirely, fundamentally changing how brands and organizations communicate across the world.