
Sound is colorless and formless, yet warm and powerful. Its unique charm lies in its variations of tone, pauses, whispers, and soft voices. In every conversation between humans and AI, sound adds emotional depth to the exchange.
Since November 2023, the MiniMax large speech model has served over 2,000 enterprise users, providing practical solutions for more than ten scenarios, including language learning, PC voice assistants, voice chat and singing, and hyper-personalized emotional voiceovers.
Next, let’s take a look at some of the customer cases.
Customer Case 1: Haivivi BubblePal × MiniMax · Smart Toys That Respond to Children's Every Curiosity
Demand Scenario: How can a brand-new smart toy hold natural, smooth conversations with children? Busy parents cannot be with their children at all times, and BubblePal uses AI technology to provide warm emotional companionship in those moments.
Solution: MiniMax's advanced voice synthesis and text model technologies tailor every interaction to child-friendly conversation. This keeps BubblePal ready to answer children's questions at any time, stimulating their curiosity and imagination.

Customer Case 2: Yuewen Qidian Audiobooks × MiniMax · Audiobooks Are Just That Engaging
Demand Scenario: As a leading online literature website in China, Qidian hosts a vast and diverse collection of influential literary works. To deepen readers' engagement with its intellectual properties (IPs), Qidian wants to use AI technology so that readers can enjoy stories through the charm of sound. The goal is a more immersive audiobook experience, with the site's large catalog of books available in audio format.
Solution: With its technical strengths in long-text and ultra-long-text speech generation, MiniMax can quickly grasp the overall context of a lengthy novel. This keeps the emotional tone consistent across the audiobook while interpreting each character's emotions accurately and delivering stylized performances. For instance, in a suspense novel, a deep, slightly raspy voice whispers the story in the dead of night, synchronizing the listener's heartbeat with the rhythm of the tale. In the gentle world of a romance novel, a soft, warm tone guides the listener to savor the beauty of emotion.

Customer Case 3: Guagua Audiobooks × MiniMax · The Hyper-Personalized Voice Style Used by Short Video Influencers
Demand Scenario: "Guagua Audiobooks" is committed to building an open platform for the intelligent production of long-form digital audio content. It aims to give users a rich selection of high-quality synthetic voices while simplifying audio production and significantly improving the efficiency of AI-assisted creation.
Solution: MiniMax provides "Guagua Audiobooks" with hyper-personalized voice synthesis technology that covers over 30 voice styles and supports ten languages, including Chinese, English, Spanish, Korean, and Japanese. With this technology, users can achieve highly realistic voice conversion and a natural listening experience, whether they are producing Chinese audiobooks, English podcasts, or multilingual content.

At the same time, we have received a lot of positive user feedback, particularly regarding the text-to-speech feature.
During this period, we have continuously iterated on the speech model API. Users can easily generate and synthesize different voice styles, and can also cover needs such as text-to-speech and voice cloning. We will continue to explore the possibilities of AI + voice, expanding the boundaries of speech model capabilities in the AIGC era and co-creating better content with AI.