On December 18, at the Force Conference organized by Volcano Engine, President Tan Dai announced new usage figures for the Doubao Universal Model. As of mid-December, daily token usage on the model had surpassed 4 trillion, a 33-fold increase since its launch seven months earlier. The surge reflects growing interest in and reliance on AI technologies across sectors.
In addition, recent rankings of global monthly active users showed that the Doubao App has reached almost 60 million users, placing it second globally, behind only OpenAI’s ChatGPT. The ranking underscores a rapidly expanding market for AI applications and strong interest from consumers and developers alike.
During the conference, ByteDance took the opportunity to unveil several new models, including the Doubao Visual Understanding Model, the Doubao 3D Generation Model, and upgraded versions of the Doubao Universal Model Pro, Music Model, and Text-to-Image Model.
Particularly notable was the pricing of the Doubao Visual Understanding Model: only 0.003 yuan per thousand tokens. For one yuan, users can process up to 284 images at 720P resolution. The audience was also told of plans to introduce an enhanced version of the Doubao Video Generation Model by spring 2025, aimed at generating longer videos. In addition, an end-to-end real-time voice model for Doubao is set to launch soon.
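As a quick sanity check on the pricing figures above, the following minimal sketch works out how many tokens one yuan buys and how many tokens each 720P image would then consume. It assumes the per-thousand-token price is 0.003 yuan; that constant and all function names are illustrative assumptions, not part of any Volcano Engine API, and the price should be adjusted if the quoted figure is read in a different unit.

```python
# Assumed inputs (illustrative; see the lead-in for the unit assumption).
ASSUMED_PRICE_YUAN_PER_1K_TOKENS = 0.003
IMAGES_PER_YUAN = 284  # figure quoted at the conference


def tokens_per_yuan(price_per_1k_tokens_yuan: float) -> float:
    """Number of tokens that one yuan buys at the given price."""
    return 1000.0 / price_per_1k_tokens_yuan


def implied_tokens_per_image(price_per_1k_tokens_yuan: float,
                             images_per_yuan: int) -> float:
    """Token cost per image implied by the price and the images-per-yuan claim."""
    return tokens_per_yuan(price_per_1k_tokens_yuan) / images_per_yuan


budget_tokens = tokens_per_yuan(ASSUMED_PRICE_YUAN_PER_1K_TOKENS)
per_image = implied_tokens_per_image(ASSUMED_PRICE_YUAN_PER_1K_TOKENS,
                                     IMAGES_PER_YUAN)
print(f"One yuan buys about {budget_tokens:,.0f} tokens")
print(f"Implied cost per 720P image: about {per_image:,.0f} tokens")
```

Under that assumption, one yuan covers roughly 333,000 tokens, or a little under 1,200 tokens per 720P image.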
In a notable collaboration, Volcano Engine and Runxin Technology are developing an AI voice toy that incorporates technology from several partners, including Wi-Fi modules from Hengxuan Technology and platform interfaces from Tuya Smart. Consumers can hold spoken conversations with the toy, named “Little Dog,” which can answer questions, provide companionship, and more.
Industry experts expressed optimism regarding the future sales of AI voice toys.
One insider noted that the first batch of these AI toys is expected to hit the market by the end of this year or early next year, and that an influx of competitors is likely in the first half of 2025. However, he cautioned that deploying AI toys comes with challenges. “For starters,” he argued, “AI toys must rely on high-quality knowledge bases tailored for various age groups to facilitate better human-machine interactions. Furthermore, given the frequent daily interactions users will likely have with these AI toys, the costs associated with cloud computing will represent a substantial expenditure, thus posing hurdles for widespread adoption.”
The conference also saw the launch of the AI + Hardware Smart Leap Plan, a collaboration between Volcano Engine Video Cloud, Lexin Technology, and ToyCity. By combining the Doubao large model, Volcano Engine’s dialog capabilities, ToyCity’s trendy product design, and Lexin’s AI chip technology, the plan aims to accelerate the spread of AI toys in the consumer market.
Lexin is set to deliver a comprehensive hardware solution for these toys, incorporating audio and video processing capabilities at the edge.
In the robotics sector, Horizon Robotics’ Diguo Robot is working with Volcano Engine’s edge cloud to develop intelligent robots built around a large model gateway. The collaboration aims to create intelligent perception and control systems for robots that take full advantage of edge computing. The gateway processes requests close to where they originate, enabling faster and more stable interactions with large model services and ultimately improving the performance of robotic devices.
Lexin is also collaborating with Doubao on applications for intelligent robots, which are currently used in scientific research, exhibition guidance, and industrial settings.
Likewise, Doubao has integrated its large model capabilities into a wide range of smart devices such as smartphones and PCs, reaching approximately 300 million devices. Within just six months, the volume of calls to the Doubao large model from these smart terminals increased 100-fold.
Alongside this expansion, Volcano Engine is exploring applications for large models on PCs in partnership with Intel. Notably, the Doubao model powers features such as “Magical Photo Editing” and AI summarization on Honor smartphones, while Vivo has adopted the Doubao Music Model to provide music creation features for users who want to produce personalized videos from their photo collections.
Tan Dai elaborated on the collaboration with Android smartphone manufacturers in China, indicating that most vendors integrate Doubao in certain scenarios while opting for other models elsewhere, or use a hybrid approach.
This multi-cloud or multi-model strategy is becoming increasingly common among enterprise users. The main consideration, he emphasized, ultimately comes down to seeking superior capabilities at a reduced cost, which makes the decision quite straightforward.
At the conference venue, numerous applications developed on ByteDance’s AI agent development platform, “Kouzi,” were on display, showcasing the breadth of innovation across various fields. One example is a partnership with Supor to explore AI-generated personalized recipes, improving the functionality of cooking machines.
Another collaboration, with Zhizhi Cloud, involves AI-driven aquarium management, where AI agents provide real-time optimization suggestions based on data from aquarium equipment. For instance, if the water quality drops below acceptable levels, the AI can automatically adjust the operation of pumps to improve living conditions for fish and plants.
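The pump-adjustment scenario described above can be pictured as a simple threshold rule. The sketch below is purely illustrative: the water quality index, the threshold, the step size, and every name in it are hypothetical, and it does not represent Zhizhi Cloud's or Kouzi's actual logic.

```python
from dataclasses import dataclass


@dataclass
class WaterReading:
    """A single sensor reading (hypothetical 0-100 water quality index)."""
    quality_index: int


def next_pump_speed(reading: WaterReading,
                    current_speed_pct: int,
                    min_quality: int = 60,
                    step_pct: int = 20,
                    max_speed_pct: int = 100) -> int:
    """Return the pump speed (percent) to apply after this reading.

    If quality is below the acceptable level, raise the pump speed by one
    step, capped at the maximum; otherwise keep the current speed.
    """
    if reading.quality_index < min_quality:
        return min(max_speed_pct, current_speed_pct + step_pct)
    return current_speed_pct


# Example: poor water quality raises the pump speed from 50% to 70%.
print(next_pump_speed(WaterReading(quality_index=45), current_speed_pct=50))
```

A real agent would presumably weigh several sensor signals and device limits; a single threshold is used here only to make the described behavior concrete.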
In a partnership with Cat King Audio, smart AI speakers are being developed to provide enhanced audio experiences.
Kouzi’s developer community has also seen strong engagement, with more than one million active developers who have produced upwards of two million smart agent applications.
In the automotive sphere, Doubao has formed collaborations on smart cockpit technology with prominent automakers such as Dongfeng Motor, Zhiji Auto, and Mercedes-Benz’s SMART division. Tan highlighted that over 80% of mainstream Chinese car brands are currently engaged in partnerships involving the Doubao models.
Looking ahead, ByteDance is targeting a spring 2025 launch for the upgraded Doubao Video Generation Model 1.5, which promises longer videos. Addressing the computing power challenges such advances could bring, Tan asserted, “Volcano Ark provides robust MaaS (Model as a Service) inference capabilities. We have ample reserves, which is why we can offer the industry’s most substantial TPM (tokens per minute) and RPM (requests per minute). If users encounter lag or obstacles, it’s not necessarily due to insufficient computing power. The experience depends on the performance of the application, the system architecture, and even the validation processes, all of which contribute to the seamlessness of the operational flow.”
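To make the TPM and RPM terms concrete, here is a minimal sketch of how a per-minute quota on both requests and tokens can be checked with a sliding 60-second window. It is a generic illustration under stated assumptions: the class name, the limits, and the windowing scheme are hypothetical and are not a description of how Volcano Ark enforces its quotas.

```python
from collections import deque


class MinuteQuota:
    """Sliding-window check for requests-per-minute and tokens-per-minute."""

    WINDOW_SECONDS = 60.0

    def __init__(self, rpm_limit: int, tpm_limit: int) -> None:
        self.rpm_limit = rpm_limit
        self.tpm_limit = tpm_limit
        self._events = deque()   # (timestamp, tokens) of admitted requests
        self._tokens_in_window = 0

    def try_acquire(self, now: float, tokens: int) -> bool:
        """Admit a request of `tokens` tokens at time `now` if both limits allow."""
        # Drop requests that have aged out of the 60-second window.
        while self._events and now - self._events[0][0] >= self.WINDOW_SECONDS:
            _, old_tokens = self._events.popleft()
            self._tokens_in_window -= old_tokens

        if len(self._events) + 1 > self.rpm_limit:
            return False  # would exceed requests per minute
        if self._tokens_in_window + tokens > self.tpm_limit:
            return False  # would exceed tokens per minute

        self._events.append((now, tokens))
        self._tokens_in_window += tokens
        return True


quota = MinuteQuota(rpm_limit=2, tpm_limit=1000)
print(quota.try_acquire(now=0.0, tokens=400))   # admitted
print(quota.try_acquire(now=1.0, tokens=700))   # rejected: token limit
```

A higher TPM or RPM ceiling lets more of these requests through per minute, which is separate from how quickly the application itself handles each response.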
Regarding the competitive landscape in the field of large models, Tan described the market as still in its infancy. “To be honest, I am not overly concerned about competition at this early stage; there is so much yet to explore. In fact, we are merely scratching the surface, with perhaps only a thousandth of the market truly developed. At this juncture, the focus should be on understanding unmet user needs rather than worrying too much about competitive dynamics.”