New AI Model Octave Enhances Text-to-Speech with Contextual Awareness
AI summary
Hume launches Octave, an AI text-to-speech model with contextual awareness
Octave adjusts speech based on context and user instructions, conveying emotions and nuances
Users can access Octave through various tiers, including a free option with limitations
Paid tiers offer more characters and customization features to enhance text-to-speech quality
360 summary
Octave can interpret character traits and style from a script alone, adjusting vocal inflections to match implied emotions like sarcasm, urgency, or secrecy without explicit direction.
Users can fine-tune the generated voice by typing natural-language instructions to Octave, such as "happier," "sadder," "more frustrated," "angrier," "more sarcastic," or "more sincere."
The model can instantly create a character-specific voice from a user's description, such as a sarcastic medieval peasant, and adjust emotions like anger, sadness, or happiness to fit that character.
venturebeat.com
Free tier allows for 10,000 characters of text-to-speech per month (~10 minutes) with unlimited custom voices.
Starter tier includes 30,000 characters (~30 minutes) and supports up to 20 projects for $3/month.
Creator Pro tier offers 500,000 characters (~500 minutes) for $50/month.
venturebeat.com
Scale tier offers 500,000 characters with lower pricing and support for up to 3,000 projects.
Business tier provides 10,000,000 characters at even lower pricing and support for up to 20,000 projects.
Enterprise tier offers unlimited usage, custom legal terms, and priority support with significantly discounted bulk pricing.
venturebeat.com
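To compare the tiers that are listed with both a price and a character allowance, the arithmetic can be sketched as below. This is only an illustration: the tier names and figures are copied from the summary above, the rate of roughly 1,000 characters per minute is an assumption inferred from figures such as "10,000 characters (~10 minutes)", and the helper names are hypothetical.

```python
# Figures as quoted in the summary above (monthly price in USD, characters per month).
TIERS = {
    "Free": (0, 10_000),
    "Starter": (3, 30_000),
    "Creator Pro": (50, 500_000),
}

# Assumption: about 1,000 characters per minute of audio, inferred from
# the "10,000 characters (~10 minutes)" style figures in the summary.
CHARS_PER_MINUTE = 1_000


def approx_minutes(characters: int) -> float:
    """Estimate minutes of audio for a character allowance."""
    return characters / CHARS_PER_MINUTE


def usd_per_1k_chars(usd_per_month: float, characters: int) -> float:
    """Monthly price divided by the allowance, per 1,000 characters."""
    return usd_per_month * 1_000 / characters


if __name__ == "__main__":
    for name, (price, chars) in TIERS.items():
        print(
            f"{name}: ~{approx_minutes(chars):.0f} min, "
            f"${usd_per_1k_chars(price, chars):.2f} per 1,000 characters"
        )
```

With the quoted figures, the Starter and Creator Pro tiers both work out to about $0.10 per 1,000 characters, so the difference between them lies in the size of the allowance rather than the unit price.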
Octave TTS can generate a unique voice for each character in long-form content and keep that voice, such as a middle-aged orc, consistent throughout the story.
Hume AI's "Projects" page automatically chunks text to preserve character consistency and context across chapters in audiobooks.
The company has technical guardrails in place to prevent the creation of realistic children's voices and imitations of specific individuals, but otherwise allows a wide range of content, including potentially not-safe-for-work scenes.
venturebeat.com
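The summary does not say how the "Projects" page chunks text. As a rough illustration of the general idea only, the sketch below greedily packs whole paragraphs into chunks under a character limit, so that no paragraph is split mid-scene. The function name `chunk_text` and the `max_chars` default are hypothetical, and this is not Hume AI's implementation.

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    Paragraphs are separated by blank lines. A single paragraph longer
    than max_chars is kept whole as its own oversized chunk rather than
    being split mid-sentence.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue  # skip empty paragraphs from extra blank lines
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    story = "Chapter one text.\n\nThe orc spoke slowly.\n\nChapter two text."
    for i, chunk in enumerate(chunk_text(story, max_chars=40), start=1):
        print(f"--- chunk {i} ({len(chunk)} chars) ---")
        print(chunk)
```

Splitting on paragraph boundaries is one simple way to keep a speaker's lines together; a real system would presumably also carry voice assignments and surrounding context from one chunk to the next.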
The above information is compiled by venturebeat.com and ZDNET and does not represent any position of Arbor. It does not constitute any investment advice made by Arbor. Before making any investment decisions, investors should consider the risk factors related to the investment products based on their own circumstances and seek advice from professional investment advisors if necessary. We strive to ensure but cannot guarantee the truthfulness, accuracy, and originality of the above content, and we make no promises or guarantees in this regard. As machine learning has a probabilistic nature, it may lead to incorrect reflection of facts in certain situations. You should appropriately evaluate the accuracy of any information summary based on your usage, including through manual evaluation of the information summary. We are not responsible for any losses or liabilities incurred by you due to your use, viewing, and access of the platform or failure to do so.