We’ve made several improvements to help the platform run faster, more smoothly, and with greater reliability.
Here’s what’s new:
Word Timing Accuracy
• Applied a fix for a known API issue – word timings now align perfectly with the spoken text.
Voice Speed Control (Beta)
• You can now adjust the speaking speed in the UI (currently available for select users).
Cartesia SDK Upgrade
• Upgraded to version 2.0, which helps reduce common errors.
Pronunciation Fixes
• Removed faulty text-stripping logic and improved handling of decimal points so that numbers are pronounced correctly.
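To illustrate the kind of decimal handling described above, here is a minimal sketch in Python. It is not the platform's actual logic: the function name speak_decimals and the choice to read fractional digits one by one are assumptions made for this example only.

```python
import re

# Hypothetical helper for illustration; not the platform's real normaliser.
_DIGIT_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]

def speak_decimals(text: str) -> str:
    """Rewrite decimals such as '3.14' as '3 point one four'.

    Only a dot sitting directly between digits is treated as a decimal
    point, so sentence-ending full stops are left untouched.
    """
    def _replace(match: re.Match) -> str:
        whole, fraction = match.group(1), match.group(2)
        spoken = " ".join(_DIGIT_WORDS[int(d)] for d in fraction)
        return f"{whole} point {spoken}"

    return re.sub(r"(\d+)\.(\d+)", _replace, text)
```

For example, speak_decimals("Pi is 3.14.") keeps the final full stop and rewrites only the number.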
Smarter Chunk Processing
• Chunks sent to our “Motion to Face” (M2F) service are now prioritised based on when they’re needed.
• This helps us scale GPU usage more effectively and ensures cleaner video streams under heavier load.
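As a rough sketch of what "prioritised based on when they're needed" can look like, the Python example below orders chunks by deadline using a heap. The ChunkQueue class, the needed_at field, and the chunk identifiers are hypothetical names for illustration; the real M2F scheduling may work differently.

```python
import heapq
import itertools
from dataclasses import dataclass, field

@dataclass(order=True)
class _Entry:
    needed_at: float                       # when the chunk must be ready (seconds)
    seq: int                               # tie-breaker: keeps insertion order
    chunk_id: str = field(compare=False)   # payload, excluded from ordering

class ChunkQueue:
    """Hypothetical deadline-ordered queue: earliest needed_at is served first."""

    def __init__(self) -> None:
        self._heap: list[_Entry] = []
        self._counter = itertools.count()

    def push(self, chunk_id: str, needed_at: float) -> None:
        heapq.heappush(self._heap, _Entry(needed_at, next(self._counter), chunk_id))

    def pop(self) -> str:
        """Return the chunk with the earliest deadline."""
        return heapq.heappop(self._heap).chunk_id

    def __len__(self) -> int:
        return len(self._heap)
```

Chunks with the same deadline come out in the order they were pushed, so a burst of equally urgent work is still processed fairly.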
Reduced Frame Drops
• Fixed an issue with frame pushing during idle times. This should reduce dropped frames and improve consistency in output.
Dropped Frames Tracking
• We now track dropped frames. This gives us better insight into real-time performance and helps ensure smooth delivery during high load.
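One simple way to track dropped frames is to watch for gaps in frame indices. The Python sketch below assumes each frame carries a monotonically increasing index; the DroppedFrameTracker class is a hypothetical illustration and not the platform's actual instrumentation.

```python
class DroppedFrameTracker:
    """Hypothetical counter that infers drops from gaps in frame indices."""

    def __init__(self) -> None:
        self.delivered = 0
        self.dropped = 0
        self._last_index: int | None = None

    def record(self, frame_index: int) -> None:
        """Register a delivered frame and count any indices skipped before it."""
        if self._last_index is not None and frame_index > self._last_index + 1:
            self.dropped += frame_index - self._last_index - 1
        self.delivered += 1
        # Late or duplicate frames do not move the high-water mark backwards.
        if self._last_index is None or frame_index > self._last_index:
            self._last_index = frame_index

    @property
    def drop_rate(self) -> float:
        """Fraction of expected frames that never arrived."""
        total = self.delivered + self.dropped
        return self.dropped / total if total else 0.0
```

Exporting drop_rate as a metric per stream would make it possible to spot degradation under load before it becomes visible in the video.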
Better Tracing
• Additional tracing and system refactoring improve how we detect, understand, and respond to performance issues, enabling more powerful monitoring and faster debugging.
Stability Fixes
• Resolved issues where personas stopped responding due to interrupt conflicts or bugs in upstream services.
• Fixed a race condition causing certain talk commands to fail intermittently.