Bespoken 5.3.0: Long-Form Audio, Credit Tracking, and Multisite Support
The narration of this post was created with the Bespoken plugin for Craft CMS.
When I shipped Bespoken 5.2.0 earlier this month, the focus was on refining pronunciation rules, making it easier to strip unwanted characters from narration scripts, and giving you visual feedback in the settings UI. That release laid important groundwork, but I knew the plugin still had a hard ceiling: if your entry text was too long, ElevenLabs would reject it outright. And if you were running a multisite install, things could break in subtle, frustrating ways. Sorry about that!
This next release tackles both of those problems head-on, along with a long list of improvements to the audio generation experience. Here’s what’s new.
Text Chunking and Audio Concatenation
This is the headline feature. Bespoken now automatically splits long text into chunks at paragraph and sentence boundaries before sending each piece to ElevenLabs. When all chunks are generated, they’re concatenated into a single MP3.
Previously, if your entry text exceeded the model’s character limit, the API call failed and told you that the text was too long. Now, Bespoken handles it transparently — you hit “Generate” and get back one seamless audio file, regardless of text length.
The chunking logic preserves paragraph boundaries through both the frontend and backend pipelines, so splits happen at natural points in your content rather than mid-sentence.
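To make the idea concrete, here is a minimal Python sketch of boundary-aware chunking. It is not Bespoken’s actual implementation: the function name, the greedy packing strategy, and the max_chars parameter are all assumptions made for illustration.

```python
import re


def split_into_chunks(text: str, max_chars: int) -> list[str]:
    """Greedy chunking sketch (hypothetical, not the plugin's code).

    Paragraphs are kept whole when they fit within max_chars; a paragraph
    that is too long is broken up at sentence boundaries instead.
    """
    # Build a flat list of (separator, unit) pairs in reading order.
    units: list[tuple[str, str]] = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if len(paragraph) <= max_chars:
            units.append(("\n\n", paragraph))
        else:
            sentences = re.split(r"(?<=[.!?])\s+", paragraph)
            for index, sentence in enumerate(sentences):
                units.append(("\n\n" if index == 0 else " ", sentence))

    # Pack units into chunks without exceeding max_chars.
    chunks: list[str] = []
    current = ""
    for separator, unit in units:
        if not current:
            current = unit
        elif len(current) + len(separator) + len(unit) <= max_chars:
            current += separator + unit
        else:
            chunks.append(current)
            current = unit
    if current:
        chunks.append(current)
    # Note: a single sentence longer than max_chars is left as one
    # oversized chunk; this sketch never cuts mid-sentence.
    return chunks
```

With a small limit, each sentence or short paragraph ends up in its own chunk; with a generous limit, the whole text comes back as a single chunk with its paragraph break intact.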
Request Stitching for Natural Transitions
Chunking alone would produce audible seams between segments — shifts in pacing, tone, or inflection at chunk boundaries. To solve this, Bespoken uses ElevenLabs’ request stitching API. Each chunk is conditioned on the surrounding text, so the model maintains consistent prosody across the full narration.
Request stitching is automatically disabled for eleven_v3 (which doesn’t support it, at least not yet) and for single-chunk generations where it’s unnecessary.
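As a rough illustration, the sketch below builds one request body per chunk and attaches neighbouring text using the previous_text and next_text fields of the ElevenLabs text-to-speech request. Whether Bespoken conditions on text, on request IDs, or on both is not stated here, so treat the choice of fields and the helper name build_stitched_payloads as assumptions.

```python
def build_stitched_payloads(chunks: list[str], model_id: str) -> list[dict]:
    """Build one request body per chunk (hypothetical helper).

    Stitching context is skipped for eleven_v3 and for single-chunk jobs,
    mirroring the two exceptions described above.
    """
    use_stitching = len(chunks) > 1 and model_id != "eleven_v3"
    payloads: list[dict] = []
    for index, chunk in enumerate(chunks):
        payload = {"text": chunk, "model_id": model_id}
        if use_stitching:
            if index > 0:
                # Text that comes before this chunk in the narration.
                payload["previous_text"] = chunks[index - 1]
            if index < len(chunks) - 1:
                # Text that follows this chunk in the narration.
                payload["next_text"] = chunks[index + 1]
        payloads.append(payload)
    return payloads
```

The first chunk only gets next_text and the last only gets previous_text, since there is nothing on the other side to condition on.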
ElevenLabs Credit Display and Cost Estimates
The Bespoken field now shows your remaining ElevenLabs credits, the reset date, and a usage bar — pulled from the ElevenLabs subscription API when the page loads. Before generating audio, you can see at a glance whether you have enough credits.
Alongside this, the plugin now calculates an estimated credit cost based on your text length and the selected voice’s model. Standard models (v3, multilingual) use a 1x multiplier, while turbo and flash models cost 0.5x. If the estimate exceeds your remaining credits, you’ll see a warning before you commit to the generation.
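The arithmetic is simple enough to sketch in a few lines of Python. This assumes one credit per character at the 1x rate and detects turbo and flash models by matching on the model ID; both are assumptions for illustration, and the plugin’s real logic may differ.

```python
import math


def credit_multiplier(model_id: str) -> float:
    """Return 0.5 for turbo/flash models and 1.0 otherwise.

    Matching on the model ID string is an assumption of this sketch.
    """
    lowered = model_id.lower()
    if "turbo" in lowered or "flash" in lowered:
        return 0.5
    return 1.0


def estimate_credits(text: str, model_id: str) -> int:
    """Estimate the credit cost, assuming one credit per character at 1x."""
    return math.ceil(len(text) * credit_multiplier(model_id))


def exceeds_remaining(text: str, model_id: str, remaining_credits: int) -> bool:
    """True when the estimate is larger than the credits left."""
    return estimate_credits(text, model_id) > remaining_credits
```

For example, a 1,000-character script estimates to 1,000 credits on a 1x model and 500 on a flash model, so with 600 credits remaining only the first case would trigger the warning.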
Progress and Monitoring Improvements
With audio generation now potentially spanning multiple chunks and taking longer to complete, the progress UI needed to keep up.
Chunk Progress Reporting
The progress indicator now shows which chunk is being generated — for example, “Generating audio: chunk 3 of 7.” This gives you a clear sense of how far along a multi-chunk job is.
Complete Message History
All progress messages are now accumulated in the database. The frontend replays any unseen messages on each poll, so the full message history is always visible — even when chunks process faster than the 1‑second polling interval. You can expand the progress component to see every step of the generation pipeline.
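The replay behaviour can be pictured as an append-only log read with a cursor. The Python sketch below uses an in-memory list in place of the database, and the class names are hypothetical; it only shows why no message is lost when several arrive between two polls.

```python
class ProgressLog:
    """Append-only message store (stands in for the database table)."""

    def __init__(self) -> None:
        self._messages: list[str] = []

    def append(self, message: str) -> None:
        self._messages.append(message)

    def since(self, cursor: int) -> list[str]:
        """Return every message recorded at or after the given position."""
        return self._messages[cursor:]


class ProgressViewer:
    """Polling client that remembers how many messages it has shown."""

    def __init__(self, log: ProgressLog) -> None:
        self._log = log
        self._cursor = 0
        self.visible: list[str] = []

    def poll(self) -> list[str]:
        # Fetch everything not yet seen, then advance the cursor past it.
        unseen = self._log.since(self._cursor)
        self._cursor += len(unseen)
        self.visible.extend(unseen)
        return unseen
```

If three chunks finish between two polls, the next poll returns all three messages in order rather than only the latest one.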
Smarter Timeouts
The frontend job monitor now uses a stall-based timeout instead of a fixed poll count. It waits for 3 minutes of no progress change before reporting a timeout, so long multi-chunk jobs don’t falsely time out while they’re actively working. On the backend, the queue job’s time-to-reserve now scales with text length and chunk count, preventing the queue runner from killing jobs that legitimately need more time.
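Here is a minimal Python sketch of the stall-based idea on the frontend side, with the clock passed in explicitly so the behaviour is easy to follow. The class name and its interface are hypothetical; only the 3-minute threshold comes from the description above.

```python
class StallMonitor:
    """Report a timeout only after a period with no progress change."""

    def __init__(self, stall_seconds: float = 180.0, now: float = 0.0) -> None:
        self.stall_seconds = stall_seconds
        self._last_progress: str | None = None
        self._last_change = now

    def observe(self, progress: str, now: float) -> bool:
        """Record one poll result; return True if the job has stalled."""
        if progress != self._last_progress:
            # Any change in the reported progress resets the stall clock.
            self._last_progress = progress
            self._last_change = now
            return False
        return now - self._last_change >= self.stall_seconds
```

Unlike a fixed poll count, the total duration of the job never matters here: a job that keeps advancing from chunk to chunk can run indefinitely, while one that sits on the same message for 180 seconds is flagged.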
Stale Job Detection
If a job gets stuck in “running” status for more than 10 minutes (e.g., the queue worker was killed), it’s automatically marked as failed when you view the generation history. No more phantom “running” indicators that never resolve.
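The check itself amounts to a status and age comparison. In the Python sketch below, the Job record and its last_activity field are hypothetical, and measuring the 10 minutes from the job’s last recorded activity (rather than from its start time) is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STALE_AFTER = timedelta(minutes=10)


@dataclass
class Job:
    """Hypothetical generation record with a status and a timestamp."""

    status: str
    last_activity: datetime


def mark_stale_jobs(jobs: list[Job], now: datetime) -> int:
    """Mark long-idle 'running' jobs as failed; return how many changed."""
    changed = 0
    for job in jobs:
        if job.status == "running" and now - job.last_activity > STALE_AFTER:
            job.status = "failed"
            changed += 1
    return changed
```

Running this when the history is loaded, rather than on a schedule, means no extra background process is needed: stale jobs are cleaned up at the moment someone would otherwise see them.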
Multisite Support
This was a significant bug fix. On multisite Craft installs, generating audio on a non-primary site would fail with an “Element not found” error because the controller wasn’t resolving the correct site context. This release threads site context through the entire pipeline:
- Action URLs now explicitly include the site handle parameter
- The generation history modal filters by the current site
- The queue job carries the originating site ID
If you’re running Bespoken on a multisite install, this update should resolve the issues you’ve been hitting.
Upgrading
This update includes a database migration. Run craft migrate after updating to apply it. The migration is non-destructive: it only adds columns to the existing bespoken_audiogenerations table.
If you’re using the queue for audio generation (and you should be), make sure your queue worker is running: craft queue/listen.
TLDR: Bespoken now handles long-form content, shows you what it’ll cost before you generate, works correctly on multisite installs, and gives you much better visibility into what’s happening during generation. Give it a try and let me know how it goes.