SuperGeekery: A blog probably of interest only to nerds by John F Morton.


Bespoken 5.3.0: Long-Form Audio, Credit Tracking, and Multisite Support

The Bespoken UI now reports how many credits you have in your ElevenLabs account

The narration of this post was created with the Bespoken plugin for Craft CMS.

When I shipped Bespoken 5.2.0 earlier this month, the focus was on refining pronunciation rules, making it easier to strip unwanted characters from narration scripts, and giving you visual feedback in the settings UI. That release laid important groundwork, but I knew the plugin still had a hard ceiling: if your entry text was too long, ElevenLabs would reject it outright. And if you were running a multisite install, things could break in subtle, frustrating ways. Sorry about that!

This next release tackles both of those problems head-on, along with a long list of improvements to the audio generation experience. Here’s what’s new.

Text Chunking and Audio Concatenation

This is the headline feature. Bespoken now automatically splits long text into chunks at paragraph and sentence boundaries before sending each piece to ElevenLabs. When all chunks are generated, they’re concatenated into a single MP3.

Previously, if your entry text exceeded the model’s character limit, the API call failed and told you that the text was too long. Now, Bespoken handles it transparently — you hit “Generate” and get back one seamless audio file, regardless of text length.

The chunking logic preserves paragraph boundaries through both the frontend and backend pipelines, so splits happen at natural points in your content rather than mid-sentence.
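To make the idea concrete, here’s a rough sketch of boundary-aware chunking. This is illustrative Python, not the plugin’s actual PHP, and the character limit is a made-up placeholder:

```python
import re

def chunk_text(text: str, limit: int = 2500) -> list[str]:
    """Split text into chunks under `limit` characters, preferring
    paragraph boundaries and falling back to sentence boundaries."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if len(current) + len(para) + 2 <= limit:
            # Paragraph fits in the current chunk.
            current = f"{current}\n\n{para}" if current else para
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(para) <= limit:
            current = para
        else:
            # A single paragraph exceeds the limit: split at sentence ends.
            for sentence in re.split(r"(?<=[.!?])\s+", para):
                if len(current) + len(sentence) + 1 <= limit:
                    current = f"{current} {sentence}" if current else sentence
                else:
                    if current:
                        chunks.append(current)
                    current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk goes to ElevenLabs as its own request, and the resulting MP3 segments are concatenated in order.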

Request Stitching for Natural Transitions

Chunking alone would produce audible seams between segments — shifts in pacing, tone, or inflection at chunk boundaries. To solve this, Bespoken uses ElevenLabs’ request stitching API. Each chunk is conditioned on the surrounding text, so the model maintains consistent prosody across the full narration.

Request stitching is automatically disabled for eleven_v3 (which doesn’t support it, at least not yet) and for single-chunk generations, where it’s unnecessary.
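Conceptually, each chunk’s request carries its neighbors as context. The sketch below uses the previous_text/next_text field names from ElevenLabs’ request-stitching documentation; how Bespoken assembles its payloads internally may differ:

```python
def build_payloads(chunks: list[str]) -> list[dict]:
    """Attach neighboring text to each chunk so the model can keep
    pacing and inflection consistent across chunk boundaries."""
    payloads = []
    for i, chunk in enumerate(chunks):
        payloads.append({
            "text": chunk,
            "previous_text": chunks[i - 1] if i > 0 else None,
            "next_text": chunks[i + 1] if i < len(chunks) - 1 else None,
        })
    return payloads
```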

ElevenLabs Credit Display and Cost Estimates

The Bespoken field now shows your remaining ElevenLabs credits, the reset date, and a usage bar — pulled from the ElevenLabs subscription API when the page loads. Before generating audio, you can see at a glance whether you have enough credits.

Alongside this, the plugin now calculates an estimated credit cost based on your text length and the selected voice’s model. Standard models (v3, multilingual) use a 1x multiplier, while turbo and flash models cost 0.5x. If the estimate exceeds your remaining credits, you’ll see a warning before you commit to the generation.
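The estimate is simple arithmetic: character count times a per-model multiplier. A sketch, using the multipliers from this post (the model IDs in the table are assumptions, not a definitive list):

```python
import math

# Assumed model IDs; 1x for standard models, 0.5x for turbo/flash.
MODEL_MULTIPLIERS = {
    "eleven_v3": 1.0,
    "eleven_multilingual_v2": 1.0,
    "eleven_turbo_v2_5": 0.5,
    "eleven_flash_v2_5": 0.5,
}

def estimate_credits(text: str, model_id: str) -> int:
    """Estimated credit cost: one credit per character, scaled by model."""
    multiplier = MODEL_MULTIPLIERS.get(model_id, 1.0)
    return math.ceil(len(text) * multiplier)

def check_budget(text: str, model_id: str, remaining: int) -> tuple[int, bool]:
    """Return (estimated cost, whether remaining credits cover it)."""
    cost = estimate_credits(text, model_id)
    return cost, cost <= remaining
```

So a 1,000-character script on a flash model estimates at roughly 500 credits, and the UI can warn you before you commit.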

Progress and Monitoring Improvements

With audio generation now potentially spanning multiple chunks and taking longer to complete, the progress UI needed to keep up.

Chunk Progress Reporting

The progress indicator now shows which chunk is being generated — for example, “Generating audio: chunk 3 of 7.” This gives you a clear sense of how far along a multi-chunk job is.

Complete Message History

All progress messages are now accumulated in the database. The frontend replays any unseen messages on each poll, so the full message history is always visible — even when chunks process faster than the 1-second polling interval. You can expand the progress component to see every step of the generation pipeline.
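The replay idea boils down to a high-water mark: the server accumulates every message, and the client remembers how many it has already rendered and pulls only the new tail on each poll. A minimal sketch with illustrative names:

```python
class ProgressFeed:
    """Accumulate progress messages; replay only unseen ones per poll."""

    def __init__(self):
        self.messages: list[str] = []  # server-side accumulated history
        self.seen = 0                  # client-side high-water mark

    def record(self, message: str) -> None:
        self.messages.append(message)

    def poll(self) -> list[str]:
        # Return everything recorded since the last poll, even if several
        # messages arrived within one polling interval.
        unseen = self.messages[self.seen:]
        self.seen = len(self.messages)
        return unseen
```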

Smarter Timeouts

The frontend job monitor now uses a stall-based timeout instead of a fixed poll count. It waits for 3 minutes of no progress change before reporting a timeout, so long multi-chunk jobs don’t falsely time out while they’re actively working. On the backend, the queue job’s time-to-reserve now scales with text length and chunk count, preventing the queue runner from killing jobs that legitimately need more time.
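The stall-based approach tracks the last time progress actually changed, rather than counting polls. A sketch of the idea (names and the injectable clock are illustrative, not the plugin’s code):

```python
import time

STALL_LIMIT = 3 * 60  # seconds with no progress change before timing out

class StallMonitor:
    """Time out only after a quiet period, not after a fixed poll count."""

    def __init__(self, now=time.monotonic):
        self.now = now
        self.last_progress = None
        self.last_change = self.now()

    def update(self, progress) -> bool:
        """Record the latest progress; return True if the job has stalled."""
        if progress != self.last_progress:
            self.last_progress = progress
            self.last_change = self.now()  # any change resets the clock
        return self.now() - self.last_change >= STALL_LIMIT
```

As long as the chunk counter keeps advancing, the clock keeps resetting, so a 7-chunk job can run well past any fixed cap without tripping the timeout.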

Stale Job Detection

If a job gets stuck in “running” status for more than 10 minutes (e.g., the queue worker was killed), it’s automatically marked as failed when you view the generation history. No more phantom “running” indicators that never resolve.

Multisite Support

This was a significant bug fix. On multisite Craft installs, generating audio on a non-primary site would fail with an “Element not found” error because the controller wasn’t resolving the correct site context. This release threads site context through the entire pipeline:

  • Action URLs now explicitly include the site handle parameter
  • The generation history modal filters by the current site
  • The queue job carries the originating site ID
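In other words, every hop in the pipeline carries enough information to resolve the entry in the right site. A sketch of the two ends of that thread (parameter names and helpers here are hypothetical, not Bespoken’s actual API):

```python
from urllib.parse import urlencode

def action_url(base: str, entry_id: int, site_handle: str) -> str:
    """Build an action URL that explicitly names the site handle."""
    query = urlencode({"entryId": entry_id, "site": site_handle})
    return f"{base}?{query}"

def queue_payload(entry_id: int, site_id: int, text: str) -> dict:
    """The queued job stores the originating site ID, so the worker
    resolves the element in that site instead of the primary one."""
    return {"entryId": entry_id, "siteId": site_id, "text": text}
```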

If you’re running Bespoken on a multisite install, this update should resolve the issues you’ve been hitting.

Upgrading

This update includes a database migration. Run craft migrate after updating to apply it. The migration is non-destructive — it adds columns to the existing bespoken_audiogenerations table.

If you’re using the queue for audio generation (and you should be), make sure your queue worker is running: craft queue/listen.

TLDR: Bespoken now handles long-form content, shows you what it’ll cost before you generate, works correctly on multisite installs, and gives you much better visibility into what’s happening during generation. Give it a try and let me know how it goes.