When to use which closed captions solution for livestreams?

Clevercast supports several ways to add (multilingual) closed captions to your livestream. Which one is best for your event depends, among other things, on the type of livestream, its duration, and your accuracy requirements.

Of course, budget also plays a part: for manual transcription, you’ll need someone with professional experience and (usually) a shorthand keyboard.

For each of these approaches, you can use our services in one of two ways:

  • You can get a (monthly) plan to use Clevercast as a self-service SaaS solution and find any transcribers or correctors yourself (if needed).
  • You can contact us to find transcribers or correctors for you.


Livestreams with manual captioning

Manual transcription is the traditional way to add captions to your livestream. In Clevercast, manual transcribers can work remotely: they only need an internet connection and a browser. Two transcribers can take turns for the same language.

This makes it possible to add highly accurate closed captions to a livestream in real time. Clevercast also supports certain applications that facilitate transcribing with a stenography device (contact us for more info), which further increases accuracy.

If you need closed captions in multiple languages, you have two options:

  • Manual transcription for all languages
  • Manual transcription for one language. Clevercast can automatically translate the transcription into multiple languages. As this is text-to-text translation, the degree of accuracy is quite high (much higher than with speech-to-text conversion).


Livestreams with semi-automatic captioning

If your budget does not allow for manual transcription, speech-to-text conversion with real-time correction may be the perfect choice for you. This unique Clevercast solution automatically generates the closed captions through speech-to-text conversion, but still lets you make adjustments in an online editor before the captions are displayed in the livestream.

This method does not require professional transcribers or equipment, which are often the biggest cost. Anyone using a standard keyboard and mouse can make the adjustments.

Clevercast lets correctors avoid repetitive work as much as possible. They can pass adjustments (e.g. names, abbreviations, technical phrases) to the speech-to-text conversion engine, allowing it to learn during the livestream.

Extra closed captions can be added for any number of languages through automatic text-to-text translation. The translation is based on the closed captions after correction. So if the corrector does a good job, the translated closed captions will also be accurate.

Please note that, when using speech-to-text conversion, the livestream is delivered to your viewers with a two-minute delay (see the ‘Unique speech-to-text conversion method’ section below). Also note that Clevercast currently only supports this if a single language is spoken in the livestream. If multiple languages are spoken, you will need manual transcription.


Livestreams with automatic captioning

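With automatic captioning, the closed captions are generated entirely through speech-to-text conversion. Since you don’t have to hire anyone for manual transcription or correction, this is the cheapest option.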

Keep in mind that the accuracy of speech-to-text conversion depends on how well the speaker can be understood. For example, a strong dialect or poor articulation will result in lower accuracy. The quality of the audio feed matters as well: preferably, it should contain little background noise.

Although Clevercast is currently the most accurate speech-to-text solution on the market (see below for the reason why), we still recommend having someone available to make manual adjustments. This doesn’t have to be a professional or someone with prior experience. Even a limited number of corrections (e.g. technical terms, names, abbreviations) can make the closed captions much more accurate.

Extra closed captions in additional languages (through text-to-text translation) are available here as well. This is another reason to use a corrector: any errors that remain in the original captions will also appear in the translated captions.
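
The choice between the three approaches described above can be summarised as a small decision helper. The sketch below is only an illustration of the rules of thumb in this article, not a Clevercast tool or API: the `Event` class, its field names and the `recommend_captioning` function are hypothetical, and the rule that an available budget always leads to manual transcription is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical description of a livestream; all field names are illustrative."""
    spoken_languages: int          # number of languages spoken in the livestream
    delay_acceptable: bool         # may viewers receive the stream two minutes late?
    budget_for_transcriber: bool   # can you hire a professional transcriber?
    corrector_available: bool      # can someone correct captions during the stream?


def recommend_captioning(event: Event) -> str:
    """Return 'manual', 'semi-automatic' or 'automatic' for the given event."""
    # Speech-to-text is only supported when a single language is spoken,
    # and it always adds a two-minute delay for viewers.
    if event.spoken_languages > 1 or not event.delay_acceptable:
        return "manual"
    # Assumption: if the budget allows it, manual transcription is preferred.
    if event.budget_for_transcriber:
        return "manual"
    # Otherwise use speech-to-text, with real-time correction when possible.
    if event.corrector_available:
        return "semi-automatic"
    return "automatic"


webinar = Event(spoken_languages=1, delay_acceptable=True,
                budget_for_transcriber=False, corrector_available=True)
print(recommend_captioning(webinar))  # prints "semi-automatic"
```

In practice the decision also depends on the type and duration of the event and on your accuracy requirements, so treat the result as a starting point rather than a final answer.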


Unique speech-to-text conversion method

For (semi-)automatic captioning, Clevercast uses a unique near-real-time speech-to-text conversion method. Because of this, the accuracy of closed captioning in Clevercast is much higher than in any other speech-to-text solution on the market, making life much easier for correctors.

However, this means that your viewers see the livestream with a two-minute delay. This is necessary to guarantee the highest possible quality of the closed captions. Because of the delay, the AI engine has the full speech context at its disposal during conversion, allowing it to better interpret the words and put them into sentences. Without this delay, conversion would have to rely on single words or short phrases without context, leading to a much higher number of incorrect conversions.

In addition, Clevercast allows you to set words or phrases that (could) occur in the livestream (e.g. names, abbreviations, technical terms). Setting these words or phrases (called boosting) can be done both before and during the livestream. It increases the chance that they will be displayed correctly in the captions.

Finally, the delay also allows our player to show the captions intelligently, meaning that they will appear as sentences (as much as possible). This makes reading and understanding them a lot easier.
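
To illustrate the idea of showing captions as sentences, here is a minimal sketch that buffers incoming words and only releases them once a sentence is complete. It is not the Clevercast player's actual logic: the `sentences_from_words` function is hypothetical, and it assumes that the converted text arrives word by word with sentence punctuation attached.

```python
from typing import Iterable, Iterator, List

SENTENCE_END = (".", "!", "?")


def sentences_from_words(words: Iterable[str]) -> Iterator[str]:
    """Buffer incoming words and yield them one complete sentence at a time."""
    buffer: List[str] = []
    for word in words:
        buffer.append(word)
        # A word ending in sentence punctuation closes the current caption.
        if word.endswith(SENTENCE_END):
            yield " ".join(buffer)
            buffer = []
    # Flush whatever is left if the input ends mid-sentence.
    if buffer:
        yield " ".join(buffer)


stream = "Welcome to the event. Our first speaker is ready".split()
for caption in sentences_from_words(stream):
    print(caption)
```

Buffering like this is only possible because the stream is delayed: without a delay, each word would have to be shown as soon as it is converted.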