When to use which closed captions solution?

Clevercast supports several ways to add (multilingual) closed captions to your live stream. Which approach suits your event best depends, among other things, on the type of live stream, its duration, and your accuracy requirements.

Of course, budget also plays a part: for manual real-time transcription, you’ll need someone with professional experience and (usually) a shorthand keyboard.


Live streams with perfect accuracy

If your budget allows for it, manual transcription is the best choice. Manual transcribers can work remotely: they only need access to the transcription room in Clevercast to add closed captions to the live stream in real time. If the live stream is long or complex, several transcribers can work together on the same stream.

This makes it possible to add fully accurate closed captions to a live stream in real time. Moreover, Clevercast supports certain applications that facilitate transcribing with a stenography device (contact us for more info), which further increases accuracy.

If you need closed captions in multiple languages, you have two options:

  • Manual transcription for all languages
  • Manual transcription for one language. Clevercast can automatically translate the transcription into multiple languages. As this is text-to-text translation, accuracy is quite high (much higher than with speech-to-text). However, some errors will remain (e.g. literal translations or product names that are not recognized as such).


Floor audio with multiple languages

Currently, Clevercast only supports speech-to-text conversion for live streams with a single floor language. If multiple languages are spoken in your live stream, you will need manual transcription for at least one of those languages. For closed captions in the other languages, you can choose between additional manual transcription or automatic text-to-text translation.


Lengthy or budget streams with good accuracy

If your budget does not allow for manual transcription, speech-to-text conversion with manual correction may be a good option. In this case, Clevercast automatically generates closed captions for the source language, but before the captions are shown, you can make adjustments in the editor.

This method does not require a professional transcriber (or several, since they usually work in pairs) or specialized equipment. Anyone with a standard keyboard and mouse can make the adjustments where necessary.

For longer streams, Clevercast offers an additional advantage that reduces repetitive work. The corrector can pass adjustments (e.g. names, abbreviations, technical phrases) to the speech-to-text conversion in real time, so that it gradually learns them. This in turn increases accuracy.

Closed captions in additional languages are generated through automatic text-to-text translation, using the corrected closed captions as the source. The quality of the source is therefore crucial: high-quality corrected captions make for accurate translations.


Speech-to-text without manual correction

Since you don’t have to hire anyone for manual transcription or correction, this is the cheapest option. To render the closed captions as accurately as possible, Clevercast opted for a unique near real-time speech-to-text conversion.

Viewers see the live stream with a two-minute delay. This is necessary to guarantee the quality of the closed captions. The delay allows the AI engine to take in the entire context at the time of conversion, so that it interprets the words correctly and places them in a sentence. Without this delay, conversion would often have to happen before the speaker has finished a sentence, leading to incorrect choices of words and phrases.
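The effect of the delay on caption timing can be illustrated with a small Python sketch. It is a toy model only, not Clevercast's implementation: it does not perform any speech recognition, the names Word, Caption and schedule_captions are hypothetical, and detecting sentence ends from terminal punctuation is an assumption made for the example. It only shows that, with a viewer delay, a sentence is already complete before the moment it has to be shown.

```python
from dataclasses import dataclass
from typing import List

VIEWER_DELAY = 120.0  # seconds; the two-minute delay described in the text


@dataclass
class Word:
    text: str
    spoken_at: float  # seconds since the start of the stream


@dataclass
class Caption:
    text: str
    complete_at: float  # when the last word of the sentence was spoken
    show_at: float      # when viewers see it on the delayed stream


def _to_caption(sentence: List[Word], delay: float) -> Caption:
    return Caption(
        text=" ".join(w.text for w in sentence),
        complete_at=sentence[-1].spoken_at,
        show_at=sentence[0].spoken_at + delay,
    )


def schedule_captions(words: List[Word], delay: float = VIEWER_DELAY) -> List[Caption]:
    """Group words into sentences and time each caption to the delayed stream."""
    captions: List[Caption] = []
    sentence: List[Word] = []
    for word in words:
        sentence.append(word)
        # Assumption for this sketch: terminal punctuation closes a sentence.
        if word.text.endswith((".", "?", "!")):
            captions.append(_to_caption(sentence, delay))
            sentence = []
    if sentence:  # trailing words without terminal punctuation
        captions.append(_to_caption(sentence, delay))
    return captions


words = [
    Word("Welcome", 0.0), Word("everyone.", 0.6),
    Word("Let's", 1.5), Word("begin.", 2.0),
]
for caption in schedule_captions(words):
    # Each sentence is complete long before viewers reach it.
    print(caption.text, caption.complete_at, caption.show_at)
```

In this model, every caption whose sentence is shorter than the delay is complete before its display time, which is what lets the whole sentence be used as context.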

In addition, Clevercast allows you to set words or phrases that (could) occur in the live stream (e.g. names, abbreviations, technical terms). Setting these phrases (called boosting) can be done both before and during the live stream. It increases the chance that they will be displayed correctly in the captions.

Finally, Clevercast’s near real-time streaming allows for showing captions intelligently, meaning captions usually appear as (parts of) sentences. This makes reading and understanding them a lot easier.

Among the solutions currently on the market, Clevercast is the most accurate for automatic speech-to-text conversion. However, accuracy still depends primarily on how well the speaker can be understood. For example, a strong dialect or poor articulation will result in lower accuracy. The quality of the audio feed matters as well: it should preferably be free of excessive background noise.

Closed captions for additional languages remain available as well, through automatic text-to-text translation.
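
The guidance in this article can be summarized as a small decision helper. The Python sketch below is only an illustration of the reasoning above, not part of Clevercast: the function recommend_method, its parameters and the CaptionMethod names are hypothetical, and real decisions will also weigh duration and accuracy requirements.

```python
from enum import Enum


class CaptionMethod(Enum):
    MANUAL = "manual transcription"
    STT_CORRECTED = "speech-to-text with manual correction"
    STT_ONLY = "speech-to-text without manual correction"


def recommend_method(
    floor_languages: int,
    transcriber_budget: bool,
    corrector_available: bool,
) -> CaptionMethod:
    """Suggest a captioning method for the source (floor) language."""
    if floor_languages < 1:
        raise ValueError("a live stream needs at least one floor language")
    # Speech-to-text is only supported for a single floor language, so
    # multilingual floor audio needs manual transcription for at least
    # one of the languages.
    if floor_languages > 1:
        return CaptionMethod.MANUAL
    # If the budget allows for it, manual transcription is the best choice.
    if transcriber_budget:
        return CaptionMethod.MANUAL
    # Otherwise, anyone with a keyboard and mouse can correct the
    # automatically generated captions before they are shown.
    if corrector_available:
        return CaptionMethod.STT_CORRECTED
    # Cheapest option: nobody is needed during the live stream.
    return CaptionMethod.STT_ONLY


print(recommend_method(1, transcriber_budget=False, corrector_available=True).value)
```

In every case, closed captions in additional languages can be added through automatic text-to-text translation of the source-language captions.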