Best Practices June 18, 2012

Quality: What to look for in closed caption solutions

Here at AST, Quality comes first. You might say it’s our creed. So as we welcome you to the first “Caption-IT” Blog post, we’d like to talk about Quality: what it means in terms of technical standards and approaches, and how AST holds the standard for accessibility high. That standard is complete and accurate captioning, and nothing less.

The key to quality captioning is to start with a quality transcript. It may seem like a trivial matter to simply transcribe what a speaker is saying … until you try it yourself. The average speaker speaks between 120 and 150 words per minute (and sometimes faster). Keeping up with this speed accurately is a challenge. An audio recording, of course, has no punctuation in it, so the transcriber must use knowledge and “common sense” to determine sentence and paragraph structure. Specialized terminology and proper names represent another hurdle to accuracy. Background noise, speaker dialect, mispronunciations, and speaker hesitations all make the task even more difficult. Getting an accurate transcript is a complex process, even for a trained and experienced transcriber.
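To give a sense of scale, here is a rough sketch of the arithmetic behind those speaking rates. The function name and the 60-minute recording length are illustrative assumptions, not figures from our production workflow.

```python
def words_to_transcribe(minutes: float, wpm_low: int = 120, wpm_high: int = 150) -> tuple[int, int]:
    """Estimate the word count of a recording from its length.

    The 120-150 words-per-minute range is the average speaking rate
    quoted above; the recording length is supplied by the caller.
    """
    return round(minutes * wpm_low), round(minutes * wpm_high)


# Hypothetical example: a 60-minute lecture.
low, high = words_to_transcribe(60)
print(f"A 60-minute recording contains roughly {low} to {high} words.")
```

Every one of those words has to be heard, spelled, and punctuated correctly, which is why even a small per-word error rate adds up quickly.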

With everyone’s attention focused on ways to reduce costs, it is very tempting to seek short-cuts in captioning.  We see a plethora of recent vendors in this field offering solutions based on speech recognition, “crowdsourcing” or student labor pools.  Use a critical eye as you examine these solutions. If it is difficult for a trained, experienced transcriber to produce an accurate transcript, then it is a daunting challenge for an untrained person (like a student, or the typical worker picking up tasks in a crowdsourcing arrangement) to successfully complete the task.

Automated Speech Recognition, of course, is the most tempting offer. Wouldn’t it be wonderful if this task could be completely automated?  There are some impressive demos from speech recognition systems and while they make the process seem flawless, these demos are not indicative of the actual performance you will get when captioning your own videos.  YouTube’s “AutoCaption” feature represents the typical quality you should expect from a speech recognition system.  Steer clear of their demo pieces and check out the performance on real videos. Here are a few I’ve run into:

  • http://www.youtube.com/watch?v=u-PtKgTXO_0
  • http://www.youtube.com/watch?v=t2yujfBeuFQ&lr=1
  • http://www.youtube.com/watch?v=hVNrkXM3TTI

That last one is one of my personal favorites. Try out AutoCaption on these. When you are listening to the audio and reading the text simultaneously in a language you know, your mind will tend to “fill in” the errors for you. This is one of the reasons why editing error-filled speech recognition output does not work well either. To get the real experience, try watching these videos with the audio off and the captions on; it will make the true error rate more apparent.

In today’s typical captioning task, there are rarely constraints on who the speaker is, what the topic is, or the acoustic conditions.  This results in high error rates from today’s speech recognition systems—often in excess of 20%. To put this in perspective, readers report that error rates above 3% significantly degrade the intelligibility of text, and by the time the error rate reaches 10% they report that they are unable to even discern the topic being discussed (see our Research).
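If you want to put a number on a sample of captions yourself, one common measure is the word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the caption text into a trusted reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that metric. It assumes simple whitespace tokenization and case-insensitive matching, and it is not necessarily the method behind the percentages quoted above. The function name and the sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from captions
            insertion = curr[j - 1] + 1  # extra word in captions
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sample: one wrong word ("box") and one dropped word ("the").
reference = "the quick brown fox jumps over the lazy dog today"
captions = "the quick brown box jumps over lazy dog today"
print(f"WER: {word_error_rate(reference, captions):.0%}")  # prints "WER: 20%"
```

Two errors in a ten-word sentence already amount to the 20% level discussed above, which gives a feel for how dense the mistakes are at that rate.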

The argument that “something is better than nothing” is often put forward for these low quality approaches and tools.  But this really is not a valid argument – error-filled captions are at best difficult for the viewer to follow, and at worst convey information that is simply wrong.  For public information and education content, particularly academic content, accuracy rates typical of speech-to-text tools would not meet legal accessibility guidelines, nor have they been acceptable enough to rely on for delivering academic education.

Content owners who do not want to see their message distorted by errors, risk civil rights lawsuits, or compromise academic integrity need a solution that ensures the highest quality.

At AST we are sensitive to the need to contain the costs for transcription and captioning. By making extensive use of our proprietary automation technologies, while avoiding the pitfalls of speech recognition, we are proud to offer one of the highest quality and lowest cost solutions on the market. We believe that approaches using speech recognition technology, crowdsourcing, and untrained transcribers are simply not adequate to provide a quality result to your viewers. Captioning with a 20% error rate may provide comic relief, but it offers nothing in the way of accessibility.

As video adoption continues to increase and you are inundated with offers to caption your content using speech recognition, edited speech recognition, crowdsourcing, or student labor – view them critically. At AST, we will continue our approach that combines superior technology and trained professional transcribers, as we keep our commitment to quality captioning.

Tags: closed captions, quality


About Kevin Erler, Ph.D.

Kevin brings over 20 years of experience in speech technology research, development and delivery as well as management of large-scale commercial speech technology projects to AST.
