“[Video] data annotation is super labor-intensive. Each hour of data collected takes almost 800 human hours to annotate. How are you going to scale that?” -Sameep Tandon, CEO of Drive.ai, an autonomous car startup in Silicon Valley and CloudFactory client

We are living in interesting times, where advancements in artificial intelligence (AI) are powering transformative technologies that are changing our everyday lives. These advancements, and specifically deep learning, have accelerated the development of computer vision applications for autonomous vehicles. Training these AI systems requires curating and preparing massive video datasets, which can take thousands of hours to annotate accurately.

Scaling video annotation is significantly more challenging than scaling image annotation. At 30 to 60 frames per second, just 10 minutes of video contains between 18,000 and 36,000 frames. Frame-by-frame video annotation is time-consuming and can be cost-prohibitive, becoming a significant roadblock for tech innovators trying to beat the competition to market.
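
The frame counts above follow from simple arithmetic. Here is a minimal sketch of that calculation, assuming a constant frame rate; the function name `frame_count` is illustrative and not taken from any annotation tool:

```python
SECONDS_PER_MINUTE = 60


def frame_count(minutes: float, fps: int) -> int:
    """Frames in a clip of the given length, assuming a constant frame rate."""
    return int(minutes * SECONDS_PER_MINUTE * fps)


# The two frame rates mentioned in the text.
for fps in (30, 60):
    print(f"10 minutes at {fps} fps: {frame_count(10, fps):,} frames")
```

Every one of those frames is a candidate for annotation, which is why frame-by-frame labeling grows so quickly with video length.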

A growing number of companies, from startups to enterprises, are pairing annotation tools with an augmented workforce to scale video annotation for high-quality training datasets. They don’t want the work of nurturing in-house teams or building a custom annotation tool to distract them from their core vision.

Crowdsourcing platforms can be a viable option that provides access to a scalable workforce and off-the-shelf annotation tools. However, crowdsourcing deploys anonymous workers, and its limited annotation tooling functionality can be a major pain point for vision-based technologies where ultra-precise data annotation is crucial for human safety.

There are a few managed workforce providers in the market with trained workers who have extensive experience doing annotation tasks and produce higher-quality training data. However, many of them require their clients to use proprietary annotation tools within their platform and restrict clients from using the annotation tool of their choice.

At CloudFactory, we recommend selecting the annotation tool that works best for your needs and maintaining it within your tech stack. There are many benefits that come with owning your video annotation tool:

  • Gain a competitive advantage by establishing your own unique process for annotating data within the tool of your choice - this is often where you can spot and leverage differentiators.
  • Mitigate unintended bias in machine learning models by configuring the annotation tool according to your needs. Using a crowdsourcing provider’s off-the-shelf tool could introduce their bias in data annotation tasks.
  • Make changes to software quickly and with agility, using your own developers. You don’t have to worry about hefty fees when the software scope changes.
  • Exert greater control over security for your system. By having the tool in your stack, you can apply the exact technical controls that meet your company’s unique security requirements.
  • Select the vendors of your choice to help achieve your objectives, instead of being locked in with one provider. When you own the tool, the workforce can plug into your task workflow more easily.

Top 4 Video Annotation Tools

Computer vision algorithms require annotated data that provides a deeper understanding of the actions and interactions of different objects (individuals and groups) in each video frame. This goes beyond simply identifying the name and location of an object, as is the case with image annotation. There are many video annotation tools on the market for producing ground truth for machine learning models. The right video annotation tool is user-friendly, minimizes human effort, and maximizes annotation quality.

Here’s a quick guide to the top four video annotation tools on the market:

Target: Individual, Individual, Individual, Individual
Annotation Type: States, States, States
Boundary Shapes: Ellipse, Rectangle, Rectangle
Interface:
  • Color visualization for states vs. behaviors for improved tracking of targets in crowded and dynamic scenes
  • Switch between annotating different targets at any time
  • Manual, semi-automatic, and automatic annotations via user interaction with various detection algorithms
  • Optimized for video annotation
  • Cloud interface makes it easy to view and add annotation tasks
Agility:
  • Customize to annotate target behaviors, particularly human actions
  • Automated tracking with interpolation for assisting manual annotation
  • Automatic quality assurance
  • Flexible and suitable to be used in different application domains
  • Easy to install and customize
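
To give a sense of how "automated tracking with interpolation" can reduce manual effort, here is a minimal sketch in which an annotator draws a box on two keyframes and the frames in between are filled in. It assumes simple linear interpolation; the guide does not say which method any of these tools uses, and the names `Box` and `interpolate_boxes` are hypothetical.

```python
from typing import Dict, NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    x: float
    y: float
    width: float
    height: float


def interpolate_boxes(start_frame: int, start_box: Box,
                      end_frame: int, end_box: Box) -> Dict[int, Box]:
    """Linearly interpolate a box for every frame between two keyframes, inclusive."""
    if end_frame <= start_frame:
        raise ValueError("end_frame must come after start_frame")
    span = end_frame - start_frame
    boxes = {}
    for frame in range(start_frame, end_frame + 1):
        t = (frame - start_frame) / span  # 0.0 at the first keyframe, 1.0 at the last
        boxes[frame] = Box(*(a + t * (b - a) for a, b in zip(start_box, end_box)))
    return boxes


# Two manually annotated keyframes; the three frames in between are generated.
boxes = interpolate_boxes(0, Box(100, 50, 40, 80), 4, Box(140, 50, 40, 80))
print(boxes[2])
```

In practice an annotator would still review the generated boxes and add a keyframe wherever the object changes speed or direction, since straight-line motion is only an approximation.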

What to Look For in Your Annotation Workforce

Once you’ve selected your annotation tool, consider your workforce requirements. Video annotation is a specialized skill that requires hands-on training and coaching to achieve maximum accuracy. For best results, your workforce should be screened for proficiency with annotation tasks and receive ongoing training to improve their skills.

Whether your workforce is annotating raw video or running quality control checks on annotated video, it helps when the workforce feels like an extension of your team. Look for a workforce provider that can facilitate easy communication with your workforce to incorporate feedback and improve quality, especially when accuracy is important. Ask if your workforce can help you optimize your tool over the long term by providing feedback to improve your efficiency and user experience.

CloudFactory’s teams annotate videos for innovative companies like Cruise Automation and Drive.ai. Our workforce is trained to annotate static and moving objects frame by frame, within five pixels. We draw ultra-precise bounding boxes around objects like vehicles, pedestrians, construction roadblocks, signs, and traffic lights for autonomous driving systems. We can do timeline labeling to tag events, such as a vehicle making a right turn. We can also categorize annotated video frames for consumption by computer vision algorithms. If you need a workforce to annotate video or check the quality of your annotation, contact us.
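
As an illustration of what a quality check against a five-pixel tolerance might look like, here is a minimal sketch. It assumes the tolerance applies to each edge of a bounding box compared with a reviewer's reference box; the text does not define how the five pixels are measured, and the function names are hypothetical.

```python
from typing import Tuple

# (left, top, right, bottom) in pixels; this layout is an assumption for the sketch.
EdgeBox = Tuple[float, float, float, float]


def max_edge_error(annotated: EdgeBox, reference: EdgeBox) -> float:
    """Largest absolute difference, in pixels, between corresponding box edges."""
    return max(abs(a - r) for a, r in zip(annotated, reference))


def within_tolerance(annotated: EdgeBox, reference: EdgeBox,
                     tolerance_px: float = 5) -> bool:
    """True if every edge of the annotated box is within tolerance of the reference."""
    return max_edge_error(annotated, reference) <= tolerance_px


reference = (100, 50, 180, 130)
print(within_tolerance((103, 48, 182, 134), reference))  # largest edge error is 4 px
print(within_tolerance((100, 50, 190, 130), reference))  # right edge is off by 10 px
```

A check like this could flag frames for human review rather than reject them automatically, since the reference box is itself a human judgment.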

