
Courtney Wilson

Mar 20, 2017

Autonomous Vehicles Depend on Good Data: Here's How We Help

One of the coolest things for everyone working at CloudFactory is the amazing technology our customers are creating. Their spirit of innovation inspires each of us to push our own boundaries and solve new problems by thinking differently. We get to support our customers as they quite literally create the future, and that is a novelty that never wears off.

One area we’re particularly excited about is the development of AI for autonomous vehicles. We work with customers like Embark, which is creating the first ever autonomous driving trucks, and drive.ai, which is using deep learning to “build the brain of self-driving vehicles”. These are ambitious efforts that could lead to a safer, cleaner, and more productive future.

We help these customers by taking on the painstaking tasks involved in preparing the massive datasets that fuel their algorithms. Often that means processing thousands upon thousands of raw images and enriching them with labeled bounding boxes, providing scene annotation for semantic understanding, or providing 3D point cloud annotations. All of these applications hold the promise of making machine vision algorithms safer, which in turn brings autonomous vehicles one step closer to reality.

Recently, the New York Times featured an article showcasing how the technology works, who the players are, and what role they’re playing. We thought it would be interesting to examine how data scientists and engineers are creating an autonomous driving future.

The fact is, the future is here, as the article states…

“Autonomous cars have arrived. Uber has a fleet operating in Pittsburgh, Google’s parent company is closer to coming to market with its driverless project and the federal government has begun to issue guidelines on how the cars should work.” (From NYT Article)
How Car Drives Itself

95% of animals use vision to navigate their environment, and those pushing the boundaries of AI believe their technology should do the same. We help our customers prepare accurate datasets for their computer vision algorithms. For instance, take the camera on top of the car in the image above. Its raw images contain static objects like road signs and traffic lights, as well as moving objects like people. To train both their recognition and decision-making algorithms, we take our customers’ raw data and deliver it back with bounding boxes and labels that accurately categorize and identify those objects. These enriched images are then used to “teach” autonomous systems how to recognize the objects and how to decide on the appropriate response.

Labeled Bounding Boxes
The car’s sensors gather data on nearby objects, like their size and rate of speed. It categorizes the objects — as cyclists, pedestrians or other cars and objects — based on how they are likely to behave. (From NYT Article)
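To make the idea concrete, here is a minimal sketch of what a labeled bounding box might look like as data, along with one common way to compare two boxes drawn around the same object: intersection over union (IoU). The `BoundingBox` class, its field names, and the use of IoU as an agreement check are illustrative assumptions, not a description of our actual tooling or of any customer's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical label record: a class name plus pixel corner coordinates."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Clamp to zero so a malformed box never yields a negative area.
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: BoundingBox, b: BoundingBox) -> float:
    """Intersection over union of two boxes, from 0.0 (disjoint) to 1.0 (identical)."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (a.area() + b.area() - inter)


# Two annotators draw a box around the same pedestrian (made-up coordinates).
first = BoundingBox("pedestrian", 0, 0, 10, 10)
second = BoundingBox("pedestrian", 5, 0, 15, 10)
overlap = iou(first, second)
```

A score close to 1.0 suggests the two boxes agree. Any threshold for accepting a label would be a project-specific choice.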

For those using lidar, an active laser sensor system that illuminates the car’s surroundings and creates what are known as point clouds, we take that data and provide 3D annotations. The lidar data is annotated with accurate georeferenced coordinates that replicate the reality of a car’s surroundings. This enables our customers to build better learning systems that are safer and more reliable, and ultimately safer autonomous vehicles.

3D Point Cloud Annotation
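As a rough illustration of a 3D annotation, the sketch below represents each labeled object as a box (cuboid) in space and tags every lidar point that falls inside it. The `Cuboid` class and `label_points` helper are hypothetical names, and the boxes here are axis-aligned for simplicity; real annotations typically also carry a rotation, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in meters, an assumed convention


@dataclass(frozen=True)
class Cuboid:
    """Hypothetical axis-aligned 3D label: class name, center, and full extents."""
    label: str
    center: Point
    size: Point  # full length along x, y, and z

    def contains(self, p: Point) -> bool:
        # A point is inside if it is within half the extent on every axis.
        return all(abs(p[i] - self.center[i]) <= self.size[i] / 2 for i in range(3))


def label_points(points: Sequence[Point], cuboids: Sequence[Cuboid]) -> List[str]:
    """Return one label per point; points outside every cuboid stay 'unlabeled'."""
    labels = []
    for p in points:
        match = next((c.label for c in cuboids if c.contains(p)), "unlabeled")
        labels.append(match)
    return labels


# Made-up data: one annotated car and two lidar returns.
car = Cuboid("car", center=(0.0, 0.0, 0.5), size=(4.0, 2.0, 2.0))
cloud = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
point_labels = label_points(cloud, [car])
```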

Another way we help our customers is by providing enriched data for contextual situations, otherwise known as semantic understanding. We help take image understanding from low-level image features to high-level semantics by identifying objects and events, providing the situational understanding our customers need to create more advanced learning systems.

Semantic Understanding
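One common form this kind of annotation can take is a per-pixel class mask, where every pixel in an image is assigned a category. Whether a given project uses this exact form is an assumption on our part; the class IDs, names, and the `class_share` helper below are purely illustrative. The sketch summarizes how much of a scene each category covers.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical class IDs; a real project would define its own taxonomy.
CLASS_NAMES = {0: "road", 1: "sidewalk", 2: "vehicle", 3: "pedestrian"}


def class_share(mask: List[List[int]]) -> Dict[str, float]:
    """Fraction of pixels belonging to each class in a per-pixel label mask."""
    counts = Counter(class_id for row in mask for class_id in row)
    total = sum(counts.values())
    # An empty mask yields an empty dict, so no division by zero occurs.
    return {CLASS_NAMES[class_id]: n / total for class_id, n in counts.items()}


# A tiny 2x3 "image" where each number is the class of one pixel (made-up data).
tiny_mask = [
    [0, 0, 2],
    [0, 1, 3],
]
shares = class_share(tiny_mask)
```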

These are just a few of the ways we’re helping our customers define the future of autonomous vehicles. The ugly truth is that some data scientists spend close to 80% of their time on data preparation. This is an expensive proposition when you consider that their time would be far better spent solving complex problems than processing thousands of images. We’re here to free them up by offering a dependable and elastic way to prepare accurate datasets so they can focus on building incredible technology.
