What—no LiDAR? Using 2-D Visual Data to Advance Autonomous Driving

In another post, we shared how light detection and ranging (LiDAR) helps autonomous vehicles respond safely to real-life driving scenarios.

How LiDAR works in autonomous driving
LiDAR is a remote-sensing technology that fires laser pulses and measures the time each pulse takes to travel from the sensor to an object or surface and back. The sensor records the returning waveform, and the peaks in that waveform become individual points. Together, these points form 3-D point clouds, which are digital representations of the environment the LiDAR is capturing. Each point cloud is meticulously labeled and used to train autonomous driving models.
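
The timing measurement described above can be sketched in a few lines. This is a simplified illustration, not any vendor's firmware: it assumes a single clean return per pulse, and the function names and the angle convention (azimuth and elevation in radians, measured from the sensor) are hypothetical choices made for this example.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in a vacuum


def range_from_round_trip(round_trip_seconds: float) -> float:
    """Return the distance in meters to the surface that reflected the pulse."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels out and back, so the one-way distance is half the path.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def point_from_return(round_trip_seconds: float,
                      azimuth_rad: float,
                      elevation_rad: float) -> tuple:
    """Convert one return (time plus beam direction) into an (x, y, z) point.

    Assumed convention: x points forward, y to the left, z up.
    """
    r = range_from_round_trip(round_trip_seconds)
    x = r * math.cos(elevation_rad) * math.cos(azimuth_rad)
    y = r * math.cos(elevation_rad) * math.sin(azimuth_rad)
    z = r * math.sin(elevation_rad)
    return (x, y, z)


# A pulse that returns after 200 nanoseconds hit something about 30 m away.
distance_m = range_from_round_trip(200e-9)
point = point_from_return(200e-9, azimuth_rad=0.0, elevation_rad=0.0)
```

Repeating this conversion for every return in a sweep is what produces a point cloud.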

Industry leaders like Waymo and Cruise believe LiDAR is excellent at detecting road objects and that LiDAR data is essential to scaling autonomous vehicles.

Learn more about how LiDAR data trains autonomous driving models by reading Understanding LiDAR Data in Autonomous Driving.

Move over LiDAR; 2-D data is here

But another type of data is advancing autonomous driving: 2-D visual data. Some organizations prefer cameras to 3-D sensors such as LiDAR because cameras cost less and avoid some of LiDAR's limitations.

In this post, read about three organizations that use different technologies to put 2-D visual data to work in training autonomous driving models:

Driver Technologies and its dashcam app.
Tesla and its neural rendering.
NC State's MonoCon technology.

Driver Technologies uses 2-D visual data to keep the world safe

Driver Technologies (DTI) believes that everyone, regardless of income or location, should have access to life-saving car technologies. The company’s mission is to improve access to road safety technology through MADAS, its mobile advanced driving assistance system available to anyone with a smartphone.

How does the dashcam app work?

The DTI dashcam app, called Driver™, captures critical driving safety moments in 2-D visual data. Expert labelers then annotate that data with bounding boxes to train the MADAS models, which turn real-world scenarios into a robust simulation environment. Back in the vehicle, the app gives drivers peace of mind by recording their drives and warning them about potential dangers through smartphone alerts.
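
To make the annotation step concrete, here is a minimal sketch of what a 2-D bounding-box label might look like, along with intersection over union (IoU), a common way to compare two boxes, such as labels drawn by two annotators on the same frame. The `BoundingBox` fields and the pixel-coordinate convention are assumptions for illustration, not DTI's actual format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical label: an object class plus pixel coordinates of the box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: BoundingBox, b: BoundingBox) -> float:
    """Intersection over union: 1.0 for identical boxes, 0.0 for no overlap."""
    overlap_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    overlap_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0
    intersection = overlap_w * overlap_h
    union = a.area() + b.area() - intersection
    return intersection / union


# Two annotators label the same car in one dashcam frame.
first = BoundingBox("car", 0, 0, 10, 10)
second = BoundingBox("car", 5, 0, 15, 10)
agreement = iou(first, second)  # the boxes share half of each box's area
```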

In the following interview, Ali Bakhshinejad and Rashid Galadanci from DTI share how 2-D image labeling served as a foundation for their software, where the company is now, and how they plan to use 2-D visual data in the future.

Tesla uses neural rendering for its data prep

Tesla's view on LiDAR is evolving. In 2019, Elon Musk called the technology a "fool's errand" and said, "anyone relying on LiDAR is doomed." He also dismissed LiDAR as "expensive sensors that are unnecessary." In 2021, however, a Tesla Model Y was spotted testing LiDAR sensors in Palm Beach, FL.

Rather than LiDAR, one method Tesla currently relies on is neural rendering, in which a neural network is trained to turn 2-D snapshots of a scene into a 3-D view of it. It’s proving to be an effective way to remove LiDAR from the autonomous driving equation.

Neural rendering has been in use since April 2020, when researchers at UC Berkeley, UC San Diego, and Google showed that a neural network could capture a photorealistic scene in 3-D simply by viewing many 2-D images of it.

How does neural rendering work?

The algorithm models how light travels through a scene and computes the density and color of points in 3-D space. This process makes it possible to convert 2-D images into 3-D representations you can view from any angle.
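
One way to picture the density-and-color idea is the compositing step used by volume-rendering methods in the style of the 2020 research mentioned above. The sketch below is an illustration under assumptions, not Tesla's implementation: it assumes the densities and colors along one camera ray are already known (in a real system a trained neural network predicts them), and the function name is hypothetical.

```python
import math


def composite_ray(densities, colors, step_sizes):
    """Blend samples along one camera ray into a single RGB pixel color.

    densities:  how strongly each sample point blocks light (>= 0)
    colors:     an (r, g, b) tuple for each sample point
    step_sizes: distance covered by each sample along the ray
    Samples are ordered from nearest to farthest from the camera.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that has not been blocked so far
    for sigma, rgb, delta in zip(densities, colors, step_sizes):
        # Opacity of this sample: denser or longer segments block more light.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for channel in range(3):
            pixel[channel] += weight * rgb[channel]
        # Light reaching the next sample is reduced by what this one blocked.
        transmittance *= 1.0 - alpha
    return tuple(pixel)


# Empty space (density 0) in front of a solid green surface: the pixel is green.
pixel = composite_ray(
    densities=[0.0, 50.0],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    step_sizes=[1.0, 1.0],
)
```

Because the same 3-D densities and colors can be composited along rays from any camera position, the scene can be rendered from viewpoints that were never photographed.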

Creating a detailed, realistic 3-D scene typically requires hours of painstaking, manual work, but neural rendering makes it possible to generate these scenes from ordinary images in minutes.

MonoCon adds 3-D bounding boxes to 2-D images

Researchers at North Carolina State University are developing a non-LiDAR technology called MonoCon that improves the ability of AI programs to identify 3-D objects using 2-D images.

How does the technique work?

Researchers train their autonomous vehicle model using thousands of 2-D images with 3-D bounding boxes placed around objects in those images. Each box is a cuboid defined by eight corner points, which gives the model a 3-D representation of an object that appears in a 2-D image.

Through this training, the model learns to estimate the dimensions of each bounding box and to predict the distance between the camera and the object.
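
The geometry behind an eight-point cuboid label can be sketched as follows. This is not MonoCon's code, which is a trained neural network; it only shows how a box's center, dimensions, and heading relate to its eight corners, to their positions in the image, and to the camera distance. The simple pinhole camera model, the axis convention (x right, y down, z forward), and all function and parameter names are assumptions made for this example.

```python
import math
from itertools import product


def cuboid_corners(center, dims, yaw):
    """Return the eight (x, y, z) corners of a box in camera coordinates.

    center: (x, y, z) of the box center, in meters
    dims:   (length, height, width) of the box, in meters
    yaw:    rotation about the vertical (y) axis, in radians
    """
    cx, cy, cz = center
    length, height, width = dims
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):
        # Offset of this corner from the center, in the box's own frame.
        dx, dy, dz = sx * length, sy * height, sz * width
        # Rotate the offset about the vertical axis, then shift to the center.
        rx = cos_y * dx + sin_y * dz
        rz = -sin_y * dx + cos_y * dz
        corners.append((cx + rx, cy + dy, cz + rz))
    return corners


def project_to_image(point, fx, fy, u0, v0):
    """Project a 3-D camera-frame point to pixel coordinates (pinhole model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (fx * x / z + u0, fy * y / z + v0)


def distance_to_camera(center):
    """Straight-line distance from the camera origin to the box center."""
    return math.sqrt(sum(c * c for c in center))


# A car-sized box 10 m straight ahead, drawn onto a 1280x720 image.
corners_3d = cuboid_corners(center=(0.0, 0.0, 10.0), dims=(4.0, 1.5, 2.0), yaw=0.0)
corners_2d = [project_to_image(p, fx=1000.0, fy=1000.0, u0=640.0, v0=360.0)
              for p in corners_3d]
```

A monocular model has to work in the opposite direction, inferring the box and its distance from pixels alone, which is why it needs many labeled examples.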

While MonoCon is performing well at extracting 3-D objects from 2-D images to train autonomous vehicles, there's still work ahead. Researchers are focusing on increasing the size of datasets to improve training data quality.

How to make 2-D autonomous driving data valuable

For 2-D visual data to be useful as autonomous technology evolves, it must be accurately labeled, a big job that can be difficult to scale. The challenge for AI developers lies in transforming massive volumes of raw data into structured data that can train machine learning models.

At CloudFactory, we believe that annotating 2-D visual or LiDAR data for autonomous vehicles takes more than technical skill. It also takes expertise in the autonomous driving industry and its unique data requirements.

If your organization works with 2-D image, video, sensor, or LiDAR data, know that our AV-centric workforce is ready to deliver high-quality data to your project team. We train our workforce for this annotation work through specially designed learning pathways for autonomous driving.

Interested in more? Learn how we annotate data for autonomous vehicle use cases such as robotaxis and autonomous trucking.