AI training data operations are a lot like the assembly lines of yesterday’s factories. Data is your raw material, and you have to get it through multiple processing and review steps before it’s ready for machine learning. If you want to develop a high-performing ML model, you need smart people, tools, and operations. We hosted a webinar to discuss this topic with experts in workforce and tooling for machine learning. This is a transcript of that November 14, 2018, webinar. It includes minor edits for clarity.
“Houston, we’ve had a problem.” Astronaut Jack Swigert made the words famous when he communicated to NASA mission control that an explosion had rocked the Apollo 13 capsule that was transporting him and two other people to the moon in April 1970. To get the astronauts home safely, the engineers at Johnson Space Center in Houston, Texas would have to do something they had never attempted before: use the descent engine on the lunar lander to bring the crew home.
NASA estimated that it took 400,000 engineers, scientists, and technicians to send astronauts to the moon on the Apollo missions. The massive workforce was composed of people from four major enterprise companies and a host of subcontractors who worked for them.
Bringing artificial intelligence (AI) to life in the real world is a lot like the 20th-century “space race” for dominance in spaceflight capability. Few can fathom the level of innovation and sheer effort it takes. From model development and data prep to testing and deployment, AI requires a pioneering spirit, sharp minds, and a lot of hard work. AI innovators encounter countless challenges and frustrating defeats.
A Production Problem (Solved)
When Henry Ford attempted to produce the Model T at a rapid pace and with high quality, he ran into a problem. It was difficult to organize teams of specialized workers to assemble automobiles, and with so many workers needed to scale the process, it was highly inefficient. To make matters worse, late delivery of parts caused pile-ups of workers vying for space to work and delays in production.
As the volume of the world’s big data grows at a staggering speed, so too does the need for people who know how to extract knowledge, insights, or solutions from it. Today’s data scientist must have both the technical skills to solve complex data problems and the curiosity to seek out the hidden problems data can solve.
Google’s three-day I/O’18 conference in Mountain View, Calif., last week brought together developers from around the globe for hands-on learning, discussion with experts, and a look at Google’s latest developer products. The conference also featured Google I/O Extended sessions held in technology hubs across the country, including a panel discussion that featured CloudFactory Chief Revenue Officer Mike Riegel.
For all of AI’s promises, we still need people to do a lot of work behind the scenes to make it all possible. People collect, enrich, clean, and prepare data for AI systems to operate accurately and optimally. In fact, data scientists spend countless hours cleaning and combining datasets, a process commonly referred to as “data wrangling.”
One of our clients said it best: “CloudFactory has people who care.” It’s true: our teams are known for high-quality work, and they love the role they play in working with the world’s most innovative technology companies. Since 2008, we have helped hundreds of companies grow their businesses by providing a highly scalable, managed workforce to be an extension of their teams.
“[Video] data annotation is super labor-intensive. Each hour of data collected takes almost 800 human hours to annotate. How are you going to scale that?”
-Sameep Tandon, CEO of Drive.ai, an autonomous car startup in Silicon Valley and CloudFactory client