How to organize machine learning teams

This blog post was originally published by Hasty.ai on January 17, 2022. Hasty.ai became a CloudFactory company in September 2022.

With abundant data, inexpensive storage, and powerful processing, machine learning (ML) technology is seeing rapid adoption. Businesses across industries are building new ML solutions to increase competitiveness, improve customer service, and bring new products to market faster. But, as many are discovering, developing these solutions is difficult. Companies new to the field face many obstacles on their journey to getting AI models into production.

This is where the organization of the ML team becomes crucial. With the right team structure, you’ll create a successful foundation for your ML project and increase the chances of getting that model into production successfully.

Team structures vary from company to company. When it comes to ML teams, many factors can affect how you organize the team, including how significant ML is to your product, the size of the company, and the makeup of the team. In this post, you’ll learn:

Two popular approaches to organizing ML teams
The pros and cons of each approach
How to choose the right approach for your organization

Let’s get started.

Two approaches to organizing ML teams

There are two main frameworks companies tend to use when structuring ML teams. Let’s review each and dig into the benefits and challenges.

1. Centralized / functional teams

A centralized ML team consists mainly of machine learning engineers and data scientists. They work separately in their functional field and are siloed from traditional software engineering, product, or any other departments.

The benefits of centralized / functional teams:

Talent allocation: An independent group of ML engineers and data scientists means that the talent density within the ML team is very high. This allows the company to employ the best ML practitioners for any relevant task, no matter where in the organization that task originates.
Speed and knowledge share: With an independent ML team, the company can kick off any new ML ideas extremely fast. A centralized ML team also maximizes knowledge sharing across the function, allowing the development of deep knowledge, shared standards, and a shared tech stack. A high-performing centralized team can create processes and technology that can be shared with other groups. This hub approach means that over time, an organization can spread ML know-how to other teams and enable different parts of the organization to conduct their own AI experiments. Having this hub of knowledge also makes it easier to hire for new openings, as teams organized in this manner are very attractive to potential hires.
Dedicated space: A centralized team creates a clear and dedicated space for any and all ML initiatives. It creates a natural communication pathway for other departments that want to build AI solutions. With the right internal promotion, most centralized teams are in high demand.

Pro-tip: If your company decides to create a centralized team, ensure the ML team has a seat at the table at the leadership level. Centralized ML teams are often hard to fit into traditional leadership structures. If AI efforts are essential to your company, it’s critical to have someone advocating for those efforts at the executive level.

Potential challenges of centralized / functional teams:

Although centralized teams are an excellent way to organize your ML workforce, this approach has some potential drawbacks.

Siloing: If the team is not integrated correctly with the rest of the organization, chances are that knowledge sharing from the team to the rest of the organization simply doesn’t happen. It’s important to ensure processes are in place for knowledge sharing and educating other teams.
Resourcing: Because the team often (but not always) lacks its own software development resources, it needs to work well with other teams. This shared-responsibility approach can work well, but be careful to ensure that all involved teams align on the priority of work tasks and the goals of each project.
Too many use cases: Centralized ML teams often face the problem of having too many use cases to pick from, because new ideas flow to the team from across the organization. A centralized team therefore needs reasonable and transparent processes for prioritizing tasks and managing stakeholder requests, and it must clearly align with the rest of the organization on internal priorities. For example, it’s perfectly valid to say that you need time to build up MLOps infrastructure while you build the team, because this will help you develop AI projects faster down the line. However, without agreed-on expectations, other departments or leaders might get frustrated by the perceived lack of progress.
Lack of business expertise: Another issue that can emerge is a lack of understanding of the business. For example, you can have a very technically competent AI team. But if the team doesn’t understand the company’s core business, the needs of its customers, and the overall business strategy, there’s a risk that they will spend their time developing something that misses the mark. Ensuring that other teams share their knowledge with the ML team is vital.
Estimation: There is also the issue of AI projects being hard to estimate. In large organizations, where many different teams work together, there’s often a clear project plan with set deadlines and budgets. This can be difficult for an AI team to deal with, as much of AI development is trial and error. It’s tough to answer questions such as “How much data is needed?” or “How long will it take to develop an AI model that produces X result?” without first experimenting. At the core of this problem is the black-box nature of AI: you won’t know what will work until you try. A good recommendation is to give your centralized team its own budget and ensure that any project planning takes experimentation into account.
Unclear function: The ML team needs to have a clear function beyond “developing AI.” What is the core function of the team? Is it R&D of future technologies, or is it to improve business metrics in the coming months by extending the product range with AI capabilities? If the team’s function is unclear, it can be challenging for a centralized team to deliver what the organization expects, leading to unhappy stakeholders and employees.

2. Decentralized / squad teams

A decentralized ML team consists of an entire “feature” team made up of people from product, marketing, software engineering, design, and ML. These are often called squads and their goal is to develop a specific feature or product.

The benefits of decentralized / squad teams:

Product-centric: Decentralized teams are more product-focused and have a high concentration of product knowledge. A cross-functional team means you have all the knowledge and expertise to deliver the right thing at great speed. This also allows for easy experimentation with different ML ideas.
Independence: With these more diverse teams, you also create more independence. They don’t need to rely on other groups to build a complete product. This can be very helpful for an organization, as you don’t need different teams to align on their goals and priorities, and it generally leads to faster development times. For example, when LinkedIn ML engineers couldn’t try their “recommendation engine” because they couldn’t get a front-end, they switched to a decentralized squad to test the “People You May Know” feature. The group included design, web, product marketing, and engineering, and the “People You May Know” feature became one of the most successful products a LinkedIn team ever created.
Clear focus: Generally, decentralized teams have a more precise focus than centralized teams. Often, they have a clear reason for existing – creating new products and services. This makes it easier for the team to prioritize work and prevents losing focus.
Organizational understanding: As these teams are made up of people from different parts of the organization, they also tend to understand the organization at large better. For example, a backend engineer who has worked at the company for a few years will be familiar with its other tech products. That engineer will better understand what you need (and don’t need) to integrate with.
Knowledge sharing: Your ML engineers and data scientists can help the rest of the team grasp and improve their knowledge of machine learning. For example, a DevOps engineer can transition to being an MLOps engineer. But it also works the other way around, with the larger team sharing knowledge on anything from how the organization works to development processes. In short, good decentralized teams often have symbiotic relationships internally that benefit all parties.

Potential challenges of decentralized / squad teams:

Research and development: The main advantages of decentralized teams are speed and independence. This makes these teams very good when doing practical work and delivering value to the organization. However, this team structure can be challenging to use if you are looking for more of an AI R&D setup.
Organizational alignment: Another potential issue to look out for is that decentralized teams often make decisions that don't align with the organization at large. For example, a decentralized team might adopt a new continuous integration/continuous delivery service while the rest of the organization already uses a different one. This can make future handovers more complex and can lead to duplicated spending.
Siloing: A third potential issue is that these teams sometimes create a silo between themselves and other development projects. Because they are highly independent, they can lack insight into what's happening elsewhere, so good processes need to be in place to keep the team updated on others' progress and to share its own progress with relevant stakeholders. This is especially important in terms of overall organizational strategy.
Knowledge sharing: A fourth potential issue is ensuring knowledge sharing on the machine learning side across the organization. As you spread out your ML expertise, your ML personnel will be focused on solving the assigned problem. However, even if you have decentralized teams, it makes sense to have your ML engineers and data scientists across the organization meet regularly to align and discuss solutions to common problems shared across groups. A typical example of this is the development of MLOps practices that can be used in all teams to speed up the development of AI.
Recruiting: You might also have an issue when it comes to recruiting talent, especially if you are looking to make one ML hire per team. These positions, where a single person is responsible for everything ML, can be seen as less attractive than working in a centralized ML team. However, if you have processes to solve the issues outlined above, you should be able to make a convincing argument for why that is not a problem for the candidate.
Resource allocation: Finally, you can have a problem with where new AI initiatives go in the organization. With decentralized, feature-oriented teams, it can be more challenging for the organization to know where new ideas should be directed and where teams without AI experience should get help.

How to choose the right structure for your company

One of the best ways to determine the right structure is to consider your business maturity and your AI goals.

Smaller startups (Seed - Series A)

It probably doesn't make sense for smaller startups to have a centralized team as you have a limited headcount, and your focus lies in developing your product. Often, the goal is to deliver today (preferably yesterday!), so fast development and decision-making processes are of the highest importance. For most startups, a decentralized approach is preferable.

However, one exception could be if your company is building state-of-the-art AI. If you are one of these R&D-heavy startups, it can make sense to concentrate your ML expertise in one team as you often are doing exploratory work that stands to benefit the company in the coming years, not weeks.

Larger startups (Series B - Series Z)

The recommended approach for larger startups depends on whether the focus is R&D or practical implementation. If you want to implement AI into your product, decentralized teams would make the most sense. If you're going to spend significant time and resources exploring new approaches to AI and then bringing them into your product at a later stage, a centralized team would be best.

Enterprises

For many enterprises, it might make sense to use both approaches outlined above.

You can have a core, centralized AI team that works on creating infrastructure, developing best practices, and evaluating state-of-the-art approaches. You can then add decentralized units to build applications and implement AI across different business functions. This approach can be very beneficial as you get the best of both systems, but it demands a clear division of responsibilities between the teams to avoid friction.

There's also a third option for enterprises: creating a centralized team with development and product capabilities. Essentially, these teams work almost as separate startups within the organization and can develop AI capabilities across your products and services. This can ensure better prioritization as the team can pick which projects will benefit the company the most, not only their particular section. Of course, you need to integrate these teams exceptionally well with the rest of the organization to deliver value.
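
For readers who think in code, the guidance in this section can be summarized as a small decision function. This is only a minimal Python sketch of the rules of thumb above, not a formal method: the names Stage, Structure, recommend_structure, and the rd_focused flag are all made up for illustration, and the third enterprise option (a centralized team with its own development and product capabilities) is left out to keep it short.

```python
from enum import Enum


class Stage(Enum):
    """Company maturity buckets used in this post (hypothetical names)."""
    SMALL_STARTUP = "seed to series A"
    LARGE_STARTUP = "series B and later"
    ENTERPRISE = "enterprise"


class Structure(Enum):
    """Team structures discussed in this post (hypothetical names)."""
    CENTRALIZED = "centralized"
    DECENTRALIZED = "decentralized"
    HYBRID = "centralized core plus decentralized squads"


def recommend_structure(stage: Stage, rd_focused: bool) -> Structure:
    """Return a starting-point structure based on the post's rules of thumb.

    rd_focused is an assumed flag: True when the ML work is mainly
    exploratory R&D, False when it is mainly product implementation.
    """
    if stage is Stage.ENTERPRISE:
        # Enterprises: the post suggests many benefit from using both
        # approaches, so rd_focused is ignored here.
        return Structure.HYBRID
    # Startups of either size: R&D-heavy work favors a centralized team,
    # product implementation favors decentralized squads.
    return Structure.CENTRALIZED if rd_focused else Structure.DECENTRALIZED


# A seed-stage startup shipping ML features into its product.
print(recommend_structure(Stage.SMALL_STARTUP, rd_focused=False).value)
# prints "decentralized"
```

Treat the result as a conversation starter rather than an answer. The makeup of your team and the challenges listed earlier should still shape the final decision.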

The key to team success

Whichever type of ML team structure you select, the goal is turning ML investments into business value. Neither the centralized nor the decentralized model is perfect, but each offers many opportunities to increase the speed of getting your AI initiatives to market. In the end, choosing the right framework comes down to determining what works best for your unique company and your unique AI goals, and being open to addressing issues as you go.

At CloudFactory, we’ve been supporting ML teams for 10+ years and have seen successes with both types of team structures. One key to team success is helping ML teams improve data annotation speed and quality. Our Accelerated Annotation platform is 5x faster than manual labeling and helps teams operating in both centralized and decentralized structures get models into production faster.