Despite advances in machine learning (ML), artificial intelligence (AI), and automation, there remains the need for human participation within the decisions and processes we seek to automate. In my recent blog series on exception processing, I noted that exceptions are so common that all ML experts should develop a plan—on every project—to address them.
I also sat down with these three industry experts to discuss how best to combine the strengths of humans with the power of automation:
- Dean Abbott, Chief Data Scientist at SmarterHQ
- James Taylor, Founder and CEO of Decision Management Solutions
- Ian Barkin, Chief Strategy and Marketing Officer at Sykes
The common theme throughout each discussion was that many perceive the need for human-in-the-loop (HITL) intervention as a flaw in the system. While it is true that the developers of these systems try to keep the number of exceptions to a minimum, a prudent design will both appreciate and anticipate the need for humans.
Dean Abbott and the Confidence of Machine Learning Models
During my discussion with Dean Abbott, he coined the phrase “squishy middle,” which has proven quite popular. He might not have intended it to become a paradigm example, but it is an apt description of the instances when an ML model isn’t sure which outcome is most likely.
How do you best handle this uncertainty or ambiguity? He and I discussed this in the context of a particularly interesting case study: the seemingly mundane processing of payouts for truckers’ invoices. The company described would buy invoices from truckers and trucking companies at a discount. The truckers would get paid more quickly, while the company made a little profit on the difference. Naturally, there was an approval process for deciding which invoices to purchase, and that process was a candidate for automation.
This was an interesting application from a relatively small, local company in San Diego. Let’s say a trucker has a delivery to Walmart. They deliver cargo and Walmart pays the invoice net 30, but the trucker needs the cash earlier because they're out that money. This company buys invoices and holds back a small percentage of the invoice to cover its expenses. The idea was this: can we build machine learning models to identify whose invoices are good to purchase and whose are not? And obviously, what we would love is if you could completely automate this.
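The factoring arithmetic Dean describes can be sketched in a few lines. All figures here are hypothetical, chosen only to illustrate the trade: the trucker gets most of the money now, and the company keeps a small fee as its profit.

```python
# Illustrative invoice-factoring arithmetic (all numbers hypothetical)
invoice_amount = 10_000.00   # what the retailer will pay net 30
factoring_fee = 0.03         # the small percentage held back

advance_to_trucker = invoice_amount * (1 - factoring_fee)  # paid today
company_profit = invoice_amount - advance_to_trucker       # the spread

print(advance_to_trucker, company_profit)  # 9700.0 300.0
```

The company's profit is only the held-back percentage, which is why an accurate approval model matters: one defaulted invoice can erase the margin from many good ones.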
Like most projects of this kind, the machine learning model was indeed able to identify some cases that were extremely likely to be safe invoices to advance, and others that were very high risk that should be rejected.
Then you've got the squishy middle, those with 30% to 70% success rates. These are the ones that really need human intervention because there's additional data, often information only humans can infer, that may not be readily available in digital form. In these cases, the ML model reduces the workload so the company can process more invoices, addressing the ones that are easy while allowing humans to focus on the ones that are hard.
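This triage pattern can be sketched as a simple thresholding rule. The scores, thresholds, and invoice IDs below are illustrative only; in practice the scores would come from the trained model and the thresholds would be tuned to the business's risk tolerance.

```python
def route_prediction(score: float, low: float = 0.30, high: float = 0.70) -> str:
    """Route a model's approval score: automate the confident ends,
    send the "squishy middle" to a person."""
    if score >= high:
        return "auto_approve"   # very likely safe to purchase
    if score <= low:
        return "auto_reject"    # very likely high risk
    return "human_review"       # the squishy middle

# Hypothetical model scores for three invoices
invoices = {"INV-001": 0.92, "INV-002": 0.12, "INV-003": 0.55}
routed = {inv: route_prediction(s) for inv, s in invoices.items()}
```

Only `INV-003` lands in the human queue here; the other two are handled automatically, which is exactly how the model reduces the workload.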
That was the most noteworthy example that we discussed, but Dean and I could have shared similar stories for hours. It is in the very nature of machine learning models: sometimes they are certain of an outcome and sometimes they are not.
In many cases, a human can insert an additional piece of information, usually something that humans are particularly adept at processing, like images or language, and give the machine learning model what it needs to be certain. Notice that this isn’t a human-as-an-override process; rather, it is having a human in the loop provide the missing piece of information. This distinction was particularly important as we discussed decision management with our next industry expert, James Taylor.
James Taylor and Efficient Decision Management
One of many powerful pieces of advice from James Taylor is that humans should be utilized to provide better inputs to models, not override the models. This distinction is an important one because if the human is going to be an override, they need to incorporate all of the known information and they need an extremely high level of subject matter expertise. At one point in our discussion, James made this clear using the case of a top surgeon in his field.
A highly experienced surgeon can size up a patient’s likelihood of surviving surgery by meeting them and looking over their appearance. That kind of expertise is a rare commodity. It should be reserved for borderline cases because, as James explained, if the risk of sudden death without surgery is very high, then surgery should occur immediately regardless of appearance.
There are even broader implications here. If the decision management system is designed properly, someone with a focused specialty (but without extensive overall expertise) can be of assistance. This was a particularly interesting part of the conversation because it described exactly the kinds of situations where CloudFactory has been successful.
James elaborated on this point while discussing a disability claims example, and how it could be broken down into a series of tightly focused micro-decisions that do not necessarily all require deep subject matter expertise.
Nor do you need an employee to do this. Once you break the problem down, you're not trying to decide whether to pay the claim, you're trying to make a decision that informs the decision about paying the claim, a much more focused decision. As long as you think the issue is that you have to decide whether to pay the claim, then you’ll of course think you have to use an in-house data science team to review all the data, because making the decision to pay the claim is a big business decision; it's important. But, in this case, the first decision isn’t about paying the claim. It’s more focused, more micro. There’s a random medical report about someone. You don't even need to know who. You just need to know that this person is making a claim for a broken leg. Then you just need to answer whether the doctor is saying they have a broken leg or not.
This is incredibly powerful advice because it means that you don’t have to delay a calendar year (or more) trying to get something very tricky, like text analytics, working. And you don’t need an MD or other expert. You can have a trained and managed external workforce provide this piece of information when needed. It’s the optimal combination of employees, business rules, machine learning models, and an external workforce.
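James's decomposition idea can be sketched as code: the big decision ("pay the claim?") becomes a composition of narrow micro-decisions, each answerable by a rule, a model, or a trained external workforce. The function names, questions, and data below are illustrative, not a real claims system.

```python
def doctor_confirms_injury(medical_report: str) -> bool:
    # Micro-decision: does the report actually say the leg is broken?
    # In practice a trained workforce could answer this question,
    # with no in-house MD and no text-analytics project required.
    return "broken leg" in medical_report.lower()

def claim_type_is_covered(claim_type: str, covered_types: set) -> bool:
    # Micro-decision: a simple business rule, no expertise needed.
    return claim_type in covered_types

def decide_claim(medical_report: str, claim_type: str,
                 covered_types: set) -> bool:
    # The big business decision is just a composition of
    # focused micro-decisions.
    return (claim_type_is_covered(claim_type, covered_types)
            and doctor_confirms_injury(medical_report))

decision = decide_claim(
    medical_report="Patient presents with a broken leg; cast applied.",
    claim_type="injury",
    covered_types={"injury", "illness"},
)
```

The point is architectural: because each micro-decision has a narrow input and a yes/no output, each one can be assigned to whoever answers it best, whether that's a rule, a model, or a person.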
Ian Barkin and Why Deterministic Models Still Require Exception Processing
Based upon my conversations with Dean and James, it would perhaps seem that the challenge of automation is dealing with the uncertainty that you face. Evaluating trucker invoices and the disability claims process each had an element of uncertainty.
One way to think of Ian Barkin’s specialty in robotic process automation (RPA) is that he has many of the same deployment and organizational challenges, even when there is certainty in the process or task being automated. As we discussed, there are applications of RPA both with and without predictive models. But even when he doesn’t have the predictive model, most other elements are the same.
In the realm of RPA, it might be onboarding an employee, which can involve five or even 5,000 steps. But these are still defined steps in a user manual. You just need to digitize them: emulate the steps a human had been doing and configure software to perform them.
So, does RPA eliminate the human in the process? It would seem so, but wasn’t that also the case with machine learning and decision management? In reality, the human isn’t eliminated from any of these scenarios. And while RPA is often deterministic (and doesn't involve uncertainty), there are times that the system doesn’t get all the information that it needs to complete the process or task. In that sense, it is almost exactly like the situation that James described with the disability claims. In fact, Ian had a similar example, this time involving money laundering.
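The exception-handoff pattern common to these stories can be sketched as follows. This is an illustrative runner, not any real RPA tool's API: every step is deterministic and scripted, but when a record lacks the information a step needs, it is routed to a human queue instead of failing the whole process.

```python
REQUIRED_FIELDS = ("account_id", "amount", "counterparty")

def run_rpa_step(record: dict, human_queue: list) -> str:
    """Run a scripted, deterministic step; escalate to a person
    when the record is missing the information the step needs."""
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        human_queue.append((record, missing))  # human-in-the-loop handoff
        return "escalated"
    # ...the defined, scripted steps would execute here...
    return "completed"

queue: list = []
status_ok = run_rpa_step(
    {"account_id": "A1", "amount": 250.0, "counterparty": "B2"}, queue)
status_bad = run_rpa_step({"account_id": "A1"}, queue)
```

Even with zero uncertainty in the steps themselves, the queue is where humans re-enter the process, which is why prudent RPA designs plan for exception handling from the start.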
Anti-money laundering is a very complex event that happens enough times to be a problem, but not so much that there's enough big data to really train algorithms. It's certainly more complex than just simple RPA, although there are some components of an anti-money laundering process that are still digitizable with RPA. The other element is that once you get into some more complex machine learning, you're not able to as reliably audit and sign off on why the algorithm caught the fraud. In a highly regulated industry, like banking, you can't go to the regulator and say, well, we prevented this person from making this transaction or opening this account because our ML algorithm identified this fraud accurately; we just don't know why it was accurate.
With the growth of advanced “black box” algorithms like deep learning, there has been a parallel growth in what is called explainable AI (XAI) to try to get around the issues that Ian is describing. So, he faces a double dilemma: not enough data for advanced algorithms and a reluctance to create opaque models with advanced algorithms. Ian believes that this is where subjective human input is still sometimes required, not unlike James’ example, and it is a task that an external, managed workforce such as CloudFactory can be trained to handle effectively.
Some of it is around patterns of events that seem unlikely where humans are just able to pick them out...just knowing something is fishy but isn't necessarily explainable.
Ian and I shared a laugh around our “squishy” and “fishy” adjectives during the interview, but this is ultimately serious stuff, because combining automation and people is how you effectively optimize decision-making.
What all three of these industry experts have in common is realism around what automation can do, what it sometimes fails at, and a systematic approach to taking advantage of its strengths—all while respecting how critical humans are to the process.