3 Reasons Document Process Automation Solutions Fail with Unstructured Content

July 13, 2021 / artificial intelligence, Intelligent Document Processing, Intelligent Process Automation, Machine Learning


 

When we talk to customers about how they can use the Indico Intelligent Process Automation platform to automate processes involving even unstructured content, we’re often met with disbelief. Responses along the lines of, “Yeah, we’ve tried tools like yours. They didn’t work,” are common. 

It’s understandable, because many companies have indeed been burned by legacy automation platform vendors that talk a good game, maybe even pull off a decent proof-of-concept project, but whose tools ultimately fall apart in production at scale. The reason is simple: their tools aren’t based on real artificial intelligence (AI) technology, so they can’t handle unstructured content.

From our experience, if a document process automation project fails, it’s typically due to one or more of the following three reasons. 

  1. Automation based on templates, not AI 

The first reason legacy automation solutions fail is that they use templates and/or rules to map out document process automation routines, an approach that only works with simple processes involving structured content. If you’ve got a process that involves extracting data from predefined fields on a spreadsheet, database or perhaps a highly structured form, then it is indeed a simple matter to construct a template that details what data should be extracted. In that case, you’ll likely find a templated automation tool or a robotic process automation (RPA) solution will be up to the task. 

But as soon as you introduce unstructured content, such as Word documents, emails, contracts, or – heaven forbid – images, you’ll quickly find it impossible to write enough templates for the tools to be effective. The reason is simple: these tools are not AI-driven and therefore don’t have the intelligence required to “read” unstructured documents like a real AI tool does.
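To make the limitation concrete, here is a minimal Python sketch of a template-style extractor. Everything in it is hypothetical and for illustration only: the `INVOICE_TEMPLATE` rules, the `extract_with_template` helper and the two sample documents are assumptions, not any vendor's actual implementation. The template finds its fields on a structured form but comes up empty on an email that states the same facts in free text.

```python
import re

# Hypothetical template: each field is expected on its own line,
# introduced by a fixed label. This is an illustrative assumption,
# not a real product's rule format.
INVOICE_TEMPLATE = {
    "invoice_number": re.compile(r"^Invoice No:\s*(\S+)$", re.MULTILINE),
    "total": re.compile(r"^Total:\s*\$?([\d,]+\.\d{2})$", re.MULTILINE),
}


def extract_with_template(text, template):
    """Return field -> value, with None for any field the template cannot find."""
    result = {}
    for field, pattern in template.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


# A highly structured form: labels and layout match the template exactly.
structured_form = "Invoice No: INV-1001\nDate: 2021-07-13\nTotal: $1,250.00\n"

# The same facts written as free text in an email: no labels, no fixed layout.
free_text_email = (
    "Hi team,\n"
    "Attached is invoice INV-1001. The amount due comes to 1,250.00 dollars.\n"
    "Thanks!\n"
)

form_result = extract_with_template(structured_form, INVOICE_TEMPLATE)
email_result = extract_with_template(free_text_email, INVOICE_TEMPLATE)

print(form_result)   # both fields are found on the structured form
print(email_result)  # both fields are None for the free-text email
```

Covering the email would mean writing another rule for that phrasing, and then another for the next variation, which is why the template count grows so quickly once unstructured content is involved.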

  2. Lack of a controlled environment

These days, you’ll find a lot of vendors that have what are at heart templated- or RPA-based process automation tools that claim to offer AI-based automation capabilities. They’ll tell you their tools can handle whatever kind of documents you throw at them. 

To “prove” it, they’ll perform a POC with maybe a couple of dozen of your documents. Behind the scenes, their trained engineers will write templates to identify and extract whatever data you define from each type of document. During the POC, in a controlled environment, the tool will likely be quite accurate. You may well be impressed. 

When you put the tool into production, it will be a different story. Now your team, not the vendor’s, has to build more templates for any document types that fall outside those included in the POC. In many cases, we’re talking hundreds of documents. You soon realize that, without an army of paid consultants helping, you can’t keep up, and the vendor’s claims of AI capabilities now ring hollow. 

  3. Real AI – and real complexity 

Finally, you may come across a vendor that really does have an AI-driven approach to document process automation, but it comes with plenty of complexity and expense.  

Vendors such as Amazon and Google, for example, have AI-based offerings, but it takes thousands of sample documents to train a model to a high degree of accuracy. You’ll also need lots of data science expertise and will likely have to spend millions on the compute power to make it all work. In short, it’s an approach that’s simply out of reach for most companies. 

 

Indico IPA – real AI made simple

Indico takes a fundamentally different approach to intelligent document processing. First, we employ real AI technologies including machine learning, transfer learning and natural language processing, but we put it all behind the scenes to hide the complexity.

We’ve also already done the heavy lifting in terms of training our models. The Indico IPA platform is built on a database of some 500 million labeled data points. That gives it the intelligence to understand the context behind almost any document or image, including unstructured content. 

Finally, we give you simple tools for labeling whatever documents are involved in the process you want to automate. Once you label about 200 documents, identifying the sorts of data you want to extract, you’ll have a model that’s highly accurate. Because of that massive database, our tool is smart enough to recognize, say, a Social Security number or an address no matter the format or where it appears in each document, with no templates required. Oh, and it’s the business people who know the processes best who use the tool to label documents; no data science expertise is necessary. 

So, how do you know Indico isn’t just another vendor making claims it can’t back up?  Well, start by requesting a demo and we’ll show you how it works in action. 

Author
indico
