Bringing Intelligent Process Automation to Financial Document Analysis
April 2, 2020 / Financial Services, Intelligent Process Automation, Machine Learning
The investment industry relies heavily on data and has long been a leader in using technology to collect, analyze and distribute financial data. It may be surprising, then, that automation has largely bypassed the process of culling actionable data from an investment staple: periodic SEC filings (specifically, quarterly 10-Q and annual 10-K forms).
The reason is understandable: 10-Q and 10-K forms consist largely of unstructured data – a collection of free-form text, tables and numbers that are not easily machine-readable. Yet the forms are crucial to financial analysts and others who need to assess company performance in order to give investment advice and make decisions.
10-Q and 10-K forms, then, present an interesting case study in the pros and cons of different approaches to process automation in financial services. (For the sake of simplicity, from here on we’ll refer just to 10-Q documents, but the same applies to 10-Ks – and other financial documents.)
Accelerating the financial analysis process with automation
Traditionally, there’s been little automation in the preliminary stages of the 10-Q analysis process. Instead, financial analysts (or someone working on their behalf) pore over hundreds of pages of 10-Q documents looking for pertinent language, numbers and other details relevant to the company’s performance. Those details may include a “material change” to the business, particularly one related to income and expense levels, gains or losses, gross margins (including by region, if applicable), and more.
Current workflows require analysts to manually extract and normalize these variables, then enter them into a spreadsheet. Only at this point, with the data in a structured form, do firms begin automating the rest of the process. The structured data is fed into a downstream analytics platform that crunches numbers, enabling analysts to compare companies to one another and make decisions based on the data.
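To make the normalization step concrete, here is a minimal Python sketch of what an analyst effectively does by hand when turning a phrase like “$1.2 million” into a spreadsheet-ready number. The function names, column names and company names are our own illustrative assumptions, not part of any particular tool, and real filings contain many more formats than this handles.

```python
import re
from decimal import Decimal

# Multipliers for scale words that often follow a dollar figure.
SCALE = {
    "thousand": Decimal(1_000),
    "million": Decimal(1_000_000),
    "billion": Decimal(1_000_000_000),
}

# A dollar sign, a number with optional commas and decimals, and an optional scale word.
AMOUNT_RE = re.compile(
    r"\$\s*(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion)?",
    re.IGNORECASE,
)


def normalize_amount(text):
    """Return the first dollar amount in `text` as a plain number of dollars."""
    match = AMOUNT_RE.search(text)
    if match is None:
        raise ValueError(f"no dollar amount found in {text!r}")
    number = Decimal(match.group(1).replace(",", ""))
    scale = match.group(2)
    return number * SCALE[scale.lower()] if scale else number


def to_row(company, metric, period, raw_amount):
    """Build one structured row (hypothetical column names) from extracted values."""
    return {
        "company": company,
        "metric": metric,
        "period": period,
        "amount_usd": normalize_amount(raw_amount),
    }


# Invented example values, for illustration only.
rows = [
    to_row("ExampleCo", "gross_margin", "Q3 2019", "$1.2 million"),
    to_row("SampleCorp", "gross_margin", "Q3 2019", "$4,500 thousand"),
]
```

Once every value is in the same unit and shape, rows like these can be loaded into whatever downstream analytics platform the firm uses.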
It’s clear that the extraction and normalization phase is quite labor-intensive. As financial process automation tools have proliferated in recent years, firms are increasingly looking for any that can tackle these 10-Q analysis pain points.
Some have tried a templated approach to reading the 10-Qs. As we’ve explained in previous posts (such as this one), that requires a process expert to define the values that must be extracted and the exact position of each value within a 10-Q document. The problem with that approach is that 10-Qs are full of free-form text discussing the performance of a given company. That text is unstructured, and no two documents are exactly alike. That makes it impossible to write rules to capture all the information you’re looking for. How do you write rules for every synonym, connotation, and other variation that’s characteristic of human language?
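A small Python sketch illustrates how brittle a rule can be. The pattern and the three sentences below are invented for illustration; they all report the same fact, but a rule written for one phrasing misses the other two.

```python
import re

# Hypothetical rule written for one specific phrasing.
RULE = re.compile(r"revenue increased by \$([\d.]+) million", re.IGNORECASE)

# Three invented sentences that all describe the same change.
sentences = [
    "Revenue increased by $12.5 million compared with the prior quarter.",
    "Net sales grew $12.5 million over the same period last year.",
    "We saw a $12.5 million rise in revenue quarter over quarter.",
]


def apply_rule(sentence):
    """Return the captured amount, or None when the rule does not match."""
    match = RULE.search(sentence)
    return match.group(1) if match else None


# Only the first sentence matches; the paraphrases slip through.
results = [apply_rule(sentence) for sentence in sentences]
```

You could keep adding patterns for each new phrasing, but the list never ends, which is the core weakness of the rules-based approach.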
Adding intelligence to financial process automation
Intelligent process automation (IPA) offers a fundamentally different approach. By taking advantage of technologies including optical character recognition, machine learning, transfer learning and natural language processing, intelligent automation tools can read and understand documents much like a human would. The automation tool is armed with a baseline of words, concepts and an overall understanding of human language which it can then apply to different documents. It understands context. Rules do not.
In practice, let’s say an investment firm wants to examine material change clauses for a series of 10-Q documents. The process may vary from one IPA tool to another, but we’ll walk through the workflow we’ve designed for this use case. To start, our IPA tool would identify the sections or paragraphs in the 10-Q documents that contain material change clauses. This is important because analysts are faced with thousands of pages of information, so rapid filtering is necessary.
Then, someone who understands the process – perhaps a financial analyst or process expert – would go through a few of those sections and label particular phrases and data points that need to be extracted. The tool enables the expert to assign categories to different pieces of information, such as amount, time period, text that describes the material change, and so forth. It may seem counterintuitive that some manual work is still required here, but this labeling exercise simply allows your subject matter expert (SME) to customize the machine learning models that power the IPA tool.
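For readers who want to picture what a labeled example looks like as data, here is a rough Python sketch. The `LabeledSpan` class, the `label` helper, the category names and the excerpt are all assumptions made for illustration; they do not describe how any specific IPA tool stores its labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledSpan:
    """A labeled piece of text, stored as character offsets plus a category."""
    start: int
    end: int
    category: str

    def text(self, source):
        return source[self.start:self.end]


def label(source, phrase, category):
    """Label the first occurrence of `phrase` in `source` (raises if absent)."""
    start = source.index(phrase)
    return LabeledSpan(start, start + len(phrase), category)


# Invented excerpt, standing in for a material change clause.
excerpt = (
    "Gross margin decreased by $3.1 million in the third quarter of 2019 "
    "due to higher freight costs."
)

# The categories mirror those mentioned above: amount, time period, description.
spans = [
    label(excerpt, "$3.1 million", "amount"),
    label(excerpt, "third quarter of 2019", "time_period"),
    label(excerpt, "Gross margin decreased", "material_change"),
]

labeled = {span.category: span.text(excerpt) for span in spans}
```

Storing offsets rather than copied strings keeps each label tied to its exact position in the excerpt, which is one common way to prepare examples for a sequence-labeling model.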
After labeling only about 50 sample excerpts, the analyst can then tell the tool to start analyzing documents on its own. The analyst monitors how the tool performs, accepting or rejecting the predicted labels as required. The tool also indicates a confidence level for each prediction. Ultimately, this arms your SME with insight into how well the tool is performing, ensuring they’re not just working with a black box.
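One way to picture this review loop is the Python sketch below. The `Prediction` class, the 0.9 threshold and the sample predictions are hypothetical; the idea is simply that confidence scores let the analyst focus on the predictions most likely to need correction, and that accept/reject decisions give a running measure of quality.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    text: str
    category: str
    confidence: float  # Assumed to be a score between 0 and 1.


def split_for_review(predictions, threshold=0.9):
    """Separate predictions at or above the threshold from those needing a closer look."""
    confident, needs_review = [], []
    for prediction in predictions:
        bucket = confident if prediction.confidence >= threshold else needs_review
        bucket.append(prediction)
    return confident, needs_review


def acceptance_rate(decisions):
    """Share of reviewed predictions the analyst accepted (True = accepted)."""
    return sum(decisions) / len(decisions) if decisions else 0.0


# Invented predictions, for illustration only.
predictions = [
    Prediction("$3.1 million", "amount", 0.97),
    Prediction("third quarter of 2019", "time_period", 0.93),
    Prediction("higher freight costs", "amount", 0.41),
]

confident, needs_review = split_for_review(predictions)

# The analyst accepts the first two predictions and rejects the third.
rate = acceptance_rate([True, True, False])
```

The threshold is a judgment call: a lower value sends fewer items to the analyst, while a higher one trades speed for closer oversight.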
IPA delivers process automation to financial services
In no more than an hour or two, an analyst can have a working model ready to go. But that’s only if the financial process automation tool has that vast baseline of words and concepts it can bring to the topic. Without that, it would take hundreds of thousands of 10-Q documents to properly train the model from scratch. And to attempt to analyze 10-Qs by writing rules? Well, we wouldn’t try.
Intelligent automation adds another weapon to the vast technology arsenal that financial services and investment companies are known for. It can help you solve a longstanding problem and bring true automation to financial document analysis, including SEC filings.