Tech Talk: Efficient invoice processing with OCR automation

Invoices are at the heart of any business, whether B2B or B2C. Every company has to process vendor invoices, in smaller or larger volumes.

For our clients and their accounts payable teams, invoices were both a resource and a hurdle. Their challenge was processing numerous PDF files and entering invoice details into their system by hand, an extremely time-consuming and repetitive task.

We set out to build software that automates and speeds up this task using optical character recognition (OCR), a technology that recognizes the text present in documents. Industries that handle large volumes of data and documents should find this technology helpful for many use cases.

Existing tools

There are various existing projects on the market for invoice processing. We started by exploring them to assess the leading available tools. Despite the number of projects, we could not settle on one tool that simultaneously offered:

  • a sufficient level of reading accuracy,

  • a sufficient level of template flexibility,

  • and a sufficient level of customization to meet our client’s needs.

To answer this need, we developed our very own solution.

The OCR solution

We developed a tool to process invoices received as a PDF or an image. The first obstacle was extracting exactly the field needed, which can take various forms. Think of a date, for example. It can appear in multiple formats, such as dd/mm/yyyy, dd/mm/yy, yy/mm/dd, or written out in letters. To tackle this problem, we started by defining regular expressions for each field. But that was not enough. An invoice can contain several different dates: birthdate, invoice date, due date, reminder date, etc.
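As an illustration of the regular-expression step, here is a minimal Python sketch that collects date candidates from OCR text. The patterns, the `find_dates` helper, and the rule for two-digit years are assumptions made for this example, not the expressions used in the actual tool. It only covers day-first numeric dates and English month names; a yy/mm/dd format would need its own pattern.

```python
import re
from datetime import date

# Month names for dates written out in letters (English only in this sketch).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Matches dd/mm/yyyy or dd/mm/yy.
NUMERIC_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4}|\d{2})\b")
# Matches dates such as "11 April 2024".
WRITTEN_DATE = re.compile(
    r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b", re.IGNORECASE
)


def find_dates(text):
    """Return (date, matched_text) pairs for every date candidate in the text."""
    candidates = []
    for match in NUMERIC_DATE.finditer(text):
        day, month, year = (int(part) for part in match.groups())
        if year < 100:
            year += 2000  # assumption: two-digit years belong to the 2000s
        try:
            candidates.append((date(year, month, day), match.group(0)))
        except ValueError:
            continue  # e.g. 99/99/2024 looks like a date but is not one
    for match in WRITTEN_DATE.finditer(text):
        day, month_name, year = match.groups()
        try:
            parsed = date(int(year), MONTHS[month_name.lower()], int(day))
        except ValueError:
            continue
        candidates.append((parsed, match.group(0)))
    return candidates


for parsed, raw in find_dates("Invoice date: 12/03/2024, due 11 April 2024"):
    print(raw, "->", parsed.isoformat())
```

Every match is only a candidate at this stage: the regular expression finds what looks like a date, not which date it is.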

We decided to reduce this issue by adding a second logic layer. We can apply rules to all the dates identified in order to sort and label the date candidates. For example, if an identified date is the oldest one and lies decades before the current date, we can assume it is the birthdate.
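A rule layer of this kind could look like the sketch below. Only the birthdate rule comes from the description above; the `label_dates` function, the 20-year threshold, and the ordering rule for invoice and due dates are hypothetical choices made for illustration.

```python
from datetime import date


def label_dates(candidates, today, min_gap_years=20):
    """Label date candidates with simple ordering rules (illustrative only)."""
    labels = {}
    remaining = sorted(set(candidates))
    # Rule from the text: the oldest date, decades before today, is the birthdate.
    if remaining and today.year - remaining[0].year >= min_gap_years:
        labels["birthdate"] = remaining.pop(0)
    # Hypothetical rules: the next date is the invoice date, the one after it the due date.
    if remaining:
        labels["invoice_date"] = remaining.pop(0)
    if remaining:
        labels["due_date"] = remaining.pop(0)
    return labels


found = [date(2024, 4, 11), date(1985, 7, 2), date(2024, 3, 12)]
print(label_dates(found, today=date(2024, 5, 1)))
```

Passing `today` explicitly rather than reading the system clock keeps the rules reproducible when a batch is reprocessed later.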

Even with this initial step in place, our results did not reach the level of accuracy we wanted. Now, imagine you could manually indicate on the invoice where the birthdate is. At this point, you have the coordinates, the regular expression, and the logic rules to detect the birthdate in a file of hundreds of words and sometimes several pages. Apply this logic to each field needed, and you obtain a model that can read almost any invoice with sufficient accuracy.
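To make the combination concrete, the sketch below restricts the search to the words that fall inside a field's box before applying the regular expression. The `Word` and `Box` structures and the pixel coordinates are assumptions for this example; real OCR engines return word positions in their own formats.

```python
import re
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page


class Box(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def text_in_box(words, box):
    """Join the words whose top-left corner lies inside the box, in reading order."""
    inside = [w for w in words if box.x0 <= w.x <= box.x1 and box.y0 <= w.y <= box.y1]
    inside.sort(key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in inside)


def extract_field(words, box, pattern):
    """Apply the field's regular expression only to the text inside its box."""
    match = re.search(pattern, text_in_box(words, box))
    return match.group(0) if match else None


page = [
    Word("Invoice", 10, 10), Word("date:", 60, 10), Word("12/03/2024", 110, 10),
    Word("Born", 10, 200), Word("02/07/1985", 60, 200),
]
birthdate_box = Box(0, 190, 300, 220)  # hypothetical region drawn by the user
print(extract_field(page, birthdate_box, r"\d{2}/\d{2}/\d{4}"))
```

Narrowing the text to one region means the same date pattern can serve several fields without the candidates competing with each other.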

Coordinates

However, our OCR model relies on having the coordinates of each field. We still need to determine which piece of text belongs to which extracted field. Does this mean that for every invoice, we still have to register every coordinate for every field manually? Fortunately, no.

Our tool makes field identification user-friendly. It simply displays the image and lets the user click and drag over the relevant area. Let's say you have ten fields to identify: you repeat this ten times. Then, for every invoice, our program recognizes its template and checks whether it is already referenced in our existing database. If so, the user has nothing more to do. Otherwise, they can register the template using the method described above.

[Figure: example of identifying field coordinates in the model]
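How the program recognizes a template is not detailed in this article. Purely as an illustration, the lookup could be sketched as follows, assuming a template is identified by a fingerprint of the first words on the page. The `template_key` function, the `TemplateStore` class, and the in-memory dictionary standing in for the database are all hypothetical.

```python
import hashlib


def template_key(header_words, n=5):
    """Hypothetical fingerprint: a hash of the first n lower-cased words on the page."""
    normalised = " ".join(word.lower() for word in header_words[:n])
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:16]


class TemplateStore:
    """In-memory stand-in for the database of registered templates."""

    def __init__(self):
        self._templates = {}

    def register(self, key, field_boxes):
        # field_boxes maps a field name to its (x0, y0, x1, y1) coordinates.
        self._templates[key] = dict(field_boxes)

    def lookup(self, key):
        # Returns None when the template has not been registered yet.
        return self._templates.get(key)


store = TemplateStore()
key = template_key(["ACME", "Corp", "Invoice"])
if store.lookup(key) is None:
    # First invoice with this layout: the user draws each box once.
    store.register(key, {"invoice_date": (100, 0, 300, 20)})
print(store.lookup(key))
```

Under this assumption, every later invoice with the same header words reuses the stored boxes and needs no manual step.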

Results

The results from our model are displayed in an Excel file with an accuracy score and various warning flags.

Our accuracy score depends on the quality of the PDF scan, the weighted average of the OCR quality scores, and the presence or absence of invalid characters, for example, numbers in a name. Warning flags allow for quick human supervision. Our warnings cover scan quality, potentially inconsistent fields, and invalid amounts. In this case, we included a flag for total amounts that are outliers compared to the distribution of the batch of files.

Finally, a person can review this Excel file for possible inconsistencies. The main added value of our OCR project is the time it saves. Automating invoice processing can also reduce costs and cut errors, resulting in fewer hurdles and higher productivity.

What used to take hours can now run in the background while you work on another project or take a well-deserved coffee break to read another Agilytic article!
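The checks behind such a score and its flags could be sketched as below. The exact formula and thresholds are not given in this article, so the weighting scheme, the modified z-score based on the median absolute deviation, and the 3.5 cut-off are assumptions chosen for the example.

```python
import re
import statistics


def has_invalid_characters(name):
    """Flag digits in a field that should only contain a name."""
    return bool(re.search(r"\d", name))


def weighted_ocr_score(confidences, weights):
    """Weighted average of per-word OCR confidences (weights are hypothetical)."""
    return sum(c * w for c, w in zip(confidences, weights)) / sum(weights)


def outlier_flags(amounts, threshold=3.5):
    """Flag totals far from the batch median, using a modified z-score."""
    median = statistics.median(amounts)
    mad = statistics.median(abs(a - median) for a in amounts)
    if mad == 0:
        return [False] * len(amounts)  # no spread in the batch, nothing to flag
    return [abs(0.6745 * (a - median) / mad) > threshold for a in amounts]


batch_totals = [120.0, 135.5, 128.0, 131.0, 12800.0]
print(outlier_flags(batch_totals))
print(has_invalid_characters("J0hn Smith"))
```

A median-based measure is used here because a single extreme total, such as a misplaced decimal separator, would distort a mean and standard deviation computed on a small batch.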
