Original article was published by Jonathan Grandperrin on Deep Learning on Medium
Software developers, there is an easy way of processing invoices in your apps
Anyone who has manually typed data from an invoice into a spreadsheet will probably agree that it’s boring and not the most fulfilling experience. Fortunately, many software vendors have recently focused on improving accounting workflows. Most businesses are now equipped with productivity apps for managing accounts payable, accounts receivable, and many other document-based accounting use cases.
But if you are a software vendor in this area, you are probably in one of these two situations:
- You provide your users with an interface to manually input or validate data.
- You provide your users with a seamless experience of not dealing with the document at all. In this case, you have to process the invoices internally with either a manual data entry or a business process outsourcing workflow.
What if in scenario 1, you could easily create an interface that saves your users time and energy by instantly filling in reliable information for validation?
What if in scenario 2, you could reliably automate most of your invoice flow only with technology, cutting operational costs to a minimum?
Does either scenario seem too good to be true, so you think you’ll never be able to cover both with one solution?
Never say never and welcome aboard!
At mindee, our sole focus is to provide you with brand new computer vision technologies for document processing, so you can focus on your core mission: maximizing the value of your product and making your users happy.
How it works
We adapted the best of recent computer vision research to the field of invoice processing and packaged it into a simple REST API.
As you’d expect, the API takes invoices as input (JPEG, PDF…) and returns, in about one second, a structured response with all the information you need (total amount, taxes, merchant ID, due date and so on).
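To make the request and response flow concrete, here is a minimal Python sketch. It is only an illustration: the endpoint URL, the authentication header, and the JSON field names are all assumptions made up for this example, so check the public documentation for the real ones. The request is built but never sent, and a made-up response stands in for the API's answer.

```python
import json
import urllib.request

# Hypothetical endpoint: a placeholder, not the real API URL.
API_URL = "https://api.example.com/v1/invoices/parse"


def build_request(file_bytes: bytes, content_type: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying one invoice file."""
    return urllib.request.Request(
        API_URL,
        data=file_bytes,
        method="POST",
        headers={
            "Content-Type": content_type,         # e.g. "image/jpeg" or "application/pdf"
            "Authorization": f"Token {api_key}",  # assumed auth scheme
        },
    )


def extract_fields(response_body: str) -> dict:
    """Pull the fields the article mentions out of an assumed JSON layout."""
    payload = json.loads(response_body)
    wanted = ("total_amount", "taxes", "merchant_id", "due_date")
    return {name: payload.get(name) for name in wanted}


# Made-up response, used so the sketch runs without network access.
sample_response = json.dumps({
    "total_amount": 120.0,
    "taxes": 20.0,
    "merchant_id": "FR123456789",
    "due_date": "2020-10-31",
    "extra": "ignored",
})

request = build_request(b"%PDF-1.4 ...", "application/pdf", "my-api-key")
fields = extract_fields(sample_response)
print(fields)
```

In a real integration you would pass `request` to `urllib.request.urlopen` and feed the response body to `extract_fields`, adjusting the field names to whatever the API actually returns.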
Unlike the traditional OCR-based approaches built over the past 30 years, we use deep learning computer vision rather than OCR. Believe it or not, traditional approaches start by reading every word on an invoice and only then try to work out which word carries which piece of information. As a result, finding the information relies much more on semantics than on the image itself.
To us, that seemed like a very “inhuman” way of doing things.
That is why we built full computer vision algorithms that first locate where in the image a specific piece of information sits and then read only that small part of the image. This brought us into unexplored performance territory in terms of:
- precision (>95%): a user not correcting data is a happy user.
- speed (~1s / page): did you say real time?
- robustness (Latin alphabet, low-quality images and PDFs): it works from day one for any new customer and any new geography. Did someone want to expand?
- data privacy: try us on data protection. Your data never even touches storage hardware, and we don’t use any third party.
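The locate-then-read idea behind these numbers can be sketched in a few lines of Python. This is a toy illustration, not our production algorithm: the "image" is a list of text rows, and `toy_locate` and `toy_read` are hypothetical stand-ins for the two learned models. What matters is the order of operations: find the region for each field first, then read only that region.

```python
from typing import Callable, Dict, List, Tuple

Image = List[str]                # toy "image": rows of characters
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def crop(image: Image, box: Box) -> Image:
    """Return only the part of the image that lies inside the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def locate_then_read(
    image: Image,
    locate: Callable[[Image], Dict[str, Box]],
    read: Callable[[Image], str],
) -> Dict[str, str]:
    """Step 1: find one region per field. Step 2: read only that region."""
    return {field: read(crop(image, box)) for field, box in locate(image).items()}


def toy_locate(image: Image) -> Dict[str, Box]:
    """Illustrative locator: the total sits to the right of the TOTAL label."""
    boxes: Dict[str, Box] = {}
    for r, row in enumerate(image):
        col = row.find("TOTAL")
        if col != -1:
            boxes["total_amount"] = (r, col + len("TOTAL"), r + 1, len(row))
    return boxes


def toy_read(region: Image) -> str:
    """Illustrative reader: join the cropped rows and trim the whitespace."""
    return " ".join(line.strip() for line in region).strip()


invoice = [
    "ACME Corp          Invoice 42",
    "Widgets x3              90.00",
    "TOTAL                  108.00",
]

result = locate_then_read(invoice, toy_locate, toy_read)
print(result)
```

Because the reader only ever sees the cropped region, the rest of the page (merchant name, line items) is never transcribed at all, which is the contrast with the read-everything-first approach.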
Last but not least, you can test it right away, and we won’t ask for your credit card or make you contact a salesperson.
Also, check out our public documentation here: https://mindee.com/documentation/apis/invoice-parsing
If you have questions about this API or your own specific use case, feel free to talk to one of our experts: