Robotic Process Automation (RPA) started as a technology that proved its value by automating rules-based, repetitive tasks throughout an enterprise. From legacy applications to modern, complex ones, RPA can automate practically anything a person does by hand. As the technology matured, however, it became clear that traditional RPA cannot handle the significant share of work that requires decision making or cognitive capabilities, such as predicting sales or classifying emails. This is where Artificial Intelligence (AI) comes into play, taking traditional RPA to the next level.
Think of traditional RPA as the human hand and AI as the brain. The brain and the hands achieve little in isolation, but working in tandem they can achieve the unthinkable; human evolution is an example of that. Today RPA software providers offer their own AI capabilities to leverage the benefits of this collaboration, with the goal of a complete digital transformation and an intelligently automated enterprise.
Consider how much time employees in an enterprise spend on document processing. It gets even more challenging when the documents vary in structure. Some documents have a fixed format and are easy to read and process, like forms, passports and licenses. Some have no fixed structure at all, like contracts, emails and health records. Others mix fixed and variable data, like invoices, purchase orders and utility bills. This is why automating processes that involve structured, semi-structured and unstructured documents has been very difficult, even with traditional automation software.
RPA provides a way to process documents intelligently, using the AI capabilities that are an integral part of modern automation software. To handle all the documents coming their way, RPA robots learn how to read and interpret them. The robots review rules and expressions and memorize the names of the fields that need to be read and interpreted for each type of document. They are trained using Machine Learning (ML) so that they know how to approach a specific document, be it a passport, receipt, invoice or utility bill. With the help of Machine Learning the robots learn to cope with varying document templates, handwriting, signatures, check-boxes, skewed or rotated pages, various file formats and low-quality scans.
The human in the loop (HIL) feature allows humans to validate the data that the robots have read but are unsure about. Each HIL validation acts as a feedback loop to the robot's machine learning model, so that it learns how to handle the same situation the next time it encounters such data. Combining RPA robots with ML and HIL produces rapid and accurate results, which reduces costs, mitigates the risk of human error and the losses that come with it, and improves customer experience. It also frees employees from mundane daily tasks, making them happier and more productive by letting them focus on higher-value work.
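The HIL feedback loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ExtractedField` type, the `route_fields` and `apply_human_corrections` helpers, and the 0.8 confidence threshold are all hypothetical and do not correspond to any particular RPA product's API.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """One value read by the robot (hypothetical type for this sketch)."""
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction model


def route_fields(fields, threshold=0.8):
    """Split fields into auto-accepted ones and ones a human must review.

    The 0.8 threshold is an assumed value, not a recommended setting.
    """
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


def apply_human_corrections(needs_review, corrections):
    """Apply a human's corrections and record them as feedback examples.

    `corrections` maps field names to the validated value. Fields the
    human leaves untouched are treated as confirmed.
    """
    validated, feedback = [], []
    for field in needs_review:
        corrected = corrections.get(field.name, field.value)
        validated.append(ExtractedField(field.name, corrected, 1.0))
        # Each record could later be used to retrain the model.
        feedback.append(
            {"field": field.name, "predicted": field.value, "validated": corrected}
        )
    return validated, feedback


fields = [
    ExtractedField("invoice_number", "INV-001", 0.97),
    ExtractedField("total", "1O0.00", 0.55),  # letter O misread for zero
]
accepted, needs_review = route_fields(fields)
validated, feedback = apply_human_corrections(needs_review, {"total": "100.00"})
```

Here only the low-confidence `total` field reaches the human, and the correction is kept alongside the original prediction so it can serve as a training example later.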
As you may have gathered by now, document understanding is the intelligent ability to extract and interpret information and meaning from a wide range of document types, storage formats and objects. Document structure can be classified into three main categories: structured, semi-structured and unstructured.
Documents come in many shapes and forms and usually fall into one of the three categories above. A single file can also contain structured, semi-structured and unstructured parts, depending on the context.
Based on this classification there are two main ways of extracting data from documents.
In practice, all these methodologies can be combined in a hybrid approach to fit the document understanding need and the context.
Document understanding combines several techniques in a series of steps to extract and interpret information. In principle it involves five fundamental steps: defining the document types and the data to be extracted (Taxonomy), providing the text and its location (OCR), classifying documents against the specified list of types (Classify), extracting the information (Extract) and having a human confirm the extracted data (HIL or Validate). Optical Character Recognition (OCR) is a method of reading text from images, recognizing each character and its position in the document. It is used to digitize documents that have no native digital text, such as scanned images. OCR is therefore the method behind the digitization step of the document understanding process.
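The five steps above can be sketched as a small Python pipeline. This is a toy illustration, not how any RPA product implements them: the `TAXONOMY` contents, the regular expressions and every function name are assumptions made for this example, the OCR step is replaced by a stand-in that receives text directly, and classification is a simple keyword match where a real system would use a trained model.

```python
import re

# Step 1, Taxonomy: document types and the fields to extract from each.
# The patterns are illustrative assumptions, not production rules.
TAXONOMY = {
    "invoice": {
        "invoice_number": r"Invoice\s*#?\s*:?\s*([A-Z0-9-]+)",
        "total": r"Total\s*:?\s*\$?([0-9]+(?:\.[0-9]{2})?)",
    },
    "receipt": {
        "total": r"Total\s*:?\s*\$?([0-9]+(?:\.[0-9]{2})?)",
    },
}


def digitize(raw_document):
    """Step 2, OCR: stand-in that assumes the text is already available."""
    return raw_document.strip()


def classify(text):
    """Step 3, Classify: pick the first type whose name appears in the text."""
    lowered = text.lower()
    for doc_type in TAXONOMY:
        if doc_type in lowered:
            return doc_type
    return None


def extract(text, doc_type):
    """Step 4, Extract: apply each field pattern defined for the type."""
    fields = {}
    for name, pattern in TAXONOMY[doc_type].items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        fields[name] = match.group(1) if match else None
    return fields


def validate(fields):
    """Step 5, Validate: list the fields a human would need to review."""
    return [name for name, value in fields.items() if value is None]


def process(raw_document):
    """Run all steps and return the type, fields and review list."""
    text = digitize(raw_document)
    doc_type = classify(text)
    if doc_type is None:
        return {"type": None, "fields": {}, "needs_review": ["document_type"]}
    fields = extract(text, doc_type)
    return {"type": doc_type, "fields": fields, "needs_review": validate(fields)}


invoice_result = process("Invoice #: INV-2041\nTotal: $118.50")
receipt_result = process("Receipt\nThank you for shopping with us")
```

In this sketch the invoice is fully extracted, while the receipt has no total and is therefore flagged for human review.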
To set the context, let’s quickly look at what the various AI terms mean.
A good way to understand machine learning is to think of it as a function that maps an input to an output. More precisely, it is a target function that, based on previous learning, can map a new input to a desirable output, in most cases without needing to be retrained. Consider the following image.
Machine learning literally means machines learning through experience. This is similar to a newborn child whose environment feeds data to her brain, so that she learns and relearns automatically. This is called experiential learning, and in the case of ML the experience is the data fed to the algorithm. The software program solves a problem by making a prediction from its input data, which is often labelled or cleaned up before being fed to the machine learning algorithm.
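To make the "function learned from data" idea concrete, here is a minimal sketch that fits a straight line to a handful of labelled examples and then uses it to predict an unseen value. The numbers are made up for illustration, the names `fit_line` and `predict` are hypothetical, and a least-squares line is only one of the simplest possible learners; real models are far richer.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from labelled examples (least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned function to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: month number -> units sold.
months = [1, 2, 3, 4]
units_sold = [2, 4, 6, 8]

model = fit_line(months, units_sold)  # the "learning" step
forecast = predict(model, 5)          # maps a new input to an output
```

Once `fit_line` has run, `predict` can be called on new inputs without refitting, which is the sense in which the learned function maps inputs to outputs.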
Incremental learning in machine learning has the following components:
Depending on the way a machine learning model is trained using datasets machine learning can be classified into:
Machine learning is used in many fields, from general-purpose applications and healthcare to financial services and retail. A few of these can be seen in the following image.
As described in the steps earlier in this article, ML is used to help RPA understand documents. Of the five major steps, machine learning is used in Classification and Extraction, by applying pretrained models and retraining them through the feedback loops in the process. The feedback loops are facilitated either by the human in the loop (HIL) feature or by some kind of manual validation.
Document Understanding through ML provides RPA an edge over traditional automation tools and results in:
Today’s RPA tools come with built-in AI and machine learning capabilities that help them understand documents, so that they can extract information and interpret it according to the enterprise’s business process automation requirements. Document understanding can be used where there is uncertainty (the outcome cannot be determined with 100% accuracy), high variability (rule-based automation won’t work because the structure of the documents varies too much) or unstructured data (the information sits in emails, images, articles and so on). Use cases range from property valuations, loan defaults, inventory forecasts, resume matching, purchase decisions, invoice extraction and email routing to language translation.