In Celaton’s world, data is anything that is submitted to an organisation. That might be facts or numbers, as per the dictionary definition, but it is also invoices, sales orders, claims and customer correspondence. Data is anything an organisation receives, whether by post, email, web form or any other channel, that needs to be processed and understood in order to take further action and gain insight. In this sense, data is fundamental to an organisation’s daily operational activities.
For an organisation looking to streamline the processing of documents with automation technologies such as Robotic Process Automation (RPA) or Intelligent Process Automation (IPA), there are three main types of data to consider: ‘Structured’, ‘Semi-Structured’ and ‘Unstructured’. These categories reflect how complex the documents are for the respective technology to process.
Semi-structured data, by contrast, contains semantic tags but does not conform to the structure associated with typical relational databases. For organisations, semi-structured data is the most common type and is often found within invoices, sales orders and some forms. The data contained within these documents can move around the page. For example, one supplier’s sales order format may differ from another’s, which typically makes these documents more labour intensive to process. Because traditional DPA solutions take a rules-based approach to data identification and extraction, they may struggle with this variation. Within an Accounts Payable process, for instance, an organisation might receive tens of thousands of invoices from a wide variety of suppliers, all of which need processing for payment. Handling such variety at volume is challenging for traditional DPA solutions because every new format received requires an amendment to a template, and reprogramming the software each time costs both time and money.
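To make the template problem concrete, here is a minimal Python sketch of a rules-based extractor. It is an illustration only, not how any particular DPA product works: the supplier names, the regular expressions and the sample invoice lines are all hypothetical. It shows that a layout with a matching template is extracted, while an unseen layout returns nothing until someone writes a new template for it.

```python
import re

# Hypothetical per-supplier templates: one regular expression per known layout.
TEMPLATES = {
    "supplier_a": re.compile(
        r"Invoice No:\s*(?P<number>\S+)\s+Total:\s*(?P<total>[\d.]+)"
    ),
    "supplier_b": re.compile(
        r"Amount Due\s+(?P<total>[\d.]+)\s+Ref\s+(?P<number>\S+)"
    ),
}


def extract_with_templates(text):
    """Return (template name, extracted fields), or None if no template matches."""
    for name, pattern in TEMPLATES.items():
        match = pattern.search(text)
        if match:
            return name, match.groupdict()
    return None


# Made-up sample documents for illustration.
known = "Invoice No: INV-001 Total: 120.50"
unseen = "Bill #INV-002 | Sum payable: 99.00"

known_result = extract_with_templates(known)    # matches the supplier_a template
unseen_result = extract_with_templates(unseen)  # no template exists for this layout
```

In this sketch the unseen invoice can only be handled by adding another entry to TEMPLATES, which is the per-format amendment described above.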
IPA platforms, such as Celaton’s inSTREAM™, enable organisations to process documents with greater variation and at high volume, because they apply Machine Learning algorithms within a system called ‘Human in the Loop’. inSTREAM learns through the natural consequence of processing and by collaborating with human operators, who teach it about each new document or exception. This means the platform does not need to be reprogrammed for every new document type received, which reduces cost and time and significantly improves process optimisation and scalability.
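The ‘Human in the Loop’ idea can be sketched in a few lines of Python. This is a toy under stated assumptions and is not inSTREAM’s implementation: the class name, the layout fingerprint and the operator callback are hypothetical, and a simple lookup table stands in for a Machine Learning model. What it illustrates is the workflow: an unfamiliar layout is routed to a person once, and what that person teaches is reused for later documents of the same kind.

```python
class HumanInTheLoopExtractor:
    """Toy extractor that asks an operator once per unseen layout, then remembers."""

    def __init__(self, ask_operator):
        self.ask_operator = ask_operator  # callable standing in for a human reviewer
        self.learned_labels = {}          # layout key -> label that precedes the total
        self.operator_requests = 0        # how often a person had to step in

    @staticmethod
    def layout_key(text):
        # Crude layout fingerprint (an assumption): the first word of the document.
        return text.split()[0].lower()

    def extract_total(self, text):
        key = self.layout_key(text)
        if key not in self.learned_labels:
            # Exception path: route to the operator and store what they teach.
            self.learned_labels[key] = self.ask_operator(text)
            self.operator_requests += 1
        label = self.learned_labels[key]
        # Take the token immediately after the learned label.
        return text.split(label, 1)[1].split()[0]


def operator(text):
    # Stand-in for a person pointing at the label next to the invoice total.
    return "Sum payable:" if "Sum payable:" in text else "Total:"


extractor = HumanInTheLoopExtractor(operator)
first = extractor.extract_total("Bill #INV-002 | Sum payable: 99.00")
second = extractor.extract_total("Bill #INV-003 | Sum payable: 45.10")
```

Here the operator is consulted for the first ‘Bill’ document only. The second is processed from what was learned, with no change to the code.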
The final category, unstructured data, is defined as having no predefined format and is typically text-heavy and written in the human voice. This makes it much more difficult to collect, process and analyse. Organisations tend to receive unstructured data within customer correspondence and in some claims. Because this data is often linked to customer experience, delays in processing can affect a company’s reputation and potential competitive advantage.
Because IPA platforms use Machine Learning and ‘Human in the Loop’, they can be applied to process these documents despite the complexity of unstructured data. However, it is important to note that manual processing may still be required in some instances: for example, when a document is handwritten, or when volumes are too low for the platform to learn effectively. In these instances, it may not be cost-effective to deploy Intelligent Automation technology, as it can be difficult to achieve a return on investment (ROI).
In conclusion, however broad and confusing the terminology surrounding data might be, organisations can no longer ignore the important role that processing data effectively plays in business success. By understanding the different types of documents and the data contained within them, organisations can better identify the most effective technology applications for their processes and achieve sustainable long-term efficiencies.