Document Understanding

Document Understanding (DU) includes the following features and enhancements:

Contents

  1. Provision to support multiple OCR engines  
  2. Architecture Upgrade 
  3. Show detailed error messages for projects which are not ready

Provision to support multiple OCR engines  

In DU, users can choose any one of the available OCR engines based on customer needs. The default OCR engine is Tesseract; the other options are OCRmyPDF and Azure OCR. Additional OCR engines must be configured during deployment.
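
The selection rule described above (one engine per project, Tesseract as the default, other engines usable only when configured at deployment) can be sketched as follows. This is an illustration only, not DU's implementation: the function `resolve_ocr_engine`, the identifiers in `SUPPORTED_ENGINES`, and the `deployed` parameter are hypothetical names, and the sketch assumes the default engine needs no extra deployment configuration.

```python
from typing import Iterable, Optional

# Hypothetical identifiers for the three engines named in this section.
SUPPORTED_ENGINES = ("tesseract", "ocrmypdf", "azure_ocr")
DEFAULT_ENGINE = "tesseract"


def resolve_ocr_engine(requested: Optional[str], deployed: Iterable[str]) -> str:
    """Return the engine a project would use.

    requested: engine chosen for the project, or None to use the default.
    deployed:  additional engines configured during deployment (assumed input).
    """
    engine = DEFAULT_ENGINE if requested is None else requested.strip().lower()
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"Unknown OCR engine: {requested!r}")

    available = {name.strip().lower() for name in deployed}
    # Assumption: the default engine is always available.
    available.add(DEFAULT_ENGINE)

    if engine not in available:
        raise ValueError(f"OCR engine {engine!r} was not configured during deployment")
    return engine
```

For example, `resolve_ocr_engine(None, [])` falls back to the default engine, while requesting an engine that is missing from `deployed` raises a `ValueError`.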

To create a project with the required OCR engine, follow these steps:

  1. Log in to DU.

  2. To create a new project, click Create New Project. The Create Project screen is displayed, as shown in the figure.

     

  3. Enter the basic details of the project in the Project Details tab:

    • Enter the name of the project in the Project Name field.

    • Enter a brief description in the Description field.

  4. Select the Document Type as required.

  5. Click Save to save the Select Type configurations.

  6. Select the required classifier from the Select Classifiers block. The available options are CMS Document Classifier and Language Classifier.

  7. Click Save to save the Select Classifiers configurations.

  8. Select the required option from the OCR Configurations block, as shown in the figure.

  9. Click Save to save the OCR Configurations settings.

Architecture Upgrade 

The scalable DU architecture broadly comprises four layers: AI Components, DU, PWF, and SmartVision.

  1. AI Components - Train, test, and run inference with AI components. Includes classification, extraction, OCR, and image/document enrichment and correction.

  2. DU - A reduced pipeline that allows DU to get insights for any specific document type. Integrates seamlessly with AI Components and the global model library.

  3. PWF - Document ingestion, document enrichment, and classification are handled by the workflow. Supports customer-specific configurations, pre-processors, and post-processors. Allows seamless integration with AI Components, DU, and the pre-built insights workflow.

  4. SmartVision - Create a business solution using a combination of workflows, for example, Claims Settlement, Loan Processing, Exploratory Search, and Document Linking.

This architecture enables timely customer onboarding and allows other teams to contribute post-processors, classifiers, extractors, and similar components.
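
To make the layering more concrete, the sketch below shows one way a workflow could chain contributed pre-processors, a core step that calls a classifier and an insights step, and contributed post-processors. It is a simplified illustration under stated assumptions, not the actual DU or PWF code: the `Workflow` class and the functions `classify`, `extract_insights`, `strip_whitespace`, and `add_status` are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Document = Dict[str, object]
Step = Callable[[Document], Document]


@dataclass
class Workflow:
    """Hypothetical workflow: pre-processors, one core step, post-processors."""
    core: Step
    pre_processors: List[Step] = field(default_factory=list)
    post_processors: List[Step] = field(default_factory=list)

    def run(self, doc: Document) -> Document:
        # Apply every step in order, passing the document along.
        for step in [*self.pre_processors, self.core, *self.post_processors]:
            doc = step(doc)
        return doc


def classify(doc: Document) -> Document:
    # Stand-in for an AI component (a trivial keyword rule, for illustration).
    text = str(doc.get("text", "")).lower()
    return {**doc, "doc_type": "invoice" if "invoice" in text else "unknown"}


def extract_insights(doc: Document) -> Document:
    # Stand-in for the DU layer producing insights for a document.
    return {**doc, "insights": {"length": len(str(doc.get("text", "")))}}


def strip_whitespace(doc: Document) -> Document:
    # Example of a contributed pre-processor.
    return {**doc, "text": str(doc.get("text", "")).strip()}


def add_status(doc: Document) -> Document:
    # Example of a contributed post-processor.
    return {**doc, "status": "processed"}


workflow = Workflow(
    core=lambda doc: extract_insights(classify(doc)),
    pre_processors=[strip_whitespace],
    post_processors=[add_status],
)
result = workflow.run({"text": "  Invoice 001  "})
```

Because each step is a plain function from document to document, another team could contribute a new pre-processor or post-processor in this sketch without changing the workflow itself.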

Show detailed error messages for projects which are not ready

Going forward, DU displays the following detailed messages when a project is not ready:
•    Document Type Classifier is not selected
•    Field configuration is not complete
•    Incomplete Project Configuration
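
A readiness check that produces the messages listed above might look like the following minimal sketch. The `ProjectState` class, its fields, and the `readiness_messages` function are hypothetical; the conditions that trigger each message are assumptions made for illustration, and only the message texts come from this section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProjectState:
    """Hypothetical snapshot of a project's configuration."""
    document_type_classifier: Optional[str] = None
    fields_configured: bool = False
    configuration_complete: bool = False


def readiness_messages(project: ProjectState) -> List[str]:
    """Return every message explaining why the project is not ready.

    An empty list means the project is considered ready in this sketch.
    """
    messages: List[str] = []
    if not project.document_type_classifier:
        messages.append("Document Type Classifier is not selected")
    if not project.fields_configured:
        messages.append("Field configuration is not complete")
    if not project.configuration_complete:
        messages.append("Incomplete Project Configuration")
    return messages
```

Returning all applicable messages at once, rather than stopping at the first, lets a user see every outstanding configuration gap in a single pass.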

 

 

Copyright © 2021 UST Global. All Rights Reserved.