Smart Vision v7.2-Release Description

Contents

  1. DU Enhancements
    1. Enabling pipeline configuration in DU
      1. Hardware Requirements
      2. Accuracy Benchmarking
    2. Support customization of VESPA pre-processors
      1. Feature List as of 7.2
      2. Pre-trained models
  2. Smart Vision Package Workflow
    1. Provision to configure Users for Auto allocation
    2. Provision to persist custom layer data in PWF
    3. Review screen enhancements to improve productivity of reviewer

DU Enhancements

Enabling pipeline configuration in DU

The Document Understanding (DU) platform now lets you choose the required pipeline during project setup. This helps you optimize the hardware requirements for each customer's needs and thereby offer competitive pricing.
DU includes two active pipelines, VESPA and SmartExtract. Both can extract Insights and Table fields of interest from documents.

However, for best results, the recommended usage is as follows.

  1. Customers dealing with form-type documents that contain similar information, structured or semi-structured data, labeled values, and tables should use the SmartExtract pipeline. (Example document types: Invoices, Purchase Orders)

  2. Customers requiring long text extraction from documents with paragraphs and sections should use the VESPA pipeline. (Example document types: Contracts, Legal documents)

  3. Customers dealing with both types of documents should use a combination of the VESPA and SmartExtract pipelines.
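
The recommendation above amounts to a simple decision rule. The following Python sketch is only an illustration of that rule: the function name and its boolean parameters are hypothetical and are not part of any DU API.

```python
from typing import List


def recommend_pipelines(has_form_documents: bool,
                        has_long_text_documents: bool) -> List[str]:
    """Return the pipelines suggested by the usage guidance above.

    Hypothetical helper for illustration only; DU itself exposes this
    choice through the "Extraction Pipeline selection" section.
    """
    pipelines = []
    if has_form_documents:
        # Form-type documents such as invoices and purchase orders
        pipelines.append("SmartExtract")
    if has_long_text_documents:
        # Long-text documents such as contracts and legal documents
        pipelines.append("VESPA")
    return pipelines
```

A customer handling both invoices and contracts would get both pipelines back, which matches the third recommendation.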

Screenshot highlights "Extraction Pipeline selection" section in DU.

Hardware Requirements

<To be added>

Accuracy Benchmarking

<To be added>

Support customization of VESPA pre-processors

The VESPA Field Configurator, introduced in release 7.1, enables users to add new fields or modify existing fields depending on customer requirements. This feature is now more flexible: it includes provision for custom pre-processors.

Pre-processors help fine-tune the extraction process and thereby yield better accuracy. This feature allows you to view the available pre-processors, modify them, and add new ones to suit the specific requirements of the customer. It applies to the VESPA pipeline.

Feature List as of 7.2

The feature list as of 7.2 is as follows.


| SL.No | Area | Product Feature | Feature Description | Release Version |
|---|---|---|---|---|
| 1 | Document Intake | Reads images & PDF documents | Reads from TIFF, PNG image formats & PDF documents | 6.4.3 |
| 2 | | Support direct upload | Supports direct upload of files, emails, FTP | 6.4.3 |
| 3 | Prepare Documents | Digitize documents | Uses Tesseract for OCR | 6.4.3 |
| 4 | Document Pre-processing | Orientation Correction | Identifies and corrects orientation issues in the document for better accuracy in text extraction. | 6.4.3 |
| 5 | | Grayscale | Converts the documents to grayscale. A grayscale image is monochrome, composed exclusively of shades of gray. The contrast ranges from black at the weakest intensity to white at the strongest. | 6.4.3 |
| 6 | | Binarization | Converts the document to binary. A binary image consists of pixels that can have one of exactly two colors, black and white. The process is based on the parameter "Threshold". Valid values are in the range 1 to 255; any pixel above the threshold value is converted to white and all others to black. | 6.4.3 |
| 7 | | External REST Call | Invokes a REST API service to process the document. | 6.4.3 |
| 8 | | Handle secure PDFs | Ability to handle secure PDF documents | 7.0 |
| 9 | Document Recognition | Splitting of bundled documents | Ability to extract individual invoices from zip files | 6.4.3 |
| 10 | | Document Split - Scan page | Ability to split invoices based on the scan page | 6.4.3 |
| 11 | | Document Split - Page numbers | Ability to split invoices based on page numbers | 6.4.3 |
| 12 | | Language Classifier | AI model to classify English vs non-English documents | 7.1 |
| 13 | | CMS Classifier | AI model to classify MSA, SOW & Addendum documents | 7.0 |
| 14 | Document Extraction | Header fields | Ability to extract data from header fields | 6.4.3 |
| 15 | | Tables - Single & Multi-page | Ability to extract data from single & multi-page tables. | 7.0 |
| 16 | | Long Text & formatted fields | Ability to extract long text & formatted fields | 7.2 |
| 19 | | Document Types | Extraction of Retail Invoices, SOW, MSA, Addendum, Annual Reports, PAN card is supported | 7.0 |
| 20 | Document Post-processing | Document Linking (available for CMS) | Ability to link MSA, SOW & Addendum documents | 7.0 |
| 21 | | Exploratory Search (available for CMS) | Ability to search across linked documents | 7.0 |
| 22 | View | View Extracted information | GUI to review the extracted information (inclusive of candidates wherever available) | 6.4.3 |
| 27 | Configure & Train | Pipeline configuration | Flexibility to choose extraction engine - VESPA/SmartExtract | 7.2 |
| 28 | | Update FOI configurations | Provision to modify the config file associated with an FOI | 7.1 |
| 29 | | Configure new FOI | Provision to add new FOIs for data extraction by providing the appropriate configurations. | 7.1 |
| 30 | | Configure new document type | Provision to add a new document type and corresponding FOIs by providing the appropriate configurations. | 7.1 |
| 31 | | Configure new pre-processors | Provision to attach appropriate pre-processors to support data extraction | 7.2 |
| 32 | | Update pre-processors | Provision to update pre-processors to support data extraction | 7.2 |
| 33 | | Feedback based learning | Ability to utilize EITL feedback for training the model and using the trained model for predictions | 7.2 |
| 34 | Export | Excel download | Excel based download for extracted fields | 7.0 |
| 35 | NFR | Data Archival | Archive data based on the retention period | 7.0 |
| | | Queue management | Provision to manage document extraction from multiple channels via routing keys | 7.0 |
| 36 | | Optimal hardware configuration | DU hardware requirements can be optimized by deploying only the required pipelines (VESPA or SmartExtract). Helps to offer competitive pricing to customers. | 7.2 |
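
The Binarization rule in the feature list (pixels above the Threshold become white, all others black, with valid thresholds from 1 to 255) can be sketched in a few lines. This is a minimal illustration in plain Python on a grid of 8-bit grayscale values; the function name `binarize` and the list-of-lists image representation are assumptions made for the example and do not describe DU's actual implementation.

```python
from typing import List

WHITE = 255
BLACK = 0


def binarize(gray_pixels: List[List[int]], threshold: int) -> List[List[int]]:
    """Convert a grayscale pixel grid (values 0-255) to black and white.

    Pixels strictly above `threshold` become white; all others become black.
    """
    if not 1 <= threshold <= 255:
        raise ValueError("threshold must be within the range 1 to 255")
    return [
        [WHITE if pixel > threshold else BLACK for pixel in row]
        for row in gray_pixels
    ]
```

For example, with a threshold of 128, a pixel value of 128 maps to black and 129 maps to white, because only values above the threshold are converted to white.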


Pre-trained models

EITL model trained with Navistar production invoices.

<To be updated>

Smart Vision Package Workflow

With improved DU capabilities, we now cater to document types beyond invoices, so there is a compelling need for a generic PWF that supports multiple document types.

SmartVision PWF is introduced to provide basic review and approve capability for any document type.

Note: SmartVision PWF is a generalized version of the Invoice Extraction (IE) PWF and includes all the functionality currently available in the IE PWF.

Personas – SVision Supervisor, SVision Reviewer, and Installation Engineer.

Document Types – Any (as set in the attached DU project)
Functionalities:


Additionally, non-functional requirements supported today will continue with SmartVision PWF.  

The feature list as of 7.2 is as follows.

| SL. No | Area | Product Feature | Feature Description | Release Version |
|---|---|---|---|---|
| 1 | PWF Project Creation | Document Import Configuration | Provision to configure inbound FTP location & execution schedule | 7.2 |
| 2 | | Auto-allocation Configuration | Provision to configure: 1) Execution schedule; 2) Daily threshold limits (min & max page counts); 3) Criteria to prioritize incoming Documents (based on file name) | 7.2 |
| 3 | | Auto-allocation Configuration | Provision to configure users to participate in auto allocation | 7.2 |
| 4 | | Export Configuration | Provision to configure outbound FTP location & execution schedule | 7.2 |
| 5 | | FOI configuration | 1. Option to choose from available DU projects in the same region. 2. Display all FOIs configured in DU and allow configuring the fields requiring review | 7.2 |
| 6 | | Configure FOIs for manual data entry | Provision to configure additional fields for manual data entry. (Data extraction not supported) | 7.2 |
| 7 | | Configure tables for manual data entry | Provision to configure additional tables for manual data entry. (Data extraction not supported) | 7.2 |
| 8 | | Field Validations | Provision to configure basic field validations (data type validations) for DU extracted and manual fields. Provision to attach validation messages to be displayed during review | 7.2 |
| 9 | | Field Transformations | Provision to subset/transform extracted data | 7.2 |
| 10 | | External Service Configuration | Provision to configure an external service to enable business validations from the preview screen (e.g. validations against master data, multi-field validations) | 7.2 |
| 11 | | Personas | Provision to configure personas: Installation Engineer, AP Supervisor, AP Clerk. Role based access restrictions | 7.2 |
| 12 | Document Listing Dashboard | Document Summary View | 1. Cards to display the count of Documents by status: Waiting for Approval (default view), Approved/Corrected Approvals, Rejected, Escalated, Sent. 2. By default, the Waiting for Approval tab displays all pending Documents. The other tabs display Documents with the respective status for the batches received today. 3. Automatic refresh of cards to display the latest status | 7.2 |
| 13 | | Document Listing by batches | Separate tabs available for each Card. Each tab displays batches & Documents with the respective status. Documents within a batch are sorted by file name | 7.2 |
| 14 | | User Details View (Collapsible) | Quick view of Document distribution at user level, available for AP Supervisor | 7.2 |
| 15 | | Filter by User | Provision to filter for unassigned Documents or a specific user from the User view. | 7.2 |
| 16 | | Filter by Datetime | Provision to filter the Documents by batch run date & time. Calendar supports built-in controls like Today, This Week, This Month, This Year, Last Year. | 7.2 |
| 17 | | Search Documents | Provision to search Documents by full or partial file name (Contains) | 7.2 |
| 18 | | Multi-select | Provision to select multiple Documents for Assign/Delete operations | 7.2 |
| 19 | | User time zone based display | User time zone based display in Dashboard and other screens | 7.2 |
| 20 | Document Allocation | Auto-allocation | Provision to automatically distribute Documents to AP Clerks. Alerts to Supervisor notifying the status of allocation | 7.2 |
| 21 | | Assign Documents | Provision to assign Documents to AP Clerks | 7.2 |
| 22 | Document Review & Correction | Review Documents | Provision to view: 1. Insights & Line item extracted data; 2. Manually added fields & tables. Invoice fields are displayed in the preview screen in a specific order to help the BPO user focus on the most commonly entered fields. | 7.2 |
| 23 | | Split & Original Document view | Provision to view Original & split Document files | 7.2 |
| 24 | | Document navigation | 1. Provision to navigate across Documents within a batch. 2. Display the Document file name & position within the batch | 7.2 |
| 25 | | PDF Highlighting | Highlight the value of the selected field in the previewed document for reference | 7.2 |
| 26 | | Auto scroll toggle option | Provision to turn auto scroll on/off | 7.2 |
| 27 | | Line item dock options | Ability to dock the line item section at the bottom or to the right | 7.2 |
| 28 | | Split correction | Ability to correct the split PDF in case of errors; the user can specify individual page numbers, a page range, or a combination of these. | 7.2 |
| 29 | | Candidate display | 1. Provide value suggestions for every extracted field based on model prediction. 2. Auto-select the suggested value with the highest confidence score by default | 7.2 |
| 30 | | Edit Document data | 1. Ability to edit Document data, select from the candidate list, add new rows in a table, and delete all rows in one go. 2. Edit is restricted on "Sent" Documents to maintain data integrity | 7.2 |
| 31 | | Identify Duplicate Invoices | Provision to identify duplicate Invoices (Invoice number & Vendor code combination) | 7.2 |
| 32 | | Tabular Data Extraction | Improvements to tabular data extraction using the SmartExtract engine | 7.2 |
| 33 | | Table/Column Rebounding | Provision to extract tabular data based on user inputs. The user can rebound table boundaries, split/re-adjust columns, and draw new tables. The system extracts data based on the drawings provided by the user. | 7.2 |
| 34 | | Point & Crop feature | Point & Crop feature to copy data from the PDF in one click and thereby eliminate manual key-in by the user. Applicable for all types of documents under the Finance domain. | 7.2 |
| 35 | | Save Document | Provision to save work in progress | 7.2 |
| 36 | | Shelve Document data | 1. Provision to save multiple versions of work in progress. 2. Display the available shelves. 3. Others can view the versions and use them as needed. 4. Provision to fall back to system extracted data if user edits need to be discarded | 7.2 |
| 37 | | Approve Documents | Ability to save & approve reviewed Documents | 7.2 |
| 38 | Exception Processing | Add New Documents | Ability to split a Document record from the original file and enter Document details manually. | 7.2 |
| 39 | | Reject Documents | Ability to reject Documents and attach a reject reason | 7.2 |
| 40 | | Escalate Documents | Ability to escalate Documents and attach an escalate reason | 7.2 |
| 41 | | Delete Documents | 1. Provision to delete incorrect Documents from the Waiting for Approval, Approved, Rejected, and Escalated tabs. 2. Restriction on deleting Sent Documents to maintain data integrity | 7.2 |
| 42 | Export Documents | Send Documents for downstream processing | 1. Ability to send completed batches for downstream processing. A batch is sent when all the Documents within it are approved or rejected. 2. Sent Card includes the count & details of exported Documents. 3. Provision to search the history of exports done by users. 4. Provision to filter exported data by date range and user | 7.2 |
| 43 | | Preview option for Documents ready for export | Ability to preview the Documents ready for export from the custom layer | 7.2 |
| 44 | Reporting | Advanced Search for Reporting | Provision to filter batches/Documents based on status, datetime, and user. Users can build the search criteria; concatenation is supported up to 3 levels. Provision to include history records in the search. Minimize & Maximize options for the Advanced Search window | 7.2 |
| 45 | | Excel Download | 1. Provision to download the Documents (header & line item data) from each card. 2. Provision to download the advanced search results into Excel | 7.2 |
| 46 | | APIs for custom reporting | APIs from the Advanced Search & Accuracy Analytics sections | 7.2 |
| 47 | Advanced Analytics | Accuracy Analytics | 1. Display accuracy indicator in listing & review screens for approved Documents: Green for non-edited fields, Red for edited fields, Yellow for fields edited based on suggestions. 2. Accuracy graphs to display average accuracy per field (Insights & Line items). 3. Drill down graphs to show the OCR accuracy of the document used for extraction. 4. Manual fields are excluded from analytics | 7.2 |
| 48 | | Time Analytics | 1. Displays average review time per Document and average extraction time per Document. 2. Drill down to display average review time per Document taken by each user | 7.2 |
| 49 | | Filter by date | Provision to view the analytics data for a given duration | 7.2 |
| 50 | NFR | Audit Trail | Maintain an audit trail at Document level for the state transitions. | 7.2 |
| 51 | | Provision to plug in multiple data sources | Provision to override DU extracted data from other data sources. | 7.2 |
| 52 | | Data Archival Framework | 1. Purge & Archival mechanism to systematically remove processed Documents. 2. Configurable retention period. 3. On-demand restore options | 7.2 |
| 53 | | E2E Traceability View | The Runlist page displays all the processes from FTP file intake to export of Documents. 1. Cards display the overall batch processing stats. 2. Detailed section includes FTP job details, DU stats (details of Success/In-progress/Failed Documents), Auto-allocation job details, Document status within PWF, and Sent job details. 3. Success/Failure flagging for inbound & outbound processes in the Runlist & Document listing pages. 4. Hint to the user on possibilities of mismatch and next steps | 7.2 |
| 54 | | EITL Model Training | Ability to leverage user feedback for training the prediction model and to use the trained model for improved predictions. | 7.2 |
| 55 | | Document lock-out | 1. Lock the Document while a user is editing in the Document preview screen. 2. Lock-out duration is configurable. 3. Automated refresh while the user is active in the Document preview screen | 7.2 |
| 56 | | Application time-out | Graceful exit when the user is inactive. Time-out duration is configurable | 7.2 |
| 57 | | UI Mono repo | Single UI repository for Document PWF enabling re-use | 7.2 |
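
The Split correction feature listed above lets a user specify individual page numbers, a page range, or a combination of both. A minimal Python sketch of how such input could be interpreted is shown here. The comma-and-hyphen syntax (for example `1, 3-5, 8`), the function name, and the validation rules are all assumptions for illustration, since these release notes do not define the input format.

```python
from typing import List


def parse_page_spec(spec: str, total_pages: int) -> List[int]:
    """Expand a page specification such as "1, 3-5, 8" into page numbers.

    Assumed syntax: comma-separated entries, each either a single page
    number or an inclusive "start-end" range. Pages are 1-based.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"invalid page selection: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

Returning a sorted, de-duplicated list keeps overlapping entries such as `2-4, 3` from selecting the same page twice.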

Provision to configure Users for Auto allocation

SmartVision PWF supports automated distribution of invoices to users, with configurable distribution rules and automated alerts to the Supervisor. Users can be added to or removed from the auto-allocation process depending on business needs.
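
As an illustration of how allocation to a configured set of users might behave, here is a minimal Python sketch. It is not the PWF allocation algorithm: the least-loaded-user strategy, the single maximum page limit per user, and all names are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple


def auto_allocate(
    documents: List[Tuple[str, int]],
    participating_users: List[str],
    max_pages_per_user: int,
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Distribute documents to the users configured for auto allocation.

    `documents` is a list of (file_name, page_count) pairs, already in
    priority order. Each document goes to the least-loaded user who can
    take it without exceeding the page limit (hypothetical rule).
    Returns the assignments per user and the documents left unassigned.
    """
    load = {user: 0 for user in participating_users}
    assignments: Dict[str, List[str]] = {user: [] for user in participating_users}
    unassigned: List[str] = []
    for file_name, page_count in documents:
        candidates = [
            user for user in participating_users
            if load[user] + page_count <= max_pages_per_user
        ]
        if not candidates:
            unassigned.append(file_name)  # left for manual assignment
            continue
        chosen = min(candidates, key=lambda user: load[user])
        assignments[chosen].append(file_name)
        load[chosen] += page_count
    return assignments, unassigned
```

Removing a user from `participating_users` excludes that user from distribution without touching the other rules, which mirrors the flexibility described above.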

Provision to persist custom layer data in PWF

SmartVision PWF can persist custom layer data and use it in conjunction with system-extracted data.

Review screen enhancements to improve productivity of reviewer

The SmartVision Review screen includes the following enhancements.

 

 


Copyright © 2021 UST Global. All Rights Reserved.