Managing Tesseract Models

Smart Vision uses the default Tesseract language models for OCR. The platform lets you improve the default Tesseract model for any specific field in a document by training it on ground truth values. The improved Tesseract model can then be used in other structured document extraction projects.
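
As a rough illustration of what ground truth data can look like outside the platform: the open-source Tesseract training tooling (tesstrain) pairs each line image with a single-line transcription file that has the same base name and the extension .gt.txt. The sketch below assumes that convention only. How Smart Vision stores ground truth internally is not described here and may differ, and the function names and file names are hypothetical.

```python
from pathlib import Path


def ground_truth_path(image_path: Path) -> Path:
    """Return the transcription path paired with a field image (a.png -> a.gt.txt)."""
    return image_path.with_suffix(".gt.txt")


def write_ground_truth(image_path: Path, text: str) -> Path:
    """Write the corrected field value as a single-line UTF-8 transcription.

    Hypothetical helper: it only mirrors the tesstrain file-naming convention.
    """
    target = ground_truth_path(image_path)
    # A ground truth file holds one line of text, so collapse stray
    # whitespace and newlines before writing.
    target.write_text(" ".join(text.split()) + "\n", encoding="utf-8")
    return target
```

For example, calling write_ground_truth(Path("invoice_no.png"), "INV-0042") would create invoice_no.gt.txt next to the image crop, containing the corrected value on one line.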

To create a Tesseract project, perform the following steps:

  1. Log in to SmartOps.

  2. Access Smart Vision from the SmartOps home page as shown in the figure.


    This displays the Smart Vision project listing page as shown.


    From this page you can create a new project and view and manage projects, pretrained models, and Tesseract models.

  3. Click the Tesseract tab. This displays the Tesseract screen as shown in the figure.

  4. Click  to create a new project. This displays the Build Project screen as shown in the figure.


    The Build tab of the Build Project screen is displayed by default.

  5. Configure the basic attributes for creating a new project as explained in the table below.

    Field                 Description

    Project Name          Name of the project.

    Project Description   Brief description of the project.

    Project Access        Access privileges for the project. Values in the list are:
                          • Private
                          • Public

  6. Click Create. The Upload tab opens as shown in the figure.

  7. Follow the steps below for Template Driven document extraction.

    1. Select the required documents (golden template) by dragging and dropping them or by browsing to a path.

    2. Click Add Form. The forms are uploaded and the screen refreshes as shown in the figure.

       

    3. You can update or add new documents to the form.

    4. Click Next. The Contour tab opens as shown in the figure.

       
    5. Mark the required fields on the document preview as shown in the figure.

    6. Enter the marking name in the Tesseract Marking window and click Save.

    7. Select the field from the Field tab and edit the ground truth value in the Configuration tab as shown in the figure.

    8. You can hide the markings and draw freely by using the Hide Marking and draw freely option.

    9. You can also mask the input image except for the configured fields by using the Mask input image except below Fields option.

    10. To apply a marking to other documents, click the corresponding icon, select the required documents, and click Proceed.

    11. To flag the field as "to be verified against the image and make required corrections", click the corresponding icon.

    12. To delete all markings for this label, click the corresponding icon.

    13. To delete all markings and fields, click the corresponding icon.

  8. Click Next. The Publish tab opens as shown in the figure.

  9. To publish the project, click Publish.
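
The option in step 7 that masks the input image except for the configured fields can be pictured with a small sketch. It assumes a grayscale image held as a list of pixel rows and field markings given as (left, top, right, bottom) boxes with exclusive right and bottom edges. The function name, the box format, and the white fill value are assumptions made for illustration, not the platform's actual implementation.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive (assumed format).
Box = Tuple[int, int, int, int]


def mask_except_fields(
    image: List[List[int]], boxes: List[Box], fill: int = 255
) -> List[List[int]]:
    """Return a copy of the image where every pixel outside the boxes is `fill`."""
    # Start from a fully masked (white) canvas of the same shape.
    masked = [[fill] * len(row) for row in image]
    for left, top, right, bottom in boxes:
        # Clamp each box to the image bounds, then copy the original pixels back.
        for y in range(max(top, 0), min(bottom, len(image))):
            row = image[y]
            for x in range(max(left, 0), min(right, len(row))):
                masked[y][x] = row[x]
    return masked
```

With a mask like this, only the marked field regions remain visible to OCR, which keeps surrounding text from influencing the result for a field. The original image is left unchanged.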


Copyright © 2021 UST Global. All Rights Reserved.