Classification Task

Do you need a fast way to classify your documents?

Are you searching for a method to quickly classify your visual outputs?

Now CaptureFast’s auto classification comes in handy!

Document classification comes before the data extraction step and is a core part of the process. During data extraction, a customer may have several fields to manage and different types of documents to read. With the help of CaptureFast, you can select the document type-specific fields to be extracted and sort different document types using a wide array of classification methods.

There are two common approaches to this step, each with its own pros and cons: rule-based systems and AI (machine learning) tools.

AI tools are mostly used for document sets that are unlimited in variety and unstructured. This approach uses an algorithm to draw conclusions from pre-classified examples, which are then applied to categorize new data. A machine learning method needs at least 1000 true and 1000 false pre-classified samples to set up a new algorithm for the required classification.

Rule-based systems are generally not error-prone, but they are used for a limited number (fewer than 1000) of document types. CaptureFast uses a rule-based system as its auto-classification method. This approach brings accuracy to its products. In less than 10 minutes, you can conveniently add a new document type to the system.

Here are the features of CaptureFast auto classification system:

QR code/barcode based classification

Glyph-based classification

Optical layout-based classification

Text existence-based classification (text must exist and/or must not exist)

Classification based on the distance between text blocks

Regular expressions

Recognition areas such as the upper half of the document, the header, the footer, or the entire document
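
To make the text-based features above more concrete, here is a minimal Python sketch of how text existence rules, regular expressions and recognition areas could work together. It is an illustration only, not CaptureFast's implementation: the `Rule` class, the `select_region` and `classify` functions, the region names and the sample document types are all hypothetical, and the sketch assumes the page text is already available as a list of OCR lines.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical rule for one document type (not CaptureFast's API)."""
    doc_type: str
    must_exist: list = field(default_factory=list)      # phrases that must appear
    must_not_exist: list = field(default_factory=list)  # phrases that must be absent
    patterns: list = field(default_factory=list)        # regular expressions that must match
    region: str = "entire"  # "header", "footer", "upper_half" or "entire"


def select_region(lines, region):
    """Return the lowercased text of the requested recognition area.

    Assumption: header and footer are the first and last 20% of the
    lines (at least one line each).
    """
    n = len(lines)
    edge = max(1, n // 5)
    if region == "header":
        chosen = lines[:edge]
    elif region == "footer":
        chosen = lines[-edge:]
    elif region == "upper_half":
        chosen = lines[: (n + 1) // 2]
    else:
        chosen = lines
    return "\n".join(chosen).lower()


def classify(lines, rules):
    """Return the doc_type of the first rule whose conditions all hold, else None."""
    for rule in rules:
        text = select_region(lines, rule.region)
        if not all(p.lower() in text for p in rule.must_exist):
            continue
        if any(p.lower() in text for p in rule.must_not_exist):
            continue
        if not all(re.search(rx, text, re.IGNORECASE) for rx in rule.patterns):
            continue
        return rule.doc_type
    return None


# Hypothetical rules and sample page, for illustration only.
RULES = [
    Rule("invoice", must_exist=["invoice"], must_not_exist=["credit note"],
         patterns=[r"inv-\d{4,}"], region="upper_half"),
    Rule("credit_note", must_exist=["credit note"]),
]

page = [
    "ACME Ltd.",
    "Invoice INV-20231",
    "Item  Qty  Price",
    "Widget 2 10.00",
    "Thank you for your business",
]

print(classify(page, RULES))  # invoice
```

In this sketch the rules are checked in order and the first match wins, so more specific document types should be listed first. Adding a new document type means appending one more `Rule`, which is the kind of quick setup a rule-based approach allows.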

In addition to its widely known rule-based system, CaptureFast offers the option of hybrid auto classification. This option combines AI and rule-based systems in one more efficient model. Although a rule-based system is practical and meets most needs, it can still be made more efficient. CaptureFast engineers add a human touch by fine-tuning the classification for each document type. With this addition, as the number of documents grows quickly, the time needed to handle each specific document can be adjusted accordingly.
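
One way to picture a hybrid model is "rules first, machine learning as a fallback". The Python sketch below shows that idea only. The text above does not say how CaptureFast actually combines the two systems, so the ordering, the `min_confidence` threshold, the `manual_review` outcome and every function name here are assumptions made for illustration.

```python
def hybrid_classify(text, rule_classifier, ml_classifier, min_confidence=0.8):
    """Try the rules first, then the model, then hand over to a person.

    rule_classifier(text) returns a label or None.
    ml_classifier(text) returns a (label, confidence) pair.
    Both are hypothetical callables supplied by the caller.
    """
    label = rule_classifier(text)
    if label is not None:
        return label, "rule"
    label, confidence = ml_classifier(text)
    if confidence >= min_confidence:
        return label, "ml"
    # Assumption: low-confidence documents go to a person for review.
    return None, "manual_review"


# Stand-in classifiers, for illustration only.
def keyword_rules(text):
    return "invoice" if "invoice" in text.lower() else None


def fake_model(text):
    # Pretend model: confident only when it sees the word "receipt".
    if "receipt" in text.lower():
        return "receipt", 0.93
    return "unknown", 0.40


print(hybrid_classify("Invoice INV-20231", keyword_rules, fake_model))
# ('invoice', 'rule')
print(hybrid_classify("Store receipt, total 12.50", keyword_rules, fake_model))
# ('receipt', 'ml')
print(hybrid_classify("Handwritten note", keyword_rules, fake_model))
# (None, 'manual_review')
```

The second value in each result records which path made the decision, which makes it easier to see where fine-tuning a document type would pay off.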