Foxit PDF SDK
Public Member Functions | |
| OCRConfig () | |
| Constructor. | |
| OCRConfig (boolean is_detect_pictures, boolean is_remove_noise, boolean is_correct_skew, boolean is_enable_text_extraction_mode, boolean is_sequentially_process, boolean is_auto_overwrite_resolution, int resolution_to_overwrite, int confidence) | |
| Constructor, with parameters. | |
| synchronized void | delete () |
| Clean up related C++ resources immediately. | |
| int | getConfidence () |
| Get the confidence threshold used to determine whether the recognized text is reliable. | |
| boolean | getIs_auto_overwrite_resolution () |
| Get the flag that decides whether to set the resolution automatically. | |
| boolean | getIs_correct_skew () |
| Get the flag that decides whether to enable skew correction. | |
| boolean | getIs_detect_pictures () |
| Get the flag that decides whether to detect pictures. | |
| boolean | getIs_enable_text_extraction_mode () |
| Get the flag that decides whether to enable text extraction mode. | |
| boolean | getIs_remove_noise () |
| Get the flag that decides whether to remove noise from the images in the PDF. | |
| boolean | getIs_sequentially_process () |
| Get the flag that decides whether the OCR engine processes pages sequentially in one process. | |
| int | getResolution_to_overwrite () |
| Get the resolution (DPI) used to overwrite the image resolution of the PDF document. | |
| void | set (boolean is_detect_pictures, boolean is_remove_noise, boolean is_correct_skew, boolean is_enable_text_extraction_mode, boolean is_sequentially_process, boolean is_auto_overwrite_resolution, int resolution_to_overwrite, int confidence) |
| Set all configuration values at once. | |
| void | setConfidence (int value) |
| Set the confidence threshold used to determine whether the recognized text is reliable. | |
| void | setIs_auto_overwrite_resolution (boolean value) |
| Set the flag that decides whether to set the resolution automatically. | |
| void | setIs_correct_skew (boolean value) |
| Set the flag that decides whether to enable skew correction. | |
| void | setIs_detect_pictures (boolean value) |
| Set the flag that decides whether to detect pictures. | |
| void | setIs_enable_text_extraction_mode (boolean value) |
| Set the flag that decides whether to enable text extraction mode. | |
| void | setIs_remove_noise (boolean value) |
| Set the flag that decides whether to remove noise from the images in the PDF. | |
| void | setIs_sequentially_process (boolean value) |
| Set the flag that decides whether the OCR engine processes pages sequentially in one process. | |
| void | setResolution_to_overwrite (int value) |
| Set the resolution (DPI) used to overwrite the image resolution of the PDF document. | |
This class represents the configuration used for OCR.
| com.foxit.sdk.addon.ocr.OCRConfig.OCRConfig | ( | boolean | is_detect_pictures, |
| boolean | is_remove_noise, | ||
| boolean | is_correct_skew, | ||
| boolean | is_enable_text_extraction_mode, | ||
| boolean | is_sequentially_process, | ||
| boolean | is_auto_overwrite_resolution, | ||
| int | resolution_to_overwrite, | ||
| int | confidence | ||
| ) |
Constructor, with parameters.
| [in] | is_detect_pictures | Decide whether to detect pictures. |
| [in] | is_remove_noise | Decide whether to remove noise from the images in the PDF. |
| [in] | is_correct_skew | Decide whether to enable skew correction. |
| [in] | is_enable_text_extraction_mode | Decide whether to enable text extraction mode. |
| [in] | is_sequentially_process | Decide whether the OCR engine processes pages sequentially in one process. |
| [in] | is_auto_overwrite_resolution | Decide whether to overwrite the resolution automatically. |
| [in] | resolution_to_overwrite | The resolution (DPI) used to overwrite the image resolution. This parameter takes effect only when the parameter is_auto_overwrite_resolution is set to false. |
| [in] | confidence | The confidence threshold used to determine whether the recognized text is reliable. The value range is from 0 to 100. |
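Because the constructor takes six consecutive boolean arguments, it is easy to pass them in the wrong order. The sketch below shows one way to keep the call readable by labeling each argument. It does not use the SDK: OcrConfigStandIn and its field names are hypothetical and only mirror the documented parameter order. With the SDK on the classpath, the same argument list would be passed to com.foxit.sdk.addon.ocr.OCRConfig. The value chosen for is_auto_overwrite_resolution is an example, since this page does not state its default.

```java
// Sketch only: OcrConfigStandIn is a hypothetical stand-in, not the SDK class.
class OcrConfigConstructorSketch {
    // Builds a configuration that forces a fixed resolution and a confidence threshold.
    static OcrConfigStandIn scannedPageConfig() {
        return new OcrConfigStandIn(
                true,   // is_detect_pictures: documented default
                true,   // is_remove_noise: documented default
                true,   // is_correct_skew: documented default
                false,  // is_enable_text_extraction_mode: documented default
                false,  // is_sequentially_process: documented default
                false,  // is_auto_overwrite_resolution: example choice so the next value applies
                300,    // resolution_to_overwrite (DPI): documented default
                30);    // confidence: example threshold in the documented range 0 to 100
    }

    public static void main(String[] args) {
        OcrConfigStandIn config = scannedPageConfig();
        // prints "300 DPI, confidence >= 30"
        System.out.println(config.resolutionToOverwrite + " DPI, confidence >= " + config.confidence);
    }
}

// Hypothetical value holder that mirrors the documented constructor parameter order.
final class OcrConfigStandIn {
    final boolean detectPictures;
    final boolean removeNoise;
    final boolean correctSkew;
    final boolean enableTextExtractionMode;
    final boolean sequentiallyProcess;
    final boolean autoOverwriteResolution;
    final int resolutionToOverwrite;
    final int confidence;

    OcrConfigStandIn(boolean detectPictures, boolean removeNoise, boolean correctSkew,
                     boolean enableTextExtractionMode, boolean sequentiallyProcess,
                     boolean autoOverwriteResolution, int resolutionToOverwrite, int confidence) {
        this.detectPictures = detectPictures;
        this.removeNoise = removeNoise;
        this.correctSkew = correctSkew;
        this.enableTextExtractionMode = enableTextExtractionMode;
        this.sequentiallyProcess = sequentiallyProcess;
        this.autoOverwriteResolution = autoOverwriteResolution;
        this.resolutionToOverwrite = resolutionToOverwrite;
        this.confidence = confidence;
    }
}
```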
| synchronized void com.foxit.sdk.addon.ocr.OCRConfig.delete | ( | ) |
Clean up related C++ resources immediately.
| com.foxit.sdk.addon.ocr.OCRConfig.getConfidence | ( | ) |
Get the confidence threshold used to determine whether the recognized text is reliable.
The value range is [0, 100]. The larger the value, the stricter the confidence requirement. For example, if this value is set to 30, recognized text with a confidence lower than 30 is considered unreliable and is removed. Default value: 0.
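The effect of the threshold can be illustrated with a small, self-contained sketch. It is not the OCR engine's implementation: the Word type and the keepReliable method are hypothetical names, and the sketch assumes that each recognized word carries a confidence in [0, 100] and that a word is kept when its confidence is not lower than the threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConfidenceFilterSketch {
    // Hypothetical recognized word with a confidence score in [0, 100].
    static final class Word {
        final String text;
        final int confidence;

        Word(String text, int confidence) {
            this.text = text;
            this.confidence = confidence;
        }
    }

    // Keeps the text of every word whose confidence is not lower than the threshold.
    static List<String> keepReliable(List<Word> words, int threshold) {
        if (threshold < 0 || threshold > 100) {
            throw new IllegalArgumentException("threshold must be in [0, 100]");
        }
        List<String> kept = new ArrayList<>();
        for (Word word : words) {
            if (word.confidence >= threshold) {
                kept.add(word.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Word> words = Arrays.asList(
                new Word("Invoice", 92),
                new Word("t0tal", 18),
                new Word("2024", 30));
        System.out.println(keepReliable(words, 30)); // prints [Invoice, 2024]
        System.out.println(keepReliable(words, 0));  // prints [Invoice, t0tal, 2024]
    }
}
```

With the default threshold of 0, no word can fall below the threshold, so nothing is removed.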
| com.foxit.sdk.addon.ocr.OCRConfig.getIs_auto_overwrite_resolution | ( | ) |
Get the flag that decides whether to set the resolution automatically.
true means the OCR engine automatically detects and overwrites the image resolution. false means the resolution is set manually through the parameter resolution_to_overwrite.
| com.foxit.sdk.addon.ocr.OCRConfig.getIs_correct_skew | ( | ) |
Get the flag that decides whether to enable skew correction.
| com.foxit.sdk.addon.ocr.OCRConfig.getIs_detect_pictures | ( | ) |
Get the flag that decides whether to detect pictures.
| com.foxit.sdk.addon.ocr.OCRConfig.getIs_enable_text_extraction_mode | ( | ) |
Get the flag that decides whether to enable text extraction mode.
Usually, when some parts of the text are not found as a text block, such as text on a picture or handwriting, it is recommended to set this parameter to true. It is recommended to set this parameter to false when the complete text of a picture is already recognized correctly, or when the sample contains images or patterns that may be mistaken for text. In short, this parameter enables the engine to recognize everything that remotely resembles letters as text. true means to enable text extraction mode, while false means not to enable it. Default value: false.
| com.foxit.sdk.addon.ocr.OCRConfig.getIs_remove_noise | ( | ) |
Get the flag that decides whether to remove noise from the images in the PDF.
| com.foxit.sdk.addon.ocr.OCRConfig.getIs_sequentially_process | ( | ) |
Get the flag that decides whether the OCR engine processes pages sequentially in one process.
This parameter is only used in OCR conversion. true means the OCR engine processes pages sequentially in one process, which increases the conversion time.
false means the OCR engine detects the number of processes automatically. If only one page is processed or the system has only one processor, one process is used. Otherwise, parallel processing is used.
Default value: false.
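The decision described for is_sequentially_process can be sketched as a small function. This is an illustration, not the engine's implementation: processCount is a hypothetical name, and the cap at the smaller of page count and processor count is an assumption, since the description only says that parallel processing is used.

```java
class ProcessCountSketch {
    // Returns how many processes the sketch would use for a conversion.
    static int processCount(boolean isSequentiallyProcess, int pageCount, int availableProcessors) {
        if (pageCount < 1 || availableProcessors < 1) {
            throw new IllegalArgumentException("pageCount and availableProcessors must be positive");
        }
        if (isSequentiallyProcess) {
            return 1; // pages are processed one after another in a single process
        }
        if (pageCount == 1 || availableProcessors == 1) {
            return 1; // nothing to parallelize, or no extra processor to use
        }
        // Assumption: the actual number of parallel processes is not documented.
        return Math.min(pageCount, availableProcessors);
    }

    public static void main(String[] args) {
        System.out.println(processCount(true, 10, 8));  // prints 1
        System.out.println(processCount(false, 1, 8));  // prints 1
        System.out.println(processCount(false, 10, 8)); // prints 8
    }
}
```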
| com.foxit.sdk.addon.ocr.OCRConfig.getResolution_to_overwrite | ( | ) |
Get the resolution (DPI) used to overwrite the image resolution of the PDF document.
This parameter takes effect only when the parameter is_auto_overwrite_resolution is set to false. Default value: 300.
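The interaction between is_auto_overwrite_resolution and resolution_to_overwrite can be summarized in a short sketch. The method effectiveDpi and the detectedDpi argument are hypothetical: they stand for whatever resolution the engine would detect on its own, which this page does not describe.

```java
class ResolutionSketch {
    // Picks the resolution the sketch would apply to an image.
    static int effectiveDpi(boolean isAutoOverwriteResolution, int resolutionToOverwrite, int detectedDpi) {
        // The manual value is ignored while automatic overwriting is enabled.
        return isAutoOverwriteResolution ? detectedDpi : resolutionToOverwrite;
    }

    public static void main(String[] args) {
        System.out.println(effectiveDpi(true, 300, 150));  // prints 150
        System.out.println(effectiveDpi(false, 300, 150)); // prints 300
    }
}
```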
| void com.foxit.sdk.addon.ocr.OCRConfig.set | ( | boolean | is_detect_pictures, |
| boolean | is_remove_noise, | ||
| boolean | is_correct_skew, | ||
| boolean | is_enable_text_extraction_mode, | ||
| boolean | is_sequentially_process, | ||
| boolean | is_auto_overwrite_resolution, | ||
| int | resolution_to_overwrite, | ||
| int | confidence | ||
| ) |
Set all configuration values at once.
| [in] | is_detect_pictures | Decide whether to detect pictures. |
| [in] | is_remove_noise | Decide whether to remove noise from the images in the PDF. |
| [in] | is_correct_skew | Decide whether to enable skew correction. |
| [in] | is_enable_text_extraction_mode | Decide whether to enable text extraction mode. |
| [in] | is_sequentially_process | Decide whether the OCR engine processes pages sequentially in one process. |
| [in] | is_auto_overwrite_resolution | Decide whether to overwrite the resolution automatically. |
| [in] | resolution_to_overwrite | The resolution (DPI) used to overwrite the image resolution. This parameter takes effect only when the parameter is_auto_overwrite_resolution is set to false. |
| [in] | confidence | The confidence threshold used to determine whether the recognized text is reliable. The value range is from 0 to 100. |
| com.foxit.sdk.addon.ocr.OCRConfig.setConfidence | ( | int | value | ) |
Set the confidence threshold used to determine whether the recognized text is reliable.
The value range is [0, 100]. The larger the value, the stricter the confidence requirement. For example, if this value is set to 30, recognized text with a confidence lower than 30 is considered unreliable and is removed. Default value: 0.
| [in] | value | The confidence threshold used to determine whether the recognized text is reliable. |
| com.foxit.sdk.addon.ocr.OCRConfig.setIs_auto_overwrite_resolution | ( | boolean | value | ) |
Set the flag that decides whether to set the resolution automatically.
true means the OCR engine automatically detects and overwrites the image resolution. false means the resolution is set manually through the parameter resolution_to_overwrite.
| [in] | value | Decide whether to set the resolution automatically. |
| com.foxit.sdk.addon.ocr.OCRConfig.setIs_correct_skew | ( | boolean | value | ) |
Set the flag that decides whether to enable skew correction.
| [in] | value | Decide whether to enable skew correction. true enables skew correction, and false disables it. Default value: true. |
| com.foxit.sdk.addon.ocr.OCRConfig.setIs_detect_pictures | ( | boolean | value | ) |
Set the flag that decides whether to detect pictures.
| [in] | value | Decide whether to detect pictures. true means pictures are detected during the analysis process. false means pictures are not detected, so picture content in the images of the PDF document might be interpreted as text. If you would like to extract only text from the image, set this option to false. Default value: true. |
| com.foxit.sdk.addon.ocr.OCRConfig.setIs_enable_text_extraction_mode | ( | boolean | value | ) |
Set the flag that decides whether to enable text extraction mode.
Usually, when some parts of the text are not found as a text block, such as text on a picture or handwriting, it is recommended to set this parameter to true. It is recommended to set this parameter to false when the complete text of a picture is already recognized correctly, or when the sample contains images or patterns that may be mistaken for text. In short, this parameter enables the engine to recognize everything that remotely resembles letters as text. true means to enable text extraction mode, while false means not to enable it. Default value: false.
| [in] | value | Decide whether to enable text extraction mode. |
| com.foxit.sdk.addon.ocr.OCRConfig.setIs_remove_noise | ( | boolean | value | ) |
Set the flag that decides whether to remove noise from the images in the PDF.
| [in] | value | Decide whether to remove noise from the images in the PDF. This can be useful if an image contains noise such as random black dots or speckles. If the strokes of the letters in the image are thin, set this option to false, otherwise noise removal will affect the recognition of the text. true means noise in the image is blocked during the OCR process and is not recognized as text. false means noise is not blocked. Default value: true. |
| com.foxit.sdk.addon.ocr.OCRConfig.setIs_sequentially_process | ( | boolean | value | ) |
Set the flag that decides whether the OCR engine processes pages sequentially in one process.
This parameter is only used in OCR conversion. true means the OCR engine processes pages sequentially in one process, which increases the conversion time.
false means the OCR engine detects the number of processes automatically. If only one page is processed or the system has only one processor, one process is used. Otherwise, parallel processing is used.
Default value: false.
| [in] | value | Decide whether the OCR engine processes pages sequentially in one process. |
| com.foxit.sdk.addon.ocr.OCRConfig.setResolution_to_overwrite | ( | int | value | ) |
Set the resolution (DPI) used to overwrite the image resolution of the PDF document.
This parameter takes effect only when the parameter is_auto_overwrite_resolution is set to false. Default value: 300.
| [in] | value | The resolution (DPI) used to overwrite the image resolution of the PDF document. |