Sharp AR-NS2 (serv.man2) User Manual / Operation Manual
Chapter 8
Converting Images to Text
Introduction
Sharpdesk lets you convert a non-editable, graphical image containing text into a file that can be
edited with your favorite word processor. You can convert an image at any time while working in
Sharpdesk by dragging the image onto the Convert by OCR option on the Output Zone
bar. Sharpdesk also preserves your document layout.
Once an image has been turned into an editable document, you can then change it, annotate it,
and treat it like any other document you created from scratch in its native application.
You can convert any .TIF, .BMP, .DCX, .JPG, or .PCX image into one of many standard output
formats.
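If you handle many scanned files, you may want to check which ones have a listed image type before dragging them into Sharpdesk. The following Python sketch is not part of Sharpdesk; the function name is hypothetical, and the check looks only at the file extension, not at the file contents. It assumes that only the five extensions listed above are accepted.

```python
from pathlib import Path

# Input extensions listed in this chapter. Assumption: no other
# spellings (for example .tiff or .jpeg) are accepted.
SUPPORTED_INPUT_EXTENSIONS = {".tif", ".bmp", ".dcx", ".jpg", ".pcx"}


def is_convertible(filename: str) -> bool:
    """Return True if the file extension is one of the listed image types."""
    return Path(filename).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS


if __name__ == "__main__":
    for name in ["invoice.TIF", "scan.jpg", "notes.png"]:
        print(name, is_convertible(name))
```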
Keep in mind that text conversion accuracy depends on the quality of the original image. A
poor-quality fax or copy might not convert correctly, because the engine will have difficulty reading
the text characters. Most laser-printed documents, on the other hand, should convert well. For tips on
getting the best results, refer to “Text Conversion Tips” later in this chapter.
Conversion Capabilities
The Sharpdesk conversion engine can recognize text characters in the following languages:
English
French
German
Italian
Spanish
Portuguese
Swedish
Dutch
Standard Output Formats
The following output file formats may be selected for the Convert by OCR process:
Sharpdesk User’s Guide
Format                         File Extension
AmiPro                         .sam
Comma Delimited Text           .txt
Microsoft Excel                .xls
Formatted Text                 .txt
Hypertext Markup Language      .htm
Lotus 123 (v 2.0)              .wk1
Plain Text                     .txt
Plain Text with Line Breaks    .txt
Rich Text Format               .rtf
Tab Delimited Text             .txt
TypeReader Image               .tif
Microsoft Word                 .doc
Corel WordPerfect (v 5.0)      .wpf
Corel WordPerfect (v 5.1)      .wpf
Convert an Image using Drag-and-Drop
To convert an image to text using drag and drop:
1. In Sharpdesk, select the image you want to convert in the Sharpdesk work area.
2. Make sure the Output Zone bar appears by selecting the Output Zone command from the
View menu.
3. Drag and drop the image onto the Convert by OCR option on the Output Zone bar. A dialog
appears showing you the progress of the conversion.
OCR Processing Dialog
[If “Show this dialog box every time OCR is performed” is checked (see the Properties discussion
below), the Convert by OCR Preference tab is displayed first.] This dialog shows you the
name of the file being converted, the current conversion step (auto-rotating, de-skewing,
locating, or recognizing), and the progress of the entire job. When the conversion completes, the
dialog closes automatically and the finished text document opens in the appropriate application.
Clicking Cancel stops the conversion in progress.
Convert by OCR Properties
To OCR an image and specify the options you wish to apply:
1. Choose the Preferences command in the Tools menu.
The Convert By OCR Properties dialog displays.
Convert By OCR Properties Dialog
2. You can specify the following text conversion options. When finished, click OK to save
your options.
OPTION/SETTING DESCRIPTION
File Format
Select a file format that the conversion should use for its output.
Locate Method
This option controls the method that the text conversion engine uses to locate
text on a page. You can select from:
Normal (Multi-Column): This setting is suitable for most cases. It is to be used
on text with ordinary paragraphs, pages with mixed text and graphics, and on
multi-column pages of text such as in newspapers and magazines. This setting
is also good for pages that have tables.
Force Single Column: Use this setting when the page contains side-by-side
blocks of text that you want the text conversion engine to read from left to
right across the page. With this setting, the engine always locates text from the
left margin to the right margin of the page, regardless of the spacing between
groups of words.
Picture Processing
This setting determines whether to look for text within pictures. Your choices
are:
Text & Pictures: Use this setting when you want the text conversion engine to
attempt to locate text that resides in regions on the page that it views as
pictures.
Text Only: The text conversion engine does not attempt to locate text that
resides in regions on the page that it considers to be pictures.
Quality of Input Image
This setting tells the text conversion engine about the quality of the image it is
about to process. Your choices are:
Letter: This is appropriate for most documents.
Dot Matrix: Use this setting when characters in the input document are
printed in a mono-spaced font and made up of dots that are not touching.
Degraded Quality: Check this item when the input text is severely degraded
or is hard to read. This will significantly slow down the text conversion
processing.
Miscellaneous
Custom Dictionary: Clicking this button invokes a standard File Find dialog
box letting you select a custom dictionary file (.DIC). Custom dictionary files
improve the recognition ability when processing uncommon words. Sharpdesk
comes with a simple custom dictionary, USER.DIC, that will be selected by
default. If you are converting a document that contains technical terms that
aren’t commonly used, you can place these terms in your custom dictionary,
one term per line. Use Microsoft Notepad or a similar text editor to
change your custom dictionaries or create new ones.
Language: Lets you select a language setting for the text conversion engine.
Illegible Character: Lets you specify the character that the text conversion
engine outputs when it cannot read a character. The default is “~”.
Show this dialog box every time OCR is performed: Checking this item
forces this property dialog box to be displayed every time you perform OCR,
giving you an opportunity to change the settings.
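If you maintain a long list of technical terms, you can generate a custom dictionary file with a script instead of typing it in Notepad. The Python sketch below writes one term per line, as described under Custom Dictionary above. The function name and the file name TERMS.DIC are hypothetical. The character encoding (cp1252) and the Windows line endings are assumptions about what a Notepad-edited file would contain; this chapter does not specify them.

```python
import os
import tempfile
from typing import Iterable


def write_custom_dictionary(path: str, terms: Iterable[str],
                            encoding: str = "cp1252") -> int:
    """Write unique, non-empty terms to a dictionary file, one per line.

    Returns the number of terms written. The encoding default is an
    assumption, not something the manual documents.
    """
    seen = set()
    unique_terms = []
    for term in terms:
        cleaned = term.strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique_terms.append(cleaned)
    # newline="\r\n" makes Python write Windows line endings for each "\n".
    with open(path, "w", encoding=encoding, newline="\r\n") as handle:
        for term in unique_terms:
            handle.write(term + "\n")
    return len(unique_terms)


if __name__ == "__main__":
    # Write to a temporary folder so the example leaves no files behind.
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "TERMS.DIC")  # hypothetical file name
        count = write_custom_dictionary(target, ["de-skewing", "Sharpdesk", "de-skewing"])
        print(count, "terms written to", target)
```

After creating the file, select it with the Custom Dictionary button in the Convert By OCR Properties dialog.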
Text Conversion Tips
If you do not get the results you want when you convert an image to text, you might check the
following items:
• Check your scanning options. You might have the brightness or contrast levels set too high
or too low.
• Check your text conversion preferences.
• Clean your scanner glass and scanner lid with a soft cloth.
• Reposition the document on the scanner glass.
• Change your Quality of Input Image settings. Sometimes an image can be converted by
using the Dot Matrix or Degraded Quality settings.
• Be sure you have sufficient disk space to hold the temporary files that the text conversion
engine creates.
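When you compare settings, it can help to measure how much of a converted document the engine could not read. The Python sketch below is not a Sharpdesk feature. It counts the illegible-character marker (“~” by default, as described under Illegible Character) in text saved in the Plain Text format. The function name is hypothetical, and the sketch assumes the marker is a single character. A “~” that genuinely appears in the original document is counted as well, so treat the result as a rough indicator only.

```python
def illegible_ratio(text: str, marker: str = "~") -> float:
    """Return the share of non-whitespace characters equal to the marker."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    return sum(ch == marker for ch in visible) / len(visible)


if __name__ == "__main__":
    # Hypothetical sample of converted text containing two unread characters.
    sample = "Th~ qu~ck brown fox"
    print(f"{illegible_ratio(sample):.3f}")
```

A lower ratio after changing the Quality of Input Image setting suggests that the new setting suits the document better.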