Setting up your scanner with TextBridge Pro How to start the program Registering your software New features in TextBridge Pro 11 INTRODUCTION What is optical character recognition TextBridge Pro’s OCR capabilities...
Page 4
Processing from other applications How to set up Direct OCR How to use Direct OCR How to use TextBridge Pro 11 with your PaperPort software Processing documents with Schedule OCR Defining the source of page images...
Page 5
Input from image files Input from scanner Scanning with an ADF Scanning long documents without an ADF Describing the layout of the document Manual zoning Working with zones Zone properties Table grids in the image Using zone templates PROOFING AND EDITING Proofreading OCR results Checking recognized text against original User dictionaries...
Page 6
Testing TextBridge Pro Low memory problems Low disk space problems Supported file types File types for opening and saving images File types for saving recognition results OCR problems Text does not get recognized properly Problems with fax recognition System or performance problems during OCR Uninstalling the software...
Page 7
This User’s Guide This guide introduces you to using TextBridge Pro 11. It includes installation and setup instructions, a description of the program’s commands and working areas, task-oriented instructions, ways to customize and control processing, and technical information.
Page 8
We also assume you are familiar with your scanner and its supporting software, and that the scanner is installed and working correctly before it is set up with TextBridge Pro 11. Please refer to the scanner’s own documentation as necessary. The following conventions are used in this guide: Bold Introduces new terms and presents sub-headings.
Page 9
GETTING ONLINE HELP In addition to using this guide, you can use TextBridge Pro’s online Help to learn about features, settings, and procedures. Online Help is available after you install TextBridge Pro. Online HTML Help Open TextBridge Pro’s online Help at its top level by choosing TextBridge Pro Help Topics at the top of the Help menu.
Page 10
Tech Notes ScanSoft’s web site at www.scansoft.com contains Tech Notes on commonly reported issues using TextBridge Pro 11. Web pages may also offer assistance on the installation process and troubleshooting. Glossary This guide does not include a glossary. The online Help has a comprehensive glossary, with its own alphabetical index and a table of contents.
System requirements Installing TextBridge Pro Setting up your scanner with TextBridge Pro How to start the program Registering your software New features in TextBridge Pro 11
SYSTEM REQUIREMENTS You need the following minimum system requirements to install and run TextBridge Pro 11:
• A computer with a Pentium or higher processor
• Microsoft Windows 95, Windows 98, Windows Me, Windows 2000, or Windows NT 4.0
• 32MB of memory (RAM), 64MB recommended...
INSTALLING TEXTBRIDGE PRO TextBridge Pro 11’s installation program takes you through installation with instructions on every screen. Before installing TextBridge Pro: Make sure your scanner is connected, turned on, and compatible with your system. Close all other applications, especially anti-virus programs.
Page 14
SETTING UP YOUR SCANNER WITH TEXTBRIDGE PRO All files needed for scanner setup and support are copied automatically during the program’s installation. Before using TextBridge Pro 11 for scanning, your scanner should be correctly installed and tested for correct functionality. Scanner installation and setup are done through the Scanner Wizard.
Page 15
Scanner Wizard: Start > Programs > ScanSoft TextBridge Pro 11.0 > Scanner Wizard or Start > Programs > ScanSoft TextBridge Pro 11.0 > TextBridge Pro 11.0 >...
Use TextBridge Pro 11 with ScanSoft’s PaperPort or Pagis document management products, to add OCR services. See How to use TextBridge Pro 11 with your PaperPort software in chapter 3. REGISTERING YOUR SOFTWARE ScanSoft’s registration Wizard runs at the end of installation. We provide an easy electronic form that can be completed in less than five minutes.
Page 17
The family of TextBridge products moves to a new level with the introduction of TextBridge Pro 11. Here are the main areas of innovation and difference compared to the TextBridge Pro Millennium release: Increased accuracy – multiple recognition engines working in tandem offer significant gains in accuracy, especially on degraded documents.
Page 18
OCR Proofreader – Jump through the suspect words in a document and handle them one after the other in a dialog box. Previously, suspect words were only marked. See page 60. OCR verifier – This lets you compare any recognized word with its appearance in the original image as you edit and reformat the document.
This chapter introduces you to the solution: optical character recognition (OCR). It describes how TextBridge Pro 11 uses OCR technology to transform text from scanned pages or image files into editable text for use in your favorite computer applications.
(pixels) that together form character shapes. These present a picture of the text on a page. During OCR, TextBridge Pro 11 analyzes the character shapes in an image and defines solutions to produce editable text. After OCR, you can save the resulting text to a variety of word-processing, desktop publishing or spreadsheet applications.
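The idea of an image as a grid of dots can be sketched in a few lines of Python. This is only an illustration: the 5x5 bitmap and the helper names are assumptions made for this example and are not part of TextBridge Pro, whose recognition engines work on far larger images with much more sophisticated analysis.

```python
# Illustrative only: a tiny black-and-white bitmap in which 1 is a black
# dot (pixel) and 0 is white background. The shape is a hand-made letter "T".
LETTER_T = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]


def render(bitmap):
    """Return the bitmap as text, one row per line, '#' for black pixels."""
    return "\n".join("".join("#" if px else "." for px in row) for row in bitmap)


def black_pixel_count(bitmap):
    """Count the black pixels that make up the character shape."""
    return sum(sum(row) for row in bitmap)
```

Calling `print(render(LETTER_T))` shows the picture a person would read as "T"; to the computer it is still only a grid of dots until OCR turns it into an editable character.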
Documents in TextBridge Pro TextBridge Pro 11 handles documents one at a time. When you acquire your first image (from scanner or from file) a new document is started. Further acquired images are added to the same document, until you save and close it.
TEXTBRIDGE DESKTOP The TextBridge desktop has a title bar and a menu bar along the top and a status bar along the bottom. It has three main working areas, separated by splitters: the Document Manager, the Original Image area and the Text Editor.
The TextBridge Toolbox lets you control processing. It can have three states, depending on which of the three tab buttons on the left is clicked. In the picture, we display its appearance for Manual OCR. We show the program with a three-page document. Page one is the current page, which has been recognized and proofed.
The Image toolbar The Image toolbar contains buttons that allow you to zoom in or out on the current image or to rotate it. They also allow you to work with zones and table dividers. See chapter 3, Manual zoning and Table grids in the image.
The TextBridge Toolbox This Toolbox lets you drive the processing. By default it is located along the top of the TextBridge desktop, just above the working areas. It can be floated and also be docked along the bottom of the desktop. It has three tabs on the left: AutoOCR™, Manual OCR and OCR Wizard.
MANAGING DOCUMENTS The Document Manager is situated on the left of the TextBridge desktop. It has two tabbed panels: Thumbnail view and Detail view. Click a tab to see its view. Both views summarize the pages in the document and are synchronized: the current and selected pages remain the same when you switch views.
Detail view This facility is new to TextBridge Pro 11. It provides an overview of your document with a table. Each row represents one page. Columns present statistical or status information for each page, and (where appropriate) document totals. The picture below shows the default columns on the left and four columns which a user has specified.
Customizing columns in Detail view You can specify which columns of information you want to see in Detail view. Click Customize Details... in the View menu for the following dialog box: This item is highlighted. Click a checkbox to select the item. Highlight an item and use these arrows to...
Closing a document Choose Close in the File menu to close a document. You are prompted to save your document if you have not saved it or you have modified it since the last save. See the next section on saving the document as a TextBridge Document (*.TXD).
You want to build up an archive of recognized documents whose original images remain accessible. The recognized texts allow searching by keywords and other document retrieval techniques. Note Recognition results should be saved from TXD files before installing any TextBridge Pro upgrade. These files may not be upwards compatible to newer TXD file formats, or possibly only the images will be retained when the files are upgraded.
Page 31
multi-page documents, with or without an Automatic Document Feeder (ADF). You can change scanner setup settings or install a new scanner or change the default scanner. Direct OCR This feature provides OCR services directly from your favorite word processor or similar application. Use this panel to register and unregister applications for Direct OCR and to enable or disable this service.
Page 32
Text Editor Use this to show or hide some features in the Text Editor, to define the unit of measurement to be used and to turn word wrapping on or off. Note Some settings have an effect only on future recognition. Examples are the recognition languages, a user dictionary and scanner brightness.
Tutorial: Processing documents This chapter describes different ways you can process a document and also provides information on key parts of this processing. Quick Start Guide Processing documents using the OCR Wizard Processing documents automatically Processing documents manually Processing a document automatically and finishing it manually Processing from other applications (Direct OCR, PaperPort) Processing documents with Schedule OCR The detailed topics are:...
You will process the document automatically and save the recognition results to a file. You will proof the document but will not edit it inside TextBridge Pro 11’s Text Editor.
Page 35
TextBridge Pro 11.0 Place the document correctly in your scanner. Check the three tab buttons to the left of the TextBridge Toolbox. The AutoOCR button... Specifies that you want TextBridge Pro 11 to process the document automatically according to the given settings.
Page 36
Here is an overview of the processing methods you can use. You will find step-by-step guidance for each of them in the following pages. Using the OCR Wizard The OCR Wizard guides you through the selection of settings and commands by asking you questions. It then launches automatic processing. This is a good way to get started if you are new to TextBridge Pro.
PROCESSING DOCUMENTS USING THE OCR WIZARD The OCR Wizard takes you through six settings panels, guiding you to make settings for your document and then launching automatic processing. Context-sensitive help is available for all Wizard panels. The OCR Wizard can run only when there is no document open in TextBridge Pro.
Page 38
3. The third panel (shown below) lets you define recognition languages and decide on the OCR method. Languages with dictionary support are marked with an icon.
4. The fourth panel lets you define the formatting level to be applied to your document for display and export. See chapter 4, The editor display and views, for more information.
Page 39
manually or change other settings and then use manual processing to rerecognize single pages from the document. You can add pages with automatic or manual processing. Note The Wizard panels present settings as they were last set in the program. Also, TextBridge Pro will remember the settings you make in the OCR Wizard panels and apply them to future automatic or manual processing, until you change them.
PROCESSING DOCUMENTS AUTOMATICALLY Automatic processing provides an efficient way of handling documents, especially larger ones. First you select all settings needed, then you can use the AutoOCR™ toolbar in the TextBridge Toolbox to process a new document from start to finish or to restart and finish processing on an open document.
6. Click Start or choose Start in the Process menu. Each page of the document is processed and finished one after the other. The program may perform tasks simultaneously, for instance it may start loading and recognizing a new page as you proofread the previous page. Command buttons Start: This lets you begin automatic processing on a new document.
PROCESSING DOCUMENTS MANUALLY Manual processing gives you more precise control over the way your pages are handled. You can process the document page-by-page with different settings for each page. The program also stops between each step: acquiring images, performing recognition, exporting. This lets you, for instance, draw zones manually on each page.
Page 43
6. Select a value for the Perform OCR button. You describe the layout of the incoming pages. This value has an influence if auto-zoning runs on any pages. You can also select a template to have its zones placed on the current page. For more detail see the sections Describing the layout of the document and Using zone templates.
PROCESSING A DOCUMENT AUTOMATICALLY AND FINISHING IT MANUALLY When you have a large document with only a few pages needing special attention, you do not have to manually process the whole document. You can process it automatically and view results in the Text Editor. You can determine which pages are in order, and which need different settings or some manual zoning.
PROCESSING FROM OTHER APPLICATIONS You can use the Direct OCR feature to call on the recognition services of TextBridge Pro while you work in your usual word-processor or other application. First you must establish the direct connection with the application. Then, two items in its File Menu open the door to OCR facilities.
Preferences and then selecting TextBridge Pro 11 as the OCR package. OCR settings can be specified, as with Direct OCR. Here TextBridge Pro 11 has been selected as the OCR package for MS Word 2000. Then you can drag page images from the PaperPort desktop onto the MS Word link on the PaperPort.
Page 47
ADF. Here is how to set up a job: 1. Click Schedule OCR in the Process menu or in the Windows Start menu: select Programs > ScanSoft > TextBridge Pro 11.0 > Schedule OCR.
DEFINING THE SOURCE OF PAGE IMAGES There are two possible image sources: from image files and from a scanner. There are two main types of scanners: flatbed or sheetfed. A scanner may have a built-in or added Automatic Document Feeder (ADF), which makes it easier to scan multi-page documents.
Normally the Add button places each file at the bottom of the file list. To place a file at a different location, highlight a file in the list. The new file will be added immediately below the lowest highlighted file. Input from scanner You must have a functioning, supported scanner correctly installed with TextBridge Pro.
Brightness and contrast Good brightness and contrast settings play an important role in OCR accuracy. Set these in the Scanner panel of the Options dialog box. The diagram illustrates an optimum brightness setting. After loading an image, check its appearance. If characters are thick and touching, lighten the brightness.
You can scan double-sided documents with an ADF. A duplex scanner will manage this automatically. For non-duplex scanners, select ‘Scan double-sided pages’ in the Scanner panel of the Options dialog box. Then you can scan the document in just a few passes, with even pages grouped together and odd pages also grouped.
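The guide does not say how the grouped odd and even pages are put back into reading order, so the following Python sketch shows only one plausible way to do it. The function name, its parameters, and the `evens_reversed` flag (for a stack that was flipped so the last even page is scanned first) are all assumptions made for illustration.

```python
def interleave_passes(odd_pages, even_pages, evens_reversed=False):
    """Merge two scanning passes into reading order.

    odd_pages: pages 1, 3, 5, ... in the order they were scanned.
    even_pages: pages 2, 4, 6, ...; pass evens_reversed=True if the stack
    was flipped so that the last even page was scanned first (assumption).
    """
    evens = list(reversed(even_pages)) if evens_reversed else list(even_pages)
    # A document with an odd page count has one more odd page than even pages.
    if not 0 <= len(odd_pages) - len(evens) <= 1:
        raise ValueError("expected as many even pages as odd pages, or one fewer")
    merged = []
    for i, odd in enumerate(odd_pages):
        merged.append(odd)
        if i < len(evens):
            merged.append(evens[i])
    return merged
```

The length check is there because a mismatch between the two passes usually means a sheet was skipped or fed twice, which is better reported than silently merged.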
Page 52
Single column, no table Choose this setting if your pages contain only one column of text and no table. Business letters or pages from a book are normally like this. Choose it also for a page with words or numbers arranged in columns if you do not want these placed in a table or decolumnized or treated as separate columns.
MANUAL ZONING Zones define areas on the page to be processed. Zones are rectangular or irregular (with sides formed by vertical and horizontal lines). Zones cannot overlap. They have a zone number in the top left corner and a zone type icon top right. Click in a zone to select it. Use Shift+clicks for a multiple selection.
Subtract from zone Click this to subtract irregular parts from an existing zone or split a zone into smaller ones. You cannot move or resize existing zones when this tool is active. You cannot use this with a table type zone. Reorder zones Click this for the zone reordering tool.
Page 55
Table zone Use this to have the zone contents treated as a table. Table grids can be automatically detected, or placed manually as described in the next section. Table zones must be rectangular. The Text Editor displays the table in an editable grid. You can choose whether to export tables in grids or in columns separated by tabs.
TABLE GRIDS IN THE IMAGE After automatic processing you may see table zones placed on a page. They are denoted with a table zone icon in the top right corner of the zone. To change a zone to or from a table zone, use its shortcut menu. You can also draw a table type zone.
Remove/replace all dividers Click this tool and click inside a table zone. Its dividers will all disappear. Click again to have dividers automatically (re)detected. Divider placement usually occurs during recognition; clicking twice with this tool lets you see and edit the dividers before recognition. USING ZONE TEMPLATES A template is a set of zones, their properties and reading order, stored in a file.
Page 58
How to unload a template Select a non-template setting for layout description in the Perform OCR drop-down list. The template zones are not removed from the current or existing pages, but template zones will no longer be used for future processing.
Page 59
Recognition results are placed in the Text Editor. This newly developed WYSIWYG (What You See Is What You Get) editor offers the following features, detailed in this chapter: Proofreading OCR results Checking recognized text against original (Verifying text) User dictionaries The editor display and views Text and image editing Page outline...
Page 60
The Text Editor offers four views for displaying its pages. You can switch freely from one view to another. These provide different levels of formatting. The views are: No Formatting view This displays plain decolumnized text in a single font and font size. Retain Fonts and Paragraphs view This displays decolumnized text with font and paragraph styling.
Page 61
This is what TextBridge Pro thought the word was. This tells why the word is suspected. The image of the suspect word is highlighted. This window shows the relevant part of the original image. Click inside it to enlarge or reduce the display. Drag a corner...
CHECKING RECOGNIZED TEXT AGAINST ORIGINAL After performing OCR, you can compare any part of the recognized text against the corresponding part of the original image, to verify that the text was recognized correctly. Work as follows: 1. Double-click any word in the Text Editor or select a word and choose Verify Text in the Tools menu.
USER DICTIONARIES The program has built-in dictionaries for many languages. These assist during recognition and may offer suggestions during proofing. They can be supplemented by user dictionaries. You can save any number of user dictionaries, but only one can be loaded at a time. Your user dictionaries from Microsoft Word are also available;...
measurement for the program and a word wrap setting for use in all Text Editor views except No Formatting view. Here are the main differences between the views: No Formatting view This displays plain decolumnized left-aligned text in a single font and font size, with the same line breaks as in the original document.
Page 65
Formatting toolbar or the Font dialog box from the Format menu. The latter also offers subscripts, superscripts and colored text or backgrounds. In No Formatting view you can use the Formatting toolbar to specify one font type and size to be applied to the whole document. This is not transferred to other views;...
of text in table cells with the alignment buttons in the Formatting toolbar and the tab controls in the ruler. When saving the document to file, you can choose whether to have the tables exported in grids or as tab separated columns.
5 Saving and exporting Once you have acquired at least one image for a document, you can export the image(s) to file. Once you have recognized at least one page, you can export recognition results to a target application by: 1.
PREPARING RECOGNITION RESULTS FOR EXPORT Text is exported to file, Clipboard or mail with the formatting level defined by the view set in the Text Editor at export time, if that is possible. However, some export file types and target applications cannot support all formatting elements.
SAVING TO FILE You can save recognized pages and original images to disk in a wide variety of file types. For tables of file types, see chapter 6, File types for opening and saving images and File types for saving recognition results. Saving original images 1.
Saving recognition results 1. Choose Save As... in the File menu, or click the Export Results button in the Manual OCR toolbar with Save as File selected in the drop-down list. 2. The Save As dialog box appears, as shown in its expanded form. Click Advanced to open the lower panel and Basic...
Note Graphics and formatting are saved in the document only if the selected file type supports them. The formatting level for export is the Editor view set at saving time. You will be warned if the formatting level is not supported by the export file type. Note If more than one export file is created, TextBridge Pro will append a numerical suffix to your file name to create unique file names.
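The guide does not state what the numerical suffix looks like, so the Python sketch below only illustrates the general idea of deriving unique names from one file name. The function name and the four-digit, 1-based suffix are assumptions made for this example, not the program's documented behavior.

```python
from pathlib import Path


def numbered_names(file_name, count):
    """Return `count` unique names built from file_name plus a numeric suffix.

    The four-digit, 1-based suffix is an assumed format for illustration.
    """
    path = Path(file_name)
    if count <= 1:
        # A single export file keeps the name exactly as given.
        return [path.name]
    # Insert the suffix between the base name and the extension.
    return [f"{path.stem}{i:04d}{path.suffix}" for i in range(1, count + 1)]
```

Placing the suffix before the extension keeps each file associated with the right application, which is why the sketch splits the name with `Path.stem` and `Path.suffix` rather than appending digits at the very end.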
Page 72
If you first save the document as a TextBridge Document (for instance as memo.txd), then modify it and later save it to a text file (for instance as memo.txt), then modify it again and click Save, the recent changes are saved to the memo.txt file, not to the TXD.
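The rule described here, that Save always writes to the file used for the most recent save and never back to an earlier TXD, can be modelled with a toy Python class. The class, its method names and the in-memory write log are hypothetical and exist only to make the rule explicit; they do not reflect how TextBridge Pro is implemented.

```python
class Document:
    """Toy model: Save writes to whichever file was used for the last Save As."""

    def __init__(self):
        self.last_saved_to = None  # no file has been written yet
        self.writes = []  # log of (file name, content) pairs, for illustration

    def save_as(self, file_name, content):
        """Write to a named file and remember it as the current save target."""
        self.last_saved_to = file_name
        self.writes.append((file_name, content))

    def save(self, content):
        """Write to the remembered target; the earlier TXD is left untouched."""
        if self.last_saved_to is None:
            raise RuntimeError("document has never been saved; use save_as first")
        self.writes.append((self.last_saved_to, content))
```

In this model, saving as memo.txd, then as memo.txt, and then calling `save` records the last write against memo.txt, so the TXD would need a separate Save As to pick up the latest changes.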
SENDING A DOCUMENT AS A MAIL ATTACHMENT You can send recognition results as one or more files attached to a mail message if you have installed a MAPI-compliant mail application, such as Microsoft Outlook. To send a document by e-mail •...
Page 74
3. Your mail application appears with the attachment(s) in a new empty message. Attachments take the name used for the last save of the document in TextBridge Pro, or ‘Untitled from TextBridge’. The suitable file extension is added, and numerical suffixes for multiple attachments.
6 Technical information This chapter provides troubleshooting and other technical information about using TextBridge Pro 11. Please also read the online Readme file and other help topics, or visit the ScanSoft web pages. The Scanner Information web page contains detailed and regularly updated information about scanner setup and support.
Visit the support section of ScanSoft’s web site at www.scansoft.com. It contains Tech Notes on commonly reported issues using TextBridge Pro 11. Our web pages may also offer assistance on the installation process and troubleshooting. Turn off your computer and your scanner, turn your scanner back on, and then restart your computer.
Testing TextBridge Pro Restarting Windows 95, 98, 2000 or Me in safe mode or Windows NT in VGA mode allows you to test TextBridge Pro on a simplified system. This is recommended when you cannot resolve crashing problems or if TextBridge Pro has stopped running altogether.
5. Launch TextBridge Pro and try performing OCR on an image. Use a known image file such as one of the supplied sample files. Note You can also run TextBridge Pro 11 from a command line in its own safe mode. Choose Start > Run, browse for the file TextBridge.exe and add the command line option .
Remove Windows applications that you do not use. Defragment your hard disk. See Windows online Help for instructions. Clear the cache for your web browser and limit its size. SUPPORTED FILE TYPES The program supports a wide range of file types, as detailed below. File types for opening and saving images Multi- B/W, Grayscale,...
OCR PROBLEMS This section contains information and solutions for possible OCR problems. It covers suggestions for improving recognition accuracy, then getting good results from fax input, and finally system or performance problems arising during OCR. Text does not get recognized properly Try these solutions if any part of the original document is not converted to text properly during OCR: Look at the original page image and ensure that all text areas are...
Check the glass, mirrors, and lenses on your scanner for dust, smudges, or scratches. Clean if necessary. Note TextBridge Pro only recognizes machine-printed text characters such as typewritten or laser-printed text. It can handle dot-matrix characters, though accuracy may be lower on draft-quality texts. It cannot read handprint or handwriting.
Restart Windows 95, 98, Me or 2000 in safe mode, or Windows NT in VGA mode, and test TextBridge Pro by performing OCR on the included sample image files. See the section Testing TextBridge Pro. If you are performing multiple tasks at once, such as recognizing and printing, OCR may take longer.
Page 87
Low disk space problems, 78 checking OCR results, 62 navigation, 22 Low memory problems, 78 definition, 20 new file on blank page, 48 Direct OCR, 31, 45 outline, 66 jobs in Schedule OCR, 47 proofed, 26 method, 31, 38 recognized, 26 Mail performing OCR, 21 reordering, 26...
Page 88
documents in future sessions, saving results, 70 original images, 69, 79 speeding up, 82 recognition results, 70 documents manually, 42 Recognized page, 26 Save and Launch, 70 from other applications, 45 Rectangular zones, 53 text, 70 incomplete automatic Registering to file, 38, 69 processing, 41 applications for Direct OCR, to TXD format, 30...
Page 89
Speed maximised, 31 Templates, zone, 52, 57, 81 Unloading a user dictionary, 63 Splitting zones, 54 Testing TextBridge Pro, 77 Unloading a zone template, 57 Spreadsheet pages, 52 Text User dictionaries, 61, 63 Standard toolbar, 22, 23 Acquire Text Settings, 45 adding words, 61 Starting a user dictionary, 63 ASCII output, 80...