Xerox Imaging Systems Inc.'s TextBridge 1.0 and CTA Inc.'s TextPert 4.0 for Windows are examined. In its first release, TextBridge supports only a few desktop scanners, has no editing features, and possesses only a rudimentary capability to mark and process zones. TextPert's forte lies in handling such uniform documents as insurance forms, phone books, and spreadsheets.
We took an informal look at two other optical character recognition (OCR) applications, neither of which at press time was capable of standing up to a full review.
TEXTBRIDGE 1.0. TextBridge 1.0, from Xerox Imaging Systems Inc., keeps things simple with a streamlined interface and a $99 price tag. The catch is that in this release, TextBridge supports only a handful of desktop scanners, has no editing features, and possesses only a rudimentary capability to mark and process zones.
TextBridge functions both as a stand-alone application and as a Dynamic Data Exchange (DDE) utility for other applications. That said, the program currently includes just one premade DDE macro--for use with Microsoft Word for Windows--which lets you perform scanning and OCR functions from within Word. When OCR is complete, recognized text appears directly in your Word document, ready for editing. Other products, such as Caere Corp.'s OmniPage Direct, do essentially the same thing, but TextBridge costs much less.
From TextBridge's main dialog box you can decide upon the source of the image to OCR (either a file or scanner) and whether to preview the image before recognition. Between scans, TextBridge saves recognized text in a temporary file and appends the latest scan to this file, but the program offers no deferred processing or sophisticated file handling.
TextBridge preprocesses fax documents, smoothing and filling in characters. HP AccuPage automatically determines the correct brightness setting. Furthermore, the program can recognize text with up to a 5-degree rotation, or skew.
Because TextBridge has so few settings to understand, OCR is literally as simple as loading the page in the scanner and pressing the software's "GO!" button. In addition, the application uses several artificial intelligence techniques to "learn" about what it is recognizing. In theory, this should improve speed and accuracy as the program recognizes subsequent pages of a document. TextBridge, however, expects all pages to use the same fonts and point sizes. Because documents don't always follow these rules, in practice this feature loses some of its usefulness.
Automatic page decomposition is fairly basic. You can treat a page as a single column or as multiple columns. But even with multicolumn input, the final text appears in single-column galley format. In page preview, you can examine the bit-mapped image at just two magnification levels, which makes it difficult to draw a zone around specific text. There's another problem, too: You can manually create only one zone per page. In automatic multicolumn mode, you cannot reorder the zones, but the program did faithfully reproduce the sequence of the original text in the documents we tested.
For modest needs, TextBridge gives you one-button operation for going from paper to electronic text, and it converts recognized text into a variety of word processing, database, and spreadsheet formats. Xerox Imaging Systems, however, left out much of the page decomposition and large-job functionality you will find in such products as WordScan Plus.
Xerox Imaging Systems, headquartered in Peabody, Mass., sells TextBridge direct and offers a 30-day money-back guarantee. The company can be reached at (800) 248-6550 or (508) 977-2000; fax: (508) 977-2435.
TEXTPERT 4.0. The omnifont engine of CTA Inc.'s TextPert 4.0 for Windows recognizes text in any of 34 Indo-European languages, in mixed font styles and sizes (from 6 to 72 points). But this product's forte lies in handling such uniform documents as insurance forms, phone books, and spreadsheets; for these applications you can design templates to pick out just the desired information.
TextPert offers few format retention options, however, and it sometimes balks when decomposing common documents such as memos. Another drawback is that you must process long documents with mixed formats page by page. (CTA says a new version of TextPert that remedies most of these difficulties should be shipping by the time you read this.)
TextPert 4.0 works from traditional Windows drop-down menus that group related functions. The Reading menu, for instance, lets you scan, read, or accomplish both in one step. In this mode, TextPert automatically defines the text areas and performs the recognition process. TextPert defaults to using AccuPage (when using scanners from Hewlett-Packard Co.), but you can turn AccuPage off by setting the contrast level to 1, which gives you brightness control.
On a positive note, we could save custom layouts and apply them to similar jobs. Personalization retains other settings besides zones, such as whether your documents include typeset, OCR, or dot-matrix printing.
We were especially disappointed that TextPert 4.0 for Windows sometimes misplaced words that were originally on the same line and occasionally linked text boxes in haphazard order--even on the most basic layouts. For complex pages, the recognized output was sometimes useless.
TextPert possesses unusual forms processing talents. We managed to mask out unnecessary parts of a form so the program would ignore them during recognition. We then employed TextPert's Erase Form command, which automatically ignores a form's layout lines when they intersect with text; this improves accuracy. It also means you don't have to have forms printed with special nonreproducing ink.
TextPert's advantages do not nearly offset its liabilities. Priced at $695, TextPert 4.0 falls well below the standards set by WordScan Plus and OmniPage Professional.
Contact CTA, located in New Haven, Conn., at (203) 786-5828; fax: (203) 786-5833.
Copyright InfoWorld Publications, Inc. Nov 1, 1993