Volume 8, Number 44 -- December 9, 2008

A Better Kind of OCR Promised by Brainware

Published: December 9, 2008

by Alex Woodie

The advent of optical character recognition (OCR) technology has done much to speed the handling of documents. Without a way to digitize the information in documents, the paper crush would threaten to overwhelm. But OCR and related workflow technologies aren't perfect, and often just shift the burden from manual paper shuffling to manual electronic document shuffling. A company called Brainware says it has a way to lessen the e-document drudgery by truly automating the input and verification of any paper-based document, from the mailroom to the ERP system.

For decades, visions of the paperless office have danced like sugarplum fairies in the heads of farsighted business executives. Instead of business being based on transactions conducted on paper documents and all the limitations they bring, electronic documents would herald a new age of speed and accuracy in the creation and completion of transactions--the age of e-business.

While it is true that electronic documents and the Web have combined to reshape how business is done, much of the world still relies heavily on paper-based processes, and that isn't going to change any time soon. Think of the number of bills you pay online, and compare it to all the invoices you still receive in the mail. Similarly, many businesses have adopted standard EDI processing to eliminate paper processing, but they must make exceptions for partners that don't support EDI, or where small transaction volumes make EDI or other e-business transaction processing too expensive.

When companies need to process large volumes of paper-based documents, OCR is often employed to get the information off the paper. But once the data is in hand, the related content management and workflow applications don't always combine to drive efficiency into the system, explains Charlie Kaplan, vice president of marketing and product management at Ashburn, Virginia-based Brainware.

"The notion of using imaging and workflow technology as the means to automating the workflow is far short of what it should actually be," Kaplan says. "Now that I don't have paper, I route it around the organization, and it needs to be keyed and approved and validated and re-keyed and so on and so forth. So it's still a high touch process. We think of this as a workflow-assisted human process. You've traded one pain for another."

Brainware's solution to this problem is a product called Distiller that can do much of this routing and e-document shuffling behind the scenes. The software, which runs on Windows, takes the raw TIFF output from any OCR scanning engine, and automatically categorizes the document (based on its "neural network" and self-learning technology). After it has correctly categorized the document, it then extracts data from the relevant fields and sends it directly to the order entry or ERP system, thereby bypassing the content management or workflow system entirely.

And while Brainware would seem to be in the OCR business, it does not see itself as an OCR provider. "It causes a lot of confusion. We get called an OCR technology, but we try and call ourselves intelligent data capture," Kaplan says. "It's an important distinction, because the OCR just generates text. And in order to put that text in any context, so you know what the word is that you're looking at, you gotta do all this other stuff."

According to Brainware, customers adopting Distiller can expect the software to correctly process more than 90 percent of the documents that it's faced with. In other words, nine out of 10 documents are never touched as they proceed from the scanner to the ERP system. This can enable huge cost savings for companies large enough to have sought automation solutions to document processing in the first place.

The savings start in the mailroom. "Companies get all this mail, and somebody has to sit there and take it out of envelope, take out staples, make sure it's clean, and press the mail before it hits the scanner," Kaplan says. "Then there are companies that spend time applying barcodes and separator sheets, putting them into the appropriate piles, whether they're separating remittances or invoices or memos or statements.

"We say, let's skip all that. We'll do that automatically with the application. The way the system works is it actually learns from examples. So if I show it examples of invoices and remittances and claims and any other document type, the system learns what makes those documents similar to one another. This is where we use the neural network technology."
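To make the idea of learning from examples concrete, here is a minimal sketch in Python. It is not Brainware's neural network, whose internals the company has not described here; it swaps in a far simpler technique (word-count profiles compared by cosine similarity), and the function names and sample texts are invented for illustration.

```python
from collections import Counter
import math

def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def train(examples):
    """Build one word-count profile per document type from (label, text) pairs."""
    profiles = {}
    for label, text in examples:
        profiles.setdefault(label, Counter()).update(tokenize(text))
    return profiles

def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(profiles, text):
    """Return the document type whose profile is most similar to the text."""
    words = Counter(tokenize(text))
    return max(profiles, key=lambda label: cosine(profiles[label], words))

# Made-up training examples standing in for OCR text from scanned mail.
EXAMPLES = [
    ("invoice", "invoice number 1001 amount due net 30 total"),
    ("invoice", "invoice date bill to quantity unit price total due"),
    ("remittance", "remittance advice payment enclosed check number amount paid"),
    ("remittance", "payment remittance for account check enclosed"),
]
PROFILES = train(EXAMPLES)

print(classify(PROFILES, "please find invoice total amount due"))
print(classify(PROFILES, "check enclosed payment"))
```

The point of the sketch is only that no template or barcode tells the program which pile a document belongs in; the examples do.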

The same neural network technology developed by Brainware's German inventors allows Distiller to automatically categorize the various fields within the document, which leads to even more savings. "We have the logic in Distiller to extract all the line items, and while doing that it will do cross validation," such as checking amount totals, Kaplan says. "Believe it or not, sometimes you get invoiced for things incorrectly."
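The kind of cross validation Kaplan describes, checking amount totals against line items, can be sketched in a few lines of Python. This is an illustration under assumptions, not Distiller's logic: the invoice layout, field names, and the `cross_validate` function are hypothetical.

```python
from decimal import Decimal

def cross_validate(invoice):
    """Return a list of problems; an empty list means the extracted figures agree."""
    problems = []
    computed_total = Decimal("0")
    for number, item in enumerate(invoice["lines"], start=1):
        # Each line's amount should equal quantity times unit price.
        expected = Decimal(item["quantity"]) * Decimal(item["unit_price"])
        if expected != Decimal(item["amount"]):
            problems.append(
                f"line {number}: {item['quantity']} x {item['unit_price']} "
                f"does not equal {item['amount']}"
            )
        computed_total += Decimal(item["amount"])
    # The line amounts should add up to the stated invoice total.
    if computed_total != Decimal(invoice["total"]):
        problems.append(
            f"total: lines sum to {computed_total}, invoice states {invoice['total']}"
        )
    return problems

# Hypothetical extracted data; values are kept as strings so Decimal stays exact.
good = {
    "lines": [
        {"quantity": "2", "unit_price": "19.99", "amount": "39.98"},
        {"quantity": "1", "unit_price": "5.00", "amount": "5.00"},
    ],
    "total": "44.98",
}
bad = dict(good, total="45.98")

print(cross_validate(good))
print(cross_validate(bad))
```

Using `Decimal` rather than floating point avoids rounding surprises when comparing currency amounts. A mismatch could come from an OCR misread or from a genuinely incorrect invoice, so a real system would presumably route such documents to a person.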

One large Brainware customer is energy services giant Halliburton, which uses Distiller at its global shared service centers in Oklahoma and Dubai. According to Kaplan, Distiller processes more than 2 million invoices per year for Halliburton. These invoices are sent in multiple languages from 550,000 different vendors, but Distiller was able to distinguish a Halliburton invoice from only 31 different examples, he says. From there, Distiller was off and running, and today provides Halliburton with a 92 percent passthrough rate into its SAP system.
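As a rough back-of-the-envelope check on those figures, the short Python calculation below estimates how many invoices would still need human attention. It assumes exactly 2 million invoices a year (the article says "more than") and reads the 92 percent passthrough rate as the share that needs no manual handling.

```python
def manual_exceptions(invoices_per_year, passthrough_percent):
    """Invoices per year that do not pass straight through (integer arithmetic)."""
    return invoices_per_year * (100 - passthrough_percent) // 100

# Assumed round figures taken from the article.
print(manual_exceptions(2_000_000, 92))  # 160000
```

On those assumptions, roughly 160,000 invoices a year would still be touched by a person, against about 1.84 million that would not.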

Distiller's neural network-based approach is superior in many ways to the template and keyword-based approaches of first-generation OCR and imaging systems, Kaplan says. But it may be impossible to ever achieve a system that delivers 100 percent accuracy.

"Often the biggest problem is just a poor quality scan," Kaplan says. "Companies like Halliburton get invoices from crazy places that are printed on almost the equivalent of tissue paper. Scanning technology is good, but there are certain types that are really hard to scan. There are plenty of OCR errors that you get. It could be as simple as the printer of the invoice needs to be cleaned, because you get a smudge. Or somebody put a stamp over the numbers."

Perfection may not be attainable, but you can still save millions of dollars for your company. Another Brainware customer, Alltel Wireless, may have set the record for quickest return on investment.

Before Alltel implemented Distiller, the company managed to record about a million dollars in savings for all of 2006 by paying its invoices early, which is not that much for a company of its size, Kaplan says. Just three or four months after installing Distiller in late 2007, the company had already realized $17 million in discounts. "I think they paid for their software in a couple of weeks or a month," Kaplan says.

Brainware has customers in all types of industries, including some AS/400 shops. Several customers use Distiller to input transaction data into JD Edwards ERP systems, including JohnsonDiversey and Old Dominion Freight Line. There is a good write-up of JohnsonDiversey's use of Distiller on Brainware's Web site at www.brainware.com/docs/JohnsonDiversey_IOMA.pdf.

The one similarity among Brainware customers is that they tend to be larger shops--mostly $1 billion in revenues and up--that do a lot of paper-based business. After all, a 50 percent efficiency boost for a company that dedicates one employee to opening mail and order entry will not do much for the bottom line. It will also slow payback on a system that starts at about $500,000, installed. But for companies with big operations, Distiller can mean big savings.


RELATED STORY

Brainware Teams with Fujitsu on Document Capture Solution




Editor: Alex Woodie
Contributing Editors: Dan Burger, Joe Hertvik,
Shannon O'Donnell, Timothy Prickett Morgan
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed

Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034
