
John Mancini, President of the Association for Information and Image Management (AIIM), asked for guest bloggers for his popular “8 things” series on his blog, Digital Landfill; I responded.
Here is an excerpt from “8 Things I Learned About OCR in the Small to Mid-Sized Organization.”
Hailed as the way to breathe life into our static paper documents, Optical Character Recognition (OCR) has made strides in recent decades, becoming a staple module in just about every software package that manages documents, from Nuance’s PaperPort to EMC’s Documentum.
OCR itself can mean various things. Wikipedia offers this definition: “… the mechanical or electronic translation of images of handwritten, typewritten or printed text (usually captured by a scanner) into machine-editable text” (2008).
While many estimate that OCR engines can reach accuracy levels of 98 or 99 percent, it has been my experience that this is very difficult to achieve with most commercially available software suites for small-to-medium businesses (SMBs). Many variables can affect the accuracy of the output, ranging from document condition to readability.
With so many variables involved in scanning paper-based documents, it is often not possible to achieve high accuracy on a small budget. Thus, OCR can be a challenge to implement in many SMBs.
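To put those accuracy percentages in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that accuracy is measured per character and that a scanned page holds about 2,000 characters; both are illustrative assumptions of mine, not figures from the article, and the function name is hypothetical.

```python
def expected_errors(char_count: int, accuracy: float) -> int:
    """Estimate how many characters on a page are misrecognized.

    Assumes `accuracy` is a per-character rate between 0 and 1.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(char_count * (1.0 - accuracy))


# Assumed page size for illustration only.
CHARS_PER_PAGE = 2000

for accuracy in (0.99, 0.98, 0.90):
    errors = expected_errors(CHARS_PER_PAGE, accuracy)
    print(f"{accuracy:.0%} accuracy: roughly {errors} wrong characters per page")
```

Under these assumptions, even 98 or 99 percent accuracy still leaves a few dozen characters per page for someone to find and correct, and the clean-up effort grows quickly as accuracy falls.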
Read my entire article, 8 Things I Learned About OCR in the Small to Mid-Sized Organization, at Digital Landfill.


